Random Numbers and Vectors from Multivariate Normal Distributions
This example shows how to generate and visualize random numbers and vectors that are drawn from univariate, bivariate, and trivariate normal distributions.
The randn function generates random numbers from the standard normal distribution. To generate random numbers and vectors from a multivariate normal distribution with a specific mean and covariance, you can transform the data generated from the standard normal distribution. Starting from a $d$-dimensional random variable $u$ that follows a normal distribution with zero mean and a unit covariance matrix, you can transform $u$ to $v = \mu + uR$. The variable $v$ follows the normal distribution with mean $\mu$ and covariance matrix $\Sigma = R^T R$.
The covariance matrix $\Sigma$ is assumed to be symmetric and positive definite (nonsingular), which means that the multivariate normal distribution is nondegenerate. Therefore, you can perform Cholesky decomposition on $\Sigma$ to obtain the upper triangular matrix $R$. A degenerate multivariate normal distribution, where $\Sigma$ is singular (not full rank), is beyond the scope of this example.
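The mean and covariance of the transformed variable follow from a short calculation. The sketch below assumes the row-vector convention used in this example, so that $u$ is 1-by-$d$ with $\mathrm{E}[u] = 0$ and $\mathrm{E}[u^T u] = I$; it also relies on the standard fact that an affine transformation of a normal random vector is again normal.

```latex
\begin{aligned}
\mathrm{E}[v] &= \mu + \mathrm{E}[u]\,R = \mu, \\
\operatorname{Cov}(v) &= \mathrm{E}\!\left[(v-\mu)^T (v-\mu)\right]
  = \mathrm{E}\!\left[R^T u^T u\,R\right]
  = R^T\,\mathrm{E}\!\left[u^T u\right] R
  = R^T I\,R
  = R^T R = \Sigma .
\end{aligned}
```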
Univariate Normal Distribution
In statistics, a normal distribution, or Gaussian distribution, is a type of continuous probability distribution for a real random variable. The univariate normal distribution of the random variable $x$ has a probability density function (pdf) of the form

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right).$$

The parameter $\mu$ is the mean of the distribution, and $\sigma$ is the standard deviation, with $\sigma^2$ being the variance of the distribution.
Standard Normal Distribution
The simplest case of a normal distribution is the standard normal distribution, which has a mean of $0$ and a standard deviation of $1$. In MATLAB®, you can use randn to generate random numbers that follow the standard normal distribution. For example, generate 10,000 random numbers.
n = 10000; x_standard = randn(n,1);
Plot the probability density estimate of these numbers using histogram.
figure
histogram(x_standard,Normalization="pdf",EdgeColor="none")
xlabel("x")
ylabel("Probability Density")
grid on
hold on
To compare the sampled numbers with the pdf of the standard normal distribution, plot the pdf using fplot.
f = @(x) (1/sqrt(2*pi))*exp(-1/2*x.^2);
fplot(f,[-5 5],LineWidth=2)
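Because randn draws a new sample on every call, the histogram changes slightly from run to run. The following is a minimal sketch of how the rng function (listed under See Also) can make the sample repeatable. The seed value 0 and the variable names x_first, x_second, and same_sample are illustrative choices, not part of the original example.

```matlab
% Sketch: seed the generator so that the same sample is produced on every run.
rng(0,"twister")            % seed 0 is an arbitrary choice for illustration
n = 10000;
x_first = randn(n,1);

rng(0,"twister")            % reset the generator to the same state
x_second = randn(n,1);

same_sample = isequal(x_first,x_second)   % both draws are identical

% The sample statistics should be close to 0 and 1, but not exactly equal.
sample_mean = mean(x_first)
sample_std = std(x_first)
```

Seeding is useful when a figure or a computed statistic needs to be reproduced exactly; without it, the results remain statistically equivalent but differ numerically.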
Normal Distribution with Specified Mean and Standard Deviation
To generate random numbers from a univariate normal distribution with a specific mean $\mu$ and standard deviation $\sigma$, you can transform the data generated from the standard normal distribution by multiplying the data by $\sigma$ and adding $\mu$. For example, transform the previously generated random numbers so that they follow a normal distribution with a mean of $-2$ and a standard deviation of $3$.
mu = -2; sigma = 3; x_transformed = sigma*x_standard + mu;
Calculate the mean and the standard deviation of the transformed data. The results do not exactly match the specified mean and standard deviation because they are estimated from a finite random sample.
mu_data = mean(x_transformed)
mu_data = -1.9950
sigma_data = std(x_transformed)
sigma_data = 2.9744
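To judge how large a mismatch is plausible, one option is to compare it with the standard error of the sample mean, which for $n$ independent draws is $\sigma/\sqrt{n}$. The following sketch assumes that standard result; the names se_mean, err_mean, and tol_3se and the seed value are illustrative and not part of the original example.

```matlab
% Sketch: expected size of the sampling error in the mean.
mu = -2;
sigma = 3;
n = 10000;

se_mean = sigma/sqrt(n)      % standard error of the sample mean: 3/100 = 0.03

% Draw a fresh sample and compare its mean with the specified mean.
rng(1,"twister")             % arbitrary seed, for reproducibility only
x = sigma*randn(n,1) + mu;
err_mean = abs(mean(x) - mu)

% Deviations of a few standard errors or less are typical for a correct sampler.
tol_3se = 3*se_mean
```

The sample mean reported above, -1.9950, differs from the specified mean by 0.0050, which is well within one standard error of 0.03.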
Plot the probability density estimate and pdf of the transformed data.
histogram(x_transformed,Normalization="pdf",EdgeColor="none")
g = @(x) (1/(sigma*sqrt(2*pi)))*exp(-1/2*((x-mu)/sigma).^2);
fplot(g,[-15 15],LineWidth=2)
hold off
Multivariate Normal Distribution
The multivariate normal distribution extends the univariate normal distribution to two or more variables. It has two parameters: a mean vector $\mu$ and a covariance matrix $\Sigma$, which are analogous to the mean and variance parameters of a univariate normal distribution. In this example, the covariance matrix $\Sigma$ is restricted to being a symmetric and positive definite (nonsingular) matrix, so the multivariate normal distribution is nondegenerate. The diagonal elements of $\Sigma$ contain the variances of each variable, and the off-diagonal elements of $\Sigma$ contain the covariances between variables.
The pdf of the $d$-dimensional multivariate normal distribution is

$$f(x,\mu,\Sigma) = \frac{1}{\sqrt{(2\pi)^d\,|\Sigma|}} \exp\!\left(-\frac{1}{2}(x-\mu)\,\Sigma^{-1}(x-\mu)^T\right),$$

where $x$ and $\mu$ are 1-by-$d$ row vectors, $\Sigma$ is a $d$-by-$d$ symmetric, positive definite matrix, and $|\Sigma|$ is the determinant of $\Sigma$.
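The pdf formula can be evaluated directly for any dimension $d$. The following is a minimal sketch, not a toolbox function: the handle name mvn_pdf and the test values are assumptions made for illustration. It evaluates the density at each row of an N-by-$d$ matrix and checks two known cases.

```matlab
% Sketch: evaluate the d-dimensional normal pdf at each row of X.
% X is N-by-d, mu is 1-by-d, Sigma is d-by-d symmetric positive definite.
mvn_pdf = @(X,mu,Sigma) exp(-0.5*sum(((X-mu)/Sigma).*(X-mu),2)) ./ ...
    sqrt((2*pi)^size(X,2)*det(Sigma));

% At the mean of a 2-D standard normal distribution, the density is 1/(2*pi).
p_peak = mvn_pdf([0 0],[0 0],eye(2))

% For d = 1, the formula reduces to the univariate pdf.
mu1 = -2;
sigma1 = 3;
p_multi = mvn_pdf(0.5,mu1,sigma1^2);
p_uni = (1/(sigma1*sqrt(2*pi)))*exp(-1/2*((0.5-mu1)/sigma1)^2);
difference = abs(p_multi - p_uni)    % zero up to rounding error
```

The expression (X-mu)/Sigma solves the linear system instead of forming the inverse of Sigma explicitly, which is generally the better-conditioned choice.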
Bivariate Normal Distribution with Zero Mean and Unit Covariance
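In the two-dimensional case, the multivariate normal distribution becomes a bivariate normal distribution, which can also be written as

$$f(x,y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho_{xy}^2}} \exp\!\left(-\frac{1}{2(1-\rho_{xy}^2)}\left[\left(\frac{x-\mu_x}{\sigma_x}\right)^2 - 2\rho_{xy}\left(\frac{x-\mu_x}{\sigma_x}\right)\left(\frac{y-\mu_y}{\sigma_y}\right) + \left(\frac{y-\mu_y}{\sigma_y}\right)^2\right]\right),$$

where the standard deviation of $x$ is $\sigma_x$, the standard deviation of $y$ is $\sigma_y$, and the correlation coefficient between the random variables $x$ and $y$ is $\rho_{xy}$. In this case, the mean vector is $\mu = (\mu_x, \mu_y)$. The covariance matrix is

$$\Sigma = \begin{pmatrix} \sigma_x^2 & \rho_{xy}\sigma_x\sigma_y \\ \rho_{xy}\sigma_x\sigma_y & \sigma_y^2 \end{pmatrix}.$$

If $\sigma_x = \sigma_y$ and $\rho_{xy} = 0$, then the bivariate normal distribution is a circularly symmetric bell-shaped surface in the $xy$-plane that is centered at $(\mu_x, \mu_y)$.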
For the special case where $x$ and $y$ are independent random variables that follow the standard normal distribution, the mean vector of the bivariate distribution is the zero vector and the covariance matrix is the 2-by-2 identity matrix. To generate random vectors that follow this bivariate normal distribution, you can use randn directly without any transformation. For example, generate 10,000 data points of 1-by-2 random vectors by specifying the array size when using randn. Plot the 2-D probability density estimate of these bivariate random vectors using histogram2. To compare the sampled vectors with the pdf of the bivariate normal distribution, plot the pdf using fmesh.
u = randn(n,2);
figure
histogram2(u(:,1),u(:,2),Normalization="pdf",FaceAlpha=0.5,EdgeColor="none")
xlabel("x")
ylabel("y")
zlabel("Probability Density")
axis([-6 6 -6 6])
axis square
view(10,30)
hold on
mu = [0 0];
sigma = [1 0; 0 1];
f = @(x,y) exp(-0.5*dot(([x(:),y(:)]-mu),(sigma\([x(:),y(:)]-mu)')',2)')/ ...
    sqrt((2*pi)^2*det(sigma));
fmesh(f,FaceAlpha=0,EdgeColor="cyan")
hold off
Bivariate Normal Distribution with Specified Mean and Covariance Matrix
To generate random vectors from a general bivariate normal distribution with a specific mean and covariance, you need to transform the previously generated data. Starting from a $d$-dimensional random variable $u$ that follows a normal distribution with zero mean and unit covariance matrix, you can transform $u$ to $v = \mu + uR$. The variable $v$ follows the normal distribution with mean $\mu$ and covariance matrix $\Sigma = R^T R$.
For example, specify the mean as $\mu = (0.5, 1)$. Specify the standard deviation of $x$ as $\sigma_x = 1.4$, the standard deviation of $y$ as $\sigma_y = 2$, and the correlation coefficient as $\rho_{xy} = -0.7$. The covariance matrix becomes

$$\Sigma = \begin{pmatrix} 1.96 & -1.96 \\ -1.96 & 4 \end{pmatrix}.$$
To interactively change the values of $\sigma_x$, $\sigma_y$, and $\rho_{xy}$, you can add sliders to your script. Go to the Insert tab, click the Control button, and select Slider. For more information, see Add Interactive Controls to a Live Script. Add three interactive sliders that set the values of $\sigma_x$, $\sigma_y$, and $\rho_{xy}$. By default, when these values change, the Live Editor only runs the code in the current section. Configure this behavior by right-clicking the sliders, selecting Configure Control, and setting Run in the Execution section to Current section to end. This ensures the code in the section containing the sliders and any following sections runs whenever the values of the sliders change.
mu = [0.5 1]; sigma_x = 1.4; sigma_y = 2; rho_xy = -0.7; sigma = [sigma_x^2, rho_xy*sigma_x*sigma_y; rho_xy*sigma_x*sigma_y, sigma_y^2 ];
Perform the Cholesky decomposition of the covariance matrix. The result is an upper triangular matrix $R$ such that $\Sigma = R^T R$. Scale the original data by this matrix and shift the scaled data to the specified mean.
R = chol(sigma); v = mu + u*R;
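A quick numerical check can confirm that the transformation behaves as described. The sketch below is self-contained and uses illustrative choices that are not part of the original example: a fixed seed, a larger sample size to tighten the estimates, and the variable names factor_error, mean_error, and cov_error.

```matlab
% Sketch: check the Cholesky factor and the statistics of the transformed sample.
rng(2,"twister")                    % arbitrary seed, for reproducibility only
n = 100000;                         % larger than in the example, to tighten the estimates
mu = [0.5 1];
sigma = [1.96 -1.96; -1.96 4];

R = chol(sigma);                    % upper triangular factor with R'*R equal to sigma
factor_error = norm(R'*R - sigma)   % zero up to rounding error

u = randn(n,2);                     % rows are independent standard normal vectors
v = mu + u*R;

mean_error = max(abs(mean(v) - mu))
cov_error = max(abs(cov(v) - sigma),[],"all")
```

Both mean_error and cov_error should be small relative to the entries of mu and sigma, and they shrink further as n grows.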
Plot the 2-D probability density estimate of the transformed data using histogram2. To compare the sampled vectors with the pdf of the bivariate normal distribution, plot the pdf using fmesh.
figure
histogram2(v(:,1),v(:,2),Normalization="pdf",FaceAlpha=0.5,EdgeColor="none")
xlabel("x")
ylabel("y")
zlabel("Probability Density")
axis([-12 12 -12 12 0 0.45])
axis square
view(330,25)
hold on
f = @(x,y) exp(-0.5*dot(([x(:),y(:)]-mu),(sigma\([x(:),y(:)]-mu)')',2)')/ ...
    sqrt((2*pi)^2*det(sigma));
fmesh(f,FaceAlpha=0,EdgeColor="cyan")
Marginal Distributions of Bivariate Normal Distribution
One property of a multivariate normal distribution is that the marginal distribution of any subset of the random variables is also a multivariate normal distribution corresponding to that subset. For example, the marginal distribution of $x$ from the bivariate normal distribution is a univariate normal distribution with $\mu_x$ as its mean and $\sigma_x$ as its standard deviation, which can be written as

$$f_x(x) = \frac{1}{\sigma_x\sqrt{2\pi}} \exp\!\left(-\frac{1}{2}\left(\frac{x-\mu_x}{\sigma_x}\right)^2\right).$$

The marginal distribution of $y$ from the bivariate normal distribution is a univariate normal distribution with $\mu_y$ as its mean and $\sigma_y$ as its standard deviation. In the case of the bivariate normal distribution, changing the correlation coefficient $\rho_{xy}$ does not affect the marginal distribution of $x$ or $y$.
To show this property, you can plot the marginal distributions of the transformed data in the previous section. Plot the marginal distribution of $x$ by using histogram on v(:,1). For visualization purposes, plot this distribution in the ($x$, pdf)-plane by using hgtransform and applying rotation and translation transformations to bring the plot into the plane. Compare the marginal distribution of $x$ with the pdf of the univariate normal distribution using fplot.
ax = gca;
hgx = hgtransform(ax);
hgx.Matrix = makehgtform("xrotate",pi/2,"translate",[0 0 -12]);
histogram(v(:,1),Normalization="pdf",EdgeColor="none",Parent=hgx)
f = @(x) exp(-0.5*(x-mu(1)).^2/sigma(1,1))/sqrt((2*pi)*sigma(1,1));
fplot(f,[-12 12],Parent=hgx)
Plot the marginal distribution of $y$ by using histogram on v(:,2). Plot this distribution in the ($y$, pdf)-plane. Compare the marginal distribution of $y$ with the pdf of the univariate normal distribution using fplot.
hgy = hgtransform(ax);
hgy.Matrix = makehgtform("xrotate",pi/2,"yrotate",pi/2,"translate",[0 0 12]);
histogram(v(:,2),Normalization="pdf",EdgeColor="none",Parent=hgy)
f = @(y) exp(-0.5*(y-mu(2)).^2/sigma(2,2))/sqrt((2*pi)*sigma(2,2));
fplot(f,[-12 12],Parent=hgy)
hold off;
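The claim that the correlation coefficient does not affect the marginal distributions can also be checked numerically. The following sketch transforms the same standard normal sample with three different correlation coefficients and compares the marginal standard deviations. The seed, the sample size, the list rho_values, and the names marginal_std and sample_rho are illustrative assumptions, not part of the original example.

```matlab
% Sketch: the marginal statistics of x and y do not depend on rho_xy.
rng(3,"twister")                  % arbitrary seed, for reproducibility only
n = 100000;
mu = [0.5 1];
sigma_x = 1.4;
sigma_y = 2;
u = randn(n,2);                   % reuse the same standard normal sample

rho_values = [-0.7 0 0.7];        % illustrative correlation coefficients
marginal_std = zeros(numel(rho_values),2);
sample_rho = zeros(numel(rho_values),1);
for k = 1:numel(rho_values)
    rho_xy = rho_values(k);
    sigma = [sigma_x^2, rho_xy*sigma_x*sigma_y; ...
             rho_xy*sigma_x*sigma_y, sigma_y^2];
    v = mu + u*chol(sigma);
    marginal_std(k,:) = std(v);   % column-wise standard deviations of x and y
    c = corrcoef(v);
    sample_rho(k) = c(1,2);       % sample correlation between x and y
end

marginal_std                      % each row is close to [1.4 2]
sample_rho                        % close to the entries of rho_values
```

Only the joint behavior changes with the correlation coefficient; the spread of each variable on its own stays the same.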
Trivariate Normal Distribution
To generate random vectors from any nondegenerate multivariate normal distribution, you can follow the steps outlined in previous sections. For example, generate random vectors from a trivariate normal distribution with a specific mean and covariance matrix. Specify the mean as $\mu = (1, 1.5, -1.5)$. Specify the covariance matrix as

$$\Sigma = \begin{pmatrix} 2 & -1.5 & 0.5 \\ -1.5 & 4 & 2 \\ 0.5 & 2 & 3 \end{pmatrix}.$$
Because there are three random variables, generate 10,000 data points of 1-by-3 random vectors from the standard normal distribution by specifying the array size when using randn. Apply the transformation so that the generated data follows the specified multivariate normal distribution.
u_3D = randn(n,3); mu = [1 1.5 -1.5]; sigma = [2 -1.5 0.5; -1.5 4 2; 0.5 2 3]; R = chol(sigma); v = mu + u_3D*R;
Because the data already consists of 3-D points, you cannot visualize the histogram of the probability density in 3-D space (you would need an additional axis to show the frequencies). Instead, you can plot the location of the 3-D points by using scatter3.
figure
scatter3(v(:,1),v(:,2),v(:,3),1,".")
axis([-10 10 -10 10 -10 10])
axis square
xlabel("x")
ylabel("y")
zlabel("z")
grid on
view(330,25)
You can also check if the generated data satisfies the property of the marginal distribution of a multivariate normal distribution. For example, the marginal distribution of $(x, z)$ is a bivariate normal distribution with the mean $(1, -1.5)$ and the covariance matrix $\begin{pmatrix} 2 & 0.5 \\ 0.5 & 3 \end{pmatrix}$.
Plot the marginal distribution of $(x, z)$ by using histogram2 on v(:,1) and v(:,3). Compare the marginal distribution of $(x, z)$ with the pdf of the bivariate normal distribution using fmesh.
figure
histogram2(v(:,1),v(:,3),Normalization="pdf",FaceAlpha=0.5,EdgeColor="none")
xlabel("x")
ylabel("z")
zlabel("Probability Density")
axis([-10 10 -10 10])
axis square
view(10,30)
hold on
fxz = @(x,z) exp(-0.5*dot(([x(:),z(:)]-mu([1 3])),(sigma([1 3],[1 3])\ ...
    ([x(:),z(:)]-mu([1 3]))')',2)')/sqrt((2*pi)^2*det(sigma([1 3],[1 3])));
fmesh(fxz,FaceAlpha=0,EdgeColor="cyan")
hold off
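In addition to the visual comparison, the sample statistics of the selected columns can be compared with the corresponding entries of the mean vector and covariance matrix. The sketch below is self-contained; the seed, the larger sample size, and the names idx, mean_xz_error, and cov_xz_error are illustrative assumptions.

```matlab
% Sketch: the (x,z) marginal of the trivariate sample matches the sub-mean and sub-covariance.
rng(4,"twister")                      % arbitrary seed, for reproducibility only
n = 100000;
mu = [1 1.5 -1.5];
sigma = [2 -1.5 0.5; -1.5 4 2; 0.5 2 3];
v = mu + randn(n,3)*chol(sigma);

idx = [1 3];                          % columns that hold x and z
mean_xz_error = max(abs(mean(v(:,idx)) - mu(idx)))
cov_xz_error = max(abs(cov(v(:,idx)) - sigma(idx,idx)),[],"all")
```

Selecting rows and columns of the covariance matrix with the same index vector is all that is needed to obtain the parameters of a marginal distribution.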
See Also
randn | rng | histogram | histogram2 | fplot | fsurf