nanstd

(Not recommended) Standard deviation, ignoring NaN values

nanstd is not recommended. Use the MATLAB® function std instead. With the std function, you can specify whether to include or omit NaN values for the calculation. For more information, see Version History.

Description

y = nanstd(X) returns the standard deviation of X, computed after removing all NaN values.

  • If X is a vector, then nanstd(X) is the sample standard deviation of all the non-NaN elements of X.

  • If X is a matrix, then nanstd(X) is a row vector of column sample standard deviations, computed after removing NaN values.

  • If X is a multidimensional array, then nanstd operates along the first nonsingleton dimension of X. The size of this dimension becomes 1 while the sizes of all other dimensions remain the same. nanstd removes all NaN values.

  • By default, nanstd normalizes y by n – 1, where n is the number of remaining observations after removing observations with NaN values.

y = nanstd(X,flag) returns the standard deviation of X based on the normalization specified by flag. The flag is 0 (default) or 1 to specify normalization by n – 1 or n, respectively, where n is the number of remaining observations after removing observations with NaN values.

y = nanstd(X,flag,'all') returns the standard deviation of all elements of X, computed after removing NaN values.

y = nanstd(X,flag,dim) returns the standard deviation along the operating dimension dim of X, computed after removing NaN values.

y = nanstd(X,flag,vecdim) returns the standard deviation over the dimensions specified in the vector vecdim. The function computes the standard deviations after removing NaN values. For example, if X is a matrix, then nanstd(X,0,[1 2]) is the sample standard deviation of all non-NaN elements of X because every element of a matrix is contained in the array slice defined by dimensions 1 and 2.
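
As a quick check of the equivalence described above, the following sketch compares the 'all' and vecdim syntaxes on the small matrix used in the examples on this page. It assumes that Statistics and Machine Learning Toolbox is installed so that nanstd is available; the variable names yAll and yVec are illustrative only.

```matlab
% Matrix with missing values (same data as the examples on this page).
X = magic(3);
X([1 6:9]) = NaN;

% Standard deviation over all non-NaN elements, computed two ways.
yAll = nanstd(X,0,'all');
yVec = nanstd(X,0,[1 2]);

% For a matrix, dimensions 1 and 2 cover every element, so the two
% results should agree up to rounding.
agree = abs(yAll - yVec) < 1e-12
```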

Examples

Find the column standard deviations for matrix data with missing values.

X = magic(3);
X([1 6:9]) = NaN
X = 3×3

   NaN     1   NaN
     3     5   NaN
     4   NaN   NaN

y = nanstd(X)
y = 1×3

    0.7071    2.8284       NaN

Load the carsmall data set.

load carsmall

Compute the population and sample standard deviations for the Horsepower data. The nanstd function ignores the missing value in Horsepower.

y1 = nanstd(Horsepower,1)   % Population formula
y1 = 
45.2963
y2 = nanstd(Horsepower,0)   % Sample formula
y2 = 
45.5268

Find the standard deviation of all the values in an array, ignoring missing values.

Create a 3-by-4-by-2 array X with some missing values.

X = reshape(1:24,[3 4 2]);
X([8:10 18]) = NaN
X = 
X(:,:,1) =

     1     4     7   NaN
     2     5   NaN    11
     3     6   NaN    12


X(:,:,2) =

    13    16    19    22
    14    17    20    23
    15   NaN    21    24

Find the sample standard deviation of the elements of X.

y = nanstd(X,0,'all')
y = 
7.5385

Find the row standard deviations for matrix data with missing values. Compute the sample standard deviations along the second dimension.

X = magic(3);
X([1 6:9]) = NaN
X = 3×3

   NaN     1   NaN
     3     5   NaN
     4   NaN   NaN

y = nanstd(X,0,2)
y = 3×1

         0
    1.4142
         0

Find the standard deviation of a multidimensional array over multiple dimensions.

Create a 3-by-4-by-2 array X with some missing values.

X = reshape(1:24,[3 4 2]);
X([8:10 18]) = NaN
X = 
X(:,:,1) =

     1     4     7   NaN
     2     5   NaN    11
     3     6   NaN    12


X(:,:,2) =

    13    16    19    22
    14    17    20    23
    15   NaN    21    24

Find the sample standard deviation of each page of X by specifying dimensions 1 and 2 as the operating dimensions.

ypage = nanstd(X,0,[1 2])
ypage = 
ypage(:,:,1) =

    3.8079


ypage(:,:,2) =

    3.7779

For example, ypage(1,1,2) is the sample standard deviation of the non-NaN elements in X(:,:,2).

Find the sample standard deviation of the elements in each X(i,:,:) slice by specifying dimensions 2 and 3 as the operating dimensions.

yrow = nanstd(X,0,[2 3])
yrow = 3×1

    7.9102
    7.6904
    8.2158

For example, yrow(3) is the sample standard deviation of the non-NaN elements in X(3,:,:).

Input Arguments

Input data X, specified as a scalar, vector, matrix, or multidimensional array.

Data Types: single | double

Indicator flag for the normalization used to compute the standard deviation, specified as 0 (normalize by n – 1) or 1 (normalize by n).

Data Types: single | double

Dimension dim to operate along, specified as a positive integer scalar. If you do not specify a value, then the default value is the first array dimension whose size does not equal 1.

dim indicates the dimension whose length reduces to 1. size(y,dim) is 1 while the sizes of all other dimensions remain the same.

Consider a two-dimensional array X:

  • If dim is equal to 1, then nanstd(X,0,1) returns a row vector containing the sample standard deviation for each column.

  • If dim is equal to 2, then nanstd(X,0,2) returns a column vector containing the sample standard deviation for each row.

If dim is greater than ndims(X) or if size(X,dim) is 1, then nanstd returns an array of zeros with the same dimensions and missing values as X.

Data Types: single | double
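
The effect of dim on the output size can be sketched as follows. This is an illustrative snippet, not part of the original reference page, and it assumes nanstd is available (Statistics and Machine Learning Toolbox). The last call is left without a stated result; per the description above, it should contain zeros with NaN kept wherever X is NaN.

```matlab
X = [1 NaN 3; 4 5 6];       % 2-by-3 matrix with one missing value

sz1 = size(nanstd(X,0,1))   % dim = 1: one value per column, so [1 3]
sz2 = size(nanstd(X,0,2))   % dim = 2: one value per row, so [2 1]

% dim exceeds ndims(X), so each slice along dim holds a single element.
y3 = nanstd(X,0,3)
```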

Vector of dimensions vecdim, specified as a positive integer vector. Each element of vecdim represents a dimension of the input array X. The output y has length 1 in the specified operating dimensions. The other dimension lengths are the same for X and y.

For example, if X is a 2-by-3-by-3 array, then nanstd(X,0,[1 2]) returns a 1-by-1-by-3 array. Each element of the output array is the sample standard deviation of the elements on the corresponding page of X.

[Figure: mapping of a 2-by-3-by-3 input array to a 1-by-1-by-3 output array]

Data Types: single | double
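
The size behavior described for vecdim can be checked with a short sketch like the one below. The array contents are arbitrary and chosen only for illustration; the snippet assumes nanstd is available.

```matlab
X = reshape(1:18,[2 3 3]);   % 2-by-3-by-3 array
X(4) = NaN;                  % introduce one missing value on page 1

y = nanstd(X,0,[1 2]);       % reduce over dimensions 1 and 2
sz = size(y)                 % one value per page, so [1 1 3]

% Each output element should match nanstd applied to the flattened page.
p2 = X(:,:,2);
samePage = abs(y(1,1,2) - nanstd(p2(:))) < 1e-12
```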

Output Arguments

Standard deviation values y, returned as a scalar, vector, matrix, or multidimensional array.

More About

Sample Standard Deviation

The sample standard deviation S is given by

$$S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}}.$$

S is the square root of an unbiased estimator of the variance of the population from which X is drawn, as long as X consists of independent, identically distributed samples. $\bar{X}$ is the sample mean.

Notice that the denominator in this variance formula is n – 1.
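
To make the formula concrete, here is a minimal sketch that evaluates it by hand on a small vector after dropping NaN values. The data and variable names are made up for illustration, and the manual computation uses only base MATLAB functions.

```matlab
x = [1 NaN 5 3];            % data with one missing value
v = x(~isnan(x));           % keep the non-NaN observations: [1 5 3]
n = numel(v);               % number of remaining observations

% Sample standard deviation: divide the sum of squared deviations by n - 1.
S = sqrt(sum((v - mean(v)).^2) / (n - 1))   % sqrt(8/2) = 2
```

If nanstd is available, nanstd(x) and nanstd(x,0) should return the same value up to rounding.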

Population Standard Deviation

If the data is the entire population of values, then you can use the population standard deviation,

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}}.$$

If X is a random sample from a population, then the mean μ is estimated by the sample mean, and σ is the biased maximum likelihood estimator of the population standard deviation.

Notice that the denominator in this variance formula is n.
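
The population formula can be evaluated by hand in the same way. The sketch below uses made-up data and base MATLAB functions only, with the mean estimated by the sample mean of the non-NaN values. It also checks that the two normalizations differ by a factor of sqrt((n – 1)/n).

```matlab
x = [1 NaN 5 3];            % data with one missing value
v = x(~isnan(x));           % keep the non-NaN observations: [1 5 3]
n = numel(v);

ss = sum((v - mean(v)).^2); % sum of squared deviations, here 8

sigma = sqrt(ss / n)        % population formula: sqrt(8/3), about 1.6330
S     = sqrt(ss / (n - 1)); % sample formula, for comparison

% The two results are related by sigma = S*sqrt((n-1)/n).
related = abs(sigma - S*sqrt((n - 1)/n)) < 1e-12
```

If nanstd is available, nanstd(x,1) should match sigma up to rounding.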

Version History

Introduced before R2006a

R2020b: nanstd is not recommended

nanstd is not recommended. Use the MATLAB function std instead. There are no plans to remove nanstd.

To update your code, change instances of the function name nanstd to std. Then specify the 'omitnan' option for the nanflag input argument.
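
As a rough guide to the update described above, the sketch below lists nanstd calls next to std calls that are expected to be equivalent, then runs two of them on the matrix from the examples on this page. The mapping is an illustration based on the std syntaxes that accept a weight, a dimension argument, and a trailing 'omitnan' flag; check the std reference page for your release before relying on it.

```matlab
X = magic(3);
X([1 6:9]) = NaN;

% Old call                 Suggested replacement
% nanstd(X)                std(X,0,'omitnan')
% nanstd(X,1)              std(X,1,'omitnan')
% nanstd(X,0,'all')        std(X,0,'all','omitnan')
% nanstd(X,0,2)            std(X,0,2,'omitnan')
% nanstd(X,0,[1 2])        std(X,0,[1 2],'omitnan')

yCols = std(X,0,'omitnan')     % column standard deviations
yRows = std(X,0,2,'omitnan')   % row standard deviations
```

The results should match the nanstd outputs shown in the Examples section for the same matrix.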

std offers more extended capabilities, including support for tall arrays, GPU arrays, distributed arrays, C/C++ code generation, and GPU code generation.
