manova1

One-way multivariate analysis of variance (MANOVA)

    Description

    d = manova1(X,group) performs a one-way multivariate analysis of variance (MANOVA) and returns an estimate d for the dimension of the space containing the group means. To perform the MANOVA, manova1 uses the factor in group and the data in X.

    d = manova1(X,group,alpha) also specifies the significance level for the MANOVA.

    [d,p] = manova1(___) also returns the p-value p corresponding to d, using any of the input argument combinations in the previous syntaxes.

    [d,p,stats] = manova1(___) also returns a structure stats containing additional MANOVA statistics.

    Examples

    Load the carbig data set.

    load carbig

    Calculate the dimension of the space containing the group mean vectors and the corresponding p-values.

    [d,p] = manova1([MPG Acceleration Weight Displacement],...
                    Origin)
    d = 
    3
    
    p = 4×1
    
        0.0000
        0.0000
        0.0075
        0.1934
    
    

    At the default 5% significance level, the first three p-values lead manova1 to reject the null hypotheses that the group mean vectors are equal, that they lie on a line, and that they lie in a plane. The fourth p-value, 0.1934, does not provide enough evidence to reject the null hypothesis that the mean vectors lie in the same 3-D space, so d is 3.

    Load the fisheriris data set.

    load fisheriris;

    The column vector species contains three iris flower species: setosa, versicolor, and virginica. The matrix meas contains four types of measurements for the flower: the length and width of sepals and petals in centimeters.

    Perform a one-way MANOVA to test the null hypothesis that the vector of means for the four measurements is the same across the three flower species. Specify a significance level of 0.01. Calculate the dimension of the space containing the mean vectors for the three flower species, the corresponding p-values, and additional statistics for the MANOVA.

    [d,p,stats] = manova1(meas,species,0.01)
    d = 
    2
    
    p = 2×1
      1.0e-07 *
    
        0.0000
        0.5786
    
    
    stats = struct with fields:
               W: [4x4 double]
               B: [4x4 double]
               T: [4x4 double]
             dfW: 147
             dfB: 2
             dfT: 149
          lambda: [2x1 double]
           chisq: [2x1 double]
         chisqdf: [2x1 double]
        eigenval: [4x1 double]
        eigenvec: [4x4 double]
           canon: [150x4 double]
           mdist: [150x1 double]
          gmdist: [3x3 double]
          gnames: {3x1 cell}
    
    

    The output shows that the vectors of means for the three species are contained in a two-dimensional space. Because both p-values are below the 0.01 significance level, manova1 rejects the null hypotheses that the three mean vectors are equal and that they lie on a single line. The stats structure contains additional statistics for the MANOVA.

    Inspect the canonical response data for the MANOVA.

    C = stats.canon
    C = 150×4
    
       -8.0618    0.3004    0.0287    0.2769
       -7.1287   -0.7867    0.8907   -0.0714
       -7.4898   -0.2654    0.1792   -0.5257
       -6.8132   -0.6706   -0.3940   -0.7182
       -8.1323    0.5145   -0.4776    0.0508
       -7.7019    1.4617   -0.4069    0.4651
       -7.2126    0.3558   -0.4843   -0.9609
       -7.6053   -0.0116   -0.2433    0.0825
       -6.5606   -1.0152   -0.0342   -1.1131
       -7.3431   -0.9473   -0.0903    0.1119
          ⋮
    
    

    Each column of C corresponds to a canonical variable, and each row contains a transformed data point corresponding to the same row in X. For more information about canonical variables, see Canonical Variables.

    Create a scatter plot using the first and second canonical variables.

    gscatter(C(:,1),C(:,2),species)

    [Figure: scatter plot of the second canonical variable against the first, grouped by species (setosa, versicolor, virginica).]

    The scatter plot shows two main clusters of data, with the measurements for setosa in one cluster and the measurements for versicolor and virginica in the other. This result also shows that the vectors of means for the three species are contained in a two-dimensional space.

    Input Arguments

    X: Data, specified as a numeric matrix with n rows, where n is the number of observations. The columns of X correspond to the elements of the multivariate means.

    Data Types: single | double

    group: Factor values, specified as a categorical, numeric, or string vector, or a cell array of character vectors. group must contain n elements, where n is the number of rows in X. Each element of group represents the factor value of the data in the corresponding row of X.

    Example: [1,2,1,3,1,...,3,1]

    Example: ["white","red","white",...,"black","red"]

    Data Types: single | double | string | cell | categorical

    alpha: Significance level for the MANOVA, specified as a scalar between 0 and 1. For more information, see Algorithms.

    Example: 0.01

    Data Types: single | double

    Output Arguments

    d: Estimate of the dimension of the space containing the mean vectors, returned as a nonnegative integer scalar. d is at most the smaller of the number of columns in X and one less than the number of factor levels in group. For more information, see Algorithms.

    p: p-values for the MANOVA, returned as a vector of nonnegative values. p contains one p-value for each dimension that manova1 tests when calculating d (dimension 0, dimension 1, and so on), so p can have more elements than the value of d. For more information, see Algorithms.

    Data Types: single | double

    stats: Additional MANOVA results, returned as a structure with the following fields.

    Field       Contents
    W           Within-groups sum of squares and cross-products matrix
    B           Between-groups sum of squares and cross-products matrix
    T           Total sum of squares and cross-products matrix
    dfW         Degrees of freedom for W
    dfB         Degrees of freedom for B
    dfT         Degrees of freedom for T
    lambda      Vector of values of the Wilks' lambda test statistic for testing whether the means have dimension 0, 1, and so on
    chisq       Transformation of lambda to an approximate chi-square distribution
    chisqdf     Degrees of freedom for chisq
    eigenval    Eigenvalues of $W^{-1}B$
    eigenvec    Eigenvectors of $W^{-1}B$, which are the coefficients for the canonical variables C, scaled so that the within-groups variance of the canonical variables is 1
    canon       Canonical variables, equal to XC*eigenvec, where XC is X with the columns centered by subtracting their means (see Canonical Variables)
    mdist       Vector of Mahalanobis distances from each point to the mean of its group
    gmdist      Matrix of Mahalanobis distances between each pair of group means

    Data Types: struct
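
    As a quick sanity check on the fields above, the following sketch verifies the standard decomposition of the total sum of squares and cross-products into within-groups and between-groups parts, and the matching degrees of freedom. The variable names sscpGap, dfGap, and TCheck are illustrative only, and the code assumes the fisheriris data set used in the examples.

```matlab
% Sketch: check that T = W + B and dfT = dfW + dfB for the iris data.
load fisheriris
[~,~,stats] = manova1(meas,species);

% Difference between W + B and T (expected to be near zero up to rounding)
sscpGap = norm(stats.W + stats.B - stats.T,'fro');

% Degrees of freedom should add up exactly
dfGap = stats.dfW + stats.dfB - stats.dfT;

% Total SSCP computed directly from the column-centered data
XC = meas - mean(meas,1);          % implicit expansion, R2016b or later
TCheck = XC'*XC;
totalGap = norm(TCheck - stats.T,'fro');

fprintf('||W + B - T|| = %g, dfW + dfB - dfT = %d, ||XC''*XC - T|| = %g\n', ...
    sscpGap,dfGap,totalGap)
```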

    More About

    Canonical Variables

    The canonical variables canon are linear combinations of the original variables that maximize the separation between groups. canon(:,1) is the linear combination of the X columns that has the maximum separation between groups. Among all possible linear combinations, canon(:,1) has the most significant F-statistic in a one-way analysis of variance (ANOVA). canon(:,2) has the maximum separation subject to it being orthogonal to canon(:,1), and so on.
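
    The relationship between canon and eigenvec described for the stats output can be reproduced by hand. The sketch below is illustrative only: canonCheck and withinVar are hypothetical names, and the last step assumes that the documented scaling is applied with the pooled within-groups variance W/dfW.

```matlab
% Sketch: rebuild the canonical variables from the centered data.
load fisheriris
[~,~,stats] = manova1(meas,species);

XC = meas - mean(meas,1);              % center each column (R2016b or later)
canonCheck = XC*stats.eigenvec;        % canonical variables, one per column

% Largest absolute difference from the values that manova1 returns
disp(max(abs(canonCheck(:) - stats.canon(:))))

% Pooled within-groups variance of each canonical variable.
% Assumption: the documented scaling makes these values close to 1.
withinVar = diag(stats.eigenvec'*stats.W*stats.eigenvec)/stats.dfW;
disp(withinVar')
```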

    Algorithms

    manova1 determines d by calculating a test statistic for each possible value of d. The formula for the test statistic is

    $$-\left(n - 1 - \frac{l + r}{2}\right)\log(\lambda),$$

    where n is the number of observations, l is the number of factor levels, r is the number of response variables, and λ is Wilks' lambda. For more information about Wilks' lambda, see Multivariate Analysis of Variance for Repeated Measures.
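
    The following sketch recomputes the test statistic from the Wilks' lambda values in stats and converts it to p-values. It assumes that the statistic has the form -(n - 1 - (l + r)/2)*log(lambda) and that the p-values are upper-tail chi-square probabilities with stats.chisqdf degrees of freedom. The names chisqCheck and pCheck are illustrative.

```matlab
% Sketch: recompute the chi-square statistic and p-values for the iris data.
load fisheriris
[~,p,stats] = manova1(meas,species);

n = stats.dfT + 1;            % number of observations used
l = stats.dfB + 1;            % number of factor levels
r = size(meas,2);             % number of response variables

% Assumed form of the statistic, one value per tested dimension
chisqCheck = -(n - 1 - (l + r)/2).*log(stats.lambda);

% Upper-tail chi-square probabilities
pCheck = chi2cdf(chisqCheck,stats.chisqdf,'upper');

% Differences from the values that manova1 returns
disp(max(abs(chisqCheck - stats.chisq)))
disp(max(abs(pCheck - p)))
```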

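    The largest possible value of d is the minimum of the number of response variables and one less than the number of factor levels. manova1 tests the dimensions 0, 1, and so on, in order. d is the smallest dimension for which the p-value is greater than the significance level specified by alpha, meaning that the null hypothesis for that dimension is not rejected. If the p-value is less than alpha for every tested dimension, d is the largest possible value.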
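
    The selection of d can be illustrated with the p-values from the carbig example. The sketch below assumes that p(k) is the p-value for the hypothesis that the means lie in a space of dimension k-1, which is consistent with the example output shown earlier. The names idx and dCheck are illustrative.

```matlab
% Sketch: derive d from the p-values and the significance level.
load carbig
X = [MPG Acceleration Weight Displacement];
alpha = 0.05;
[d,p] = manova1(X,Origin,alpha);

% Assumption: p(k) corresponds to the test of dimension k-1.
idx = find(p > alpha,1);       % first test that is not rejected
if isempty(idx)
    dCheck = numel(p);         % every test rejected: largest possible value
else
    dCheck = idx - 1;          % smallest dimension that is not rejected
end

disp(isequal(d,dCheck))
```

    With the p-values shown in the first example, the first test that is not rejected is the fourth one (p = 0.1934), which corresponds to dimension 3.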

    Alternative Functionality

    Instead of using manova1, you can create a manova object using the manova function, and then use the barttest object function to calculate the dimension of the space containing the group means. The advantages of using the manova function include:

    • Support for two-way and N-way MANOVA

    • Table support for factor and response data

    • Additional properties of the manova object, including those for the fitted MANOVA model coefficients, degrees of freedom for the error, and response covariance matrix
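
    A minimal sketch of the object-based workflow follows. It assumes a release that includes the manova object, the manova(factors,Y) calling form, and a barttest object function that returns the estimated dimension as its first output. Check the documentation for your release before relying on these signatures.

```matlab
% Sketch: one-way MANOVA on the iris data using a manova object.
load fisheriris

% Assumed calling form: factor values first, response matrix second
maov = manova(species,meas);

% Estimate the dimension of the space containing the group means
d = barttest(maov)
```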

    Version History

    Introduced before R2006a