Multiple Correspondence Analysis Based on the Burt Matrix.

multiple correspondence analysis, correspondence analysis, categorical analysis, graphical procedure
21.5K downloads
Updated 3 Jan 2009

Editor's Note: Popular File 2009

The statistical fundamentals of Correspondence Analysis (CA) are presented in the CORRAN and MCORRAN1 m-files, which you can find on this FEX author's page. CA can be extended to more than two categorical variables, in which case it is called Multiple Correspondence Analysis (MCA). CA and MCA are graphical techniques for representing the information in a two-way or higher-order multiway contingency table. Such tables contain the counts (frequencies) of items for a cross-classification of the categorical variables (Rencher, 2000).

Karl Pearson (1913) developed the antecedent of CA used by Procter & Gamble (Horst 1935). R.A. Fisher (1940) named the approach 'reciprocal averaging' because it reciprocally averages row and column percentages in table data until they are reconciled. Since reciprocal averaging was inefficient, Europeans such as Mosaier (1946) and Benzecri (1969) related table data with computer programs for principal component (factor) analysis. Burt (1953) developed MCA (homogeneity analysis) of a binary indicator matrix.

Here, MCA is applied to the Burt matrix (B), the matrix of all two-way cross-tabulations of the categorical variables. The Burt matrix has a square block on the diagonal for each variable (holding the frequencies of the categories of that variable) and a rectangular off-diagonal block for each pair of variables (the two-way contingency table for that pair). From the dual eigenanalysis, or Singular Value Decomposition (SVD), we get the squares of the singular values, which are the principal inertias.
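As a minimal illustration of the block structure described above, the following sketch builds a Burt matrix from raw categorical observations. It is written in plain Python rather than MATLAB and is not part of mcorran2; the function name burt_matrix and the toy data are assumptions made only for this example.

```python
def burt_matrix(observations):
    """Build a Burt matrix from rows of category labels (one label per variable).

    Returns (labels, burt), where labels[i] = (variable_index, category) names
    row/column i, and burt[i][j] counts the observations that fall in both
    category i and category j.
    """
    n_vars = len(observations[0])
    # Sorted category levels of each variable, in variable order.
    levels = [sorted({row[v] for row in observations}) for v in range(n_vars)]
    labels = [(v, level) for v in range(n_vars) for level in levels[v]]
    index = {label: i for i, label in enumerate(labels)}

    size = len(labels)
    burt = [[0] * size for _ in range(size)]
    for row in observations:
        # Each observation activates exactly one category per variable.
        active = [index[(v, row[v])] for v in range(n_vars)]
        for i in active:
            for j in active:
                burt[i][j] += 1
    return labels, burt


# Toy data (hypothetical): five observations on three categorical variables.
data = [
    ("a", "x", "p"),
    ("a", "y", "p"),
    ("b", "x", "q"),
    ("b", "y", "q"),
    ("a", "x", "q"),
]
labels, B = burt_matrix(data)
for label, row in zip(labels, B):
    print(label, row)
```

The diagonal blocks of the result are diagonal matrices of category frequencies, and each off-diagonal block is the contingency table of one pair of variables, so the matrix is symmetric.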

The so-called 'percentage of inertia problem' can be reduced by using the adjusted inertias procedure, also known as eigenvalue correction. An adjusted inertia is calculated only for each singular value that is >= 1/(number of variables). The adjusted inertias are expressed as a percentage of the average off-diagonal inertia, which can be obtained by direct calculation on the off-diagonal tables in the Burt matrix. Not only does the adjusted solution considerably improve the measure of fit, it also removes the inconsistency that arises when the Burt matrix is analysed. This inconsistency is due to artificial dimensions that are added because one categorical variable is coded with several columns. As a consequence, the inertia (i.e., variance) of the solution space is artificially inflated, and the percentage of inertia explained by the first dimension is therefore severely underestimated.
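The adjustment can be sketched in a few lines. The code below is an illustrative Python sketch, not the code of mcorran2; it uses the formulas as they are commonly stated following Greenacre (2006), and the function name and argument names are assumptions. It takes all non-trivial singular values of the Burt analysis (the square roots of the principal inertias), the number of variables Q and the total number of categories J.

```python
def adjusted_inertias(singular_values, n_variables, n_categories):
    """Adjusted inertias and their percentages (sketch, after Greenacre 2006).

    singular_values: all non-trivial singular values of the Burt analysis,
                     i.e. square roots of the principal inertias of B.
    n_variables:     Q, the number of categorical variables (at least 2).
    n_categories:    J, the total number of categories over all variables.
    """
    q = n_variables
    if q < 2:
        raise ValueError("at least two variables are required")

    # Total inertia of B is the sum of its principal inertias.
    total_inertia = sum(s ** 2 for s in singular_values)

    # Average off-diagonal inertia, obtained from the total inertia of B:
    #   Q/(Q-1) * (inertia(B) - (J-Q)/Q^2)
    avg_off_diagonal = (q / (q - 1)) * (total_inertia - (n_categories - q) / q ** 2)
    if avg_off_diagonal <= 0:
        raise ValueError("average off-diagonal inertia must be positive")

    # Adjust only the singular values that are >= 1/Q:
    #   (Q/(Q-1))^2 * (s - 1/Q)^2
    adjusted = [
        (q / (q - 1)) ** 2 * (s - 1 / q) ** 2
        for s in singular_values
        if s >= 1 / q
    ]
    percentages = [100.0 * a / avg_off_diagonal for a in adjusted]
    return adjusted, percentages


# Hypothetical input: two binary variables (Q = 2, J = 4), singular values 0.75 and 0.25.
adj, pct = adjusted_inertias([0.75, 0.25], n_variables=2, n_categories=4)
print(adj, pct)
```

Because the percentages are taken relative to the average off-diagonal inertia rather than the inflated total inertia of B, they are typically much larger than the unadjusted percentages, although they do not necessarily add up to 100.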

A complete explanation of the statistical fundamentals is given in Greenacre (2006).

An MCA yields only row or column coordinates, because the Burt matrix is symmetric, and each point represents a category (attribute) of one of the variables.

Syntax: function mcorran2(X)

Input:
X - Data matrix (the Burt matrix). Square and symmetric, with one row and one column for each category of each categorical variable (more than 2 variables).
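Before passing a matrix to the function, it can help to confirm that it has the structure of a Burt matrix. The following is a small Python sketch of such a check, not something mcorran2 itself is known to do; the function name and the block_sizes argument (the number of categories of each variable, in order) are hypothetical.

```python
def burt_matrix_problems(matrix, block_sizes):
    """Return a list of structural problems; an empty list means none were found.

    matrix:      list of rows (lists of counts).
    block_sizes: number of categories of each variable, in matrix order.
    """
    size = len(matrix)
    if any(len(row) != size for row in matrix):
        return ["matrix is not square"]
    if sum(block_sizes) != size:
        return ["block sizes do not add up to the matrix size"]

    problems = []
    # A Burt matrix is symmetric.
    for i in range(size):
        for j in range(i + 1, size):
            if matrix[i][j] != matrix[j][i]:
                problems.append(f"not symmetric at ({i}, {j})")

    # Each diagonal block is diagonal: two categories of the same
    # variable never occur together in one observation.
    start = 0
    for block in block_sizes:
        for i in range(start, start + block):
            for j in range(start, start + block):
                if i != j and matrix[i][j] != 0:
                    problems.append(f"non-zero off-diagonal entry in a diagonal block at ({i}, {j})")
        start += block
    return problems


# Hypothetical 6 x 6 Burt matrix for three variables with two categories each.
B = [
    [3, 0, 2, 1, 2, 1],
    [0, 2, 1, 1, 0, 2],
    [2, 1, 3, 0, 1, 2],
    [1, 1, 0, 2, 1, 1],
    [2, 0, 1, 1, 2, 0],
    [1, 2, 2, 1, 0, 3],
]
print(burt_matrix_problems(B, [2, 2, 2]))
```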

Outputs:
Complete Multiple Correspondence Analysis
The adjusted inertias table is given by default
Pair-wise dimension plots. For the vertical and horizontal lines we use the hline.m and vline.m files kindly published on FEX by Brandon Kuczenski [http://www.mathworks.com/matlabcentral/fileexchange/1039]. For connecting lines to the origin we use the plot2org file published on FEX by Jos [http://www.mathworks.com/matlabcentral/fileexchange/11337].

Cite As

Antonio Trujillo-Ortiz (2024). Multiple Correspondence Analysis Based on the Burt Matrix. (https://www.mathworks.com/matlabcentral/fileexchange/22558-multiple-correspondence-analysis-based-on-the-burt-matrix), MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with R14
Compatible with any release
Platform Compatibility
Windows macOS Linux
Version Published Release Notes
1.3.0.0

Text was improved.

1.1.0.0

An appropriate format to cite this file was added.

1.0.0.0