
Dimensionality Reduction and Feature Extraction

PCA, factor analysis, feature selection, feature extraction, and more

Feature transformation methods reduce the dimensionality of the data by transforming it into new features. Feature selection methods are preferable when transformation of the variables is not possible, for example, when the data contains categorical variables. For feature selection methods that are especially suited to least-squares fitting, see Stepwise Regression.
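
As a minimal sketch of the distinction, on assumed toy data: pca replaces the predictors with new component features, while fscnca ranks the original predictors so you can keep a subset of them unchanged.

```matlab
% Sketch contrasting transformation vs. selection (toy data, assumed setup).
rng(1);                          % reproducibility
X = randn(100,5);                % 100 observations, 5 numeric predictors
y = double(X(:,1) + X(:,3) > 0); % class label driven by predictors 1 and 3

% Feature transformation: replace predictors with principal components.
[coeff,score] = pca(X);
Xreduced = score(:,1:2);         % keep the first two components

% Feature selection: keep a subset of the original predictors.
mdl = fscnca(X,y);               % NCA assigns a weight to each predictor
[~,ranked] = sort(mdl.FeatureWeights,'descend');
Xselected = X(:,ranked(1:2));    % keep the two highest-weighted predictors
```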

函数


fscnca - Feature selection using neighborhood component analysis for classification
fsrnca - Feature selection using neighborhood component analysis for regression
sequentialfs - Sequential feature selection
relieff - Rank importance of predictors using ReliefF or RReliefF algorithm
rica - Feature extraction by using reconstruction ICA
sparsefilt - Feature extraction by using sparse filtering
transform - Transform predictors into extracted features
tsne - t-Distributed Stochastic Neighbor Embedding
barttest - Bartlett's test
canoncorr - Canonical correlation
pca - Principal component analysis of raw data
pcacov - Principal component analysis on covariance matrix
pcares - Residuals from principal component analysis
ppca - Probabilistic principal component analysis
factoran - Factor analysis
rotatefactors - Rotate factor loadings
nnmf - Nonnegative matrix factorization
cmdscale - Classical multidimensional scaling
mahal - Mahalanobis distance
mdscale - Nonclassical multidimensional scaling
pdist - Pairwise distance between pairs of observations
squareform - Format distance matrix
procrustes - Procrustes analysis
procrustesProcrustes analysis

Objects

FeatureSelectionNCAClassification - Feature selection for classification using neighborhood component analysis (NCA)
FeatureSelectionNCARegression - Feature selection for regression using neighborhood component analysis (NCA)
ReconstructionICA - Feature extraction by reconstruction ICA
SparseFiltering - Feature extraction by sparse filtering

Topics

Feature Selection

Feature Selection

Learn about feature selection algorithms, such as sequential feature selection.

Neighborhood Component Analysis (NCA) Feature Selection

Neighborhood component analysis (NCA) is a non-parametric and embedded method for selecting features with the goal of maximizing prediction accuracy of regression and classification algorithms.
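
A hedged sketch of NCA feature selection on assumed toy data: fscnca learns a nonnegative weight per predictor, and predictors with near-zero weights can be dropped.

```matlab
% Sketch: NCA feature weights for classification (toy data, assumed setup).
rng(0);
X = [randn(50,4)+1; randn(50,4)-1];   % two classes separated in 4 dimensions
X = [X, randn(100,6)];                % append 6 irrelevant predictors
y = [ones(50,1); 2*ones(50,1)];

mdl = fscnca(X,y,'Lambda',0.5/100);   % regularization value chosen for illustration
w = mdl.FeatureWeights;               % near-zero weight => irrelevant predictor
relevant = find(w > 0.1)              % indices of predictors NCA considers useful
```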

Select Features for Classifying High-Dimensional Data

This example shows how to select features for classifying high-dimensional data. Specifically, it shows how to perform sequential feature selection, one of the most popular feature selection algorithms. It also shows how to use holdout and cross-validation to evaluate the classification performance of the selected features.
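
The workflow can be sketched as follows, on assumed toy data, with a cross-validated misclassification criterion driving sequentialfs:

```matlab
% Sketch: forward sequential feature selection with 5-fold CV (toy data).
rng(2);
X = randn(150,8);
y = X(:,2) - X(:,5) + 0.1*randn(150,1) > 0;   % only predictors 2 and 5 matter

% Criterion: misclassification count of a discriminant model on the test fold.
critfun = @(XT,yT,Xt,yt) sum(yt ~= predict(fitcdiscr(XT,yT),Xt));

inmodel = sequentialfs(critfun,X,y,'CV',5);
find(inmodel)                                 % columns chosen by forward selection
```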

Feature Extraction

Feature Extraction

Feature extraction is a set of methods to extract high-level features from data.

Feature Extraction Workflow

This example shows a complete workflow for feature extraction from image data.

Extract Mixed Signals

This example shows how to use rica to disentangle mixed audio signals.
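
On assumed synthetic signals rather than audio, the core idea can be sketched in a few lines: rica learns independent features from the mixtures, and transform applies them to unmix.

```matlab
% Sketch: unmix two synthetic source signals with rica (toy data, assumed setup).
rng(3);
t = (0:999)'/1000;
S = [sin(2*pi*5*t), sign(sin(2*pi*3*t))];   % two independent sources
A = [0.8 0.6; 0.4 0.9];                     % mixing matrix, unknown to rica
X = S*A';                                   % observed mixed signals

Mdl = rica(X,2);                  % learn 2 independent features
Z = transform(Mdl,X);             % recovered signals, up to scale and ordering
```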

t-SNE Multidimensional Visualization

t-SNE

t-SNE is a method for visualizing high-dimensional data by nonlinear reduction to two or three dimensions, while preserving some features of the original data.

Visualize High-Dimensional Data Using t-SNE

This example shows how t-SNE creates a useful low-dimensional embedding of high-dimensional data.
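
A minimal sketch on the built-in Fisher iris data (tsne is stochastic, so the seed is fixed for reproducibility):

```matlab
% Sketch: embed the 4-D iris measurements in 2-D with tsne.
load fisheriris                % meas: 150x4 measurements; species: labels
rng default                    % fix the seed; tsne is stochastic
Y = tsne(meas,'Perplexity',30,'NumDimensions',2);
gscatter(Y(:,1),Y(:,2),species)   % similar flowers land near each other
```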

tsne Settings

This example shows the effects of various tsne settings.

t-SNE Output Function

Output function description and example for t-SNE.

PCA and Canonical Correlation

Principal Component Analysis (PCA)

Principal Component Analysis reduces the dimensionality of data by replacing several correlated variables with a new set of variables that are linear combinations of the original variables.
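
As a brief sketch on the built-in hald dataset: pca returns the component coefficients, scores, and percent variance explained, and a low-rank reconstruction uses only the leading components.

```matlab
% Sketch: PCA on the built-in hald ingredients data (13x4 predictor matrix).
load hald
[coeff,score,latent,~,explained] = pca(ingredients);
explained                          % percent of total variance per component
% Rank-2 approximation of the data from the first two components:
Xapprox = score(:,1:2)*coeff(:,1:2)' + mean(ingredients);
```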

Analyze Quality of Life in U.S. Cities Using PCA

Perform a weighted principal components analysis and interpret the results.

Partial Least Squares Regression and Principal Components Regression

This example shows how to apply partial least squares regression (PLSR) and principal components regression (PCR), and discusses the effectiveness of the two methods. Both PLSR and PCR are methods for modeling a response variable when there are a large number of predictor variables, and those predictors are highly correlated or even collinear. Both methods construct new predictor variables, known as components, as linear combinations of the original predictors, but they construct those components in different ways. PCR creates components to explain the observed variability in the predictor variables, without considering the response variable at all. PLSR does take the response variable into account, and therefore often leads to models that can fit the response with fewer components. Whether or not that ultimately translates into a more parsimonious model, in practical terms, depends on the context.
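
The contrast can be sketched on assumed toy data: plsregress builds response-aware components, while PCR regresses the response on the leading principal component scores of the predictors alone.

```matlab
% Sketch: 3-component PLSR vs. PCR fits (toy data, assumed setup).
rng(4);
X = randn(60,10)*randn(10,10);        % correlated predictors
y = X(:,1) + randn(60,1);

% PLSR: components chosen with the response in mind.
[~,~,~,~,betaPLS] = plsregress(X,y,3);
yfitPLS = [ones(60,1) X]*betaPLS;

% PCR: regress centered y on the first 3 principal component scores of X.
[coeff,score] = pca(X);
b = score(:,1:3)\(y - mean(y));
yfitPCR = mean(y) + score(:,1:3)*b;
```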

Fitting an Orthogonal Regression Using Principal Components Analysis

This example shows how to use Principal Components Analysis (PCA) to fit a linear regression.

Factor Analysis

Factor Analysis

Factor analysis is a way to fit a model to multivariate data to estimate interdependence of measured variables on a smaller number of unobserved (latent) factors.
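
A short sketch on the built-in carbig measurements: factoran fits a two-factor model and returns the loadings (how each measured variable maps onto the latent factors) and the specific variances.

```matlab
% Sketch: two-factor model for five car measurements (built-in carbig data).
load carbig
X = [Acceleration Displacement Horsepower MPG Weight];
X = X(all(~isnan(X),2),:);        % factoran requires complete rows
[lambda,psi] = factoran(X,2);
lambda                            % loadings of each variable on the two factors
```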

Analyze Stock Prices Using Factor Analysis

Use factor analysis to investigate whether companies within the same sector experience similar week-to-week changes in stock prices.

Perform Factor Analysis on Exam Grades

This example shows how to perform factor analysis using Statistics and Machine Learning Toolbox™.

Nonnegative Matrix Factorization

Nonnegative Matrix Factorization

Nonnegative matrix factorization (NMF) is a dimension-reduction technique based on a low-rank approximation of the feature space.
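
As a minimal sketch on assumed random nonnegative data: nnmf factors the data matrix into two smaller nonnegative matrices whose product approximates it.

```matlab
% Sketch: rank-2 nonnegative factorization of a small nonnegative matrix.
rng(5);
A = rand(20,8);                 % nonnegative data: 20 observations x 8 features
[W,H] = nnmf(A,2);              % A is approximated by W*H, W (20x2), H (2x8)
residual = norm(A - W*H,'fro')  % quality of the low-rank approximation
```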

Perform Nonnegative Matrix Factorization

Perform nonnegative matrix factorization using the multiplicative and alternating least-squares algorithms.

Multidimensional Scaling

Multidimensional Scaling

Multidimensional scaling allows you to visualize how near points are to each other for many kinds of distance or dissimilarity metrics and can produce a representation of data in a small number of dimensions.

Classical Multidimensional Scaling

Use cmdscale to perform classical (metric) multidimensional scaling, also known as principal coordinates analysis.
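
A brief sketch of the idea on assumed toy points: given only the pairwise distances, cmdscale recovers coordinates that reproduce them, and the eigenvalues indicate how many dimensions are needed.

```matlab
% Sketch: recover 2-D coordinates from pairwise distances alone (toy data).
rng(6);
P = rand(10,2);                 % true 2-D points, unknown to cmdscale
D = pdist(P);                   % pairwise Euclidean distances
[Y,e] = cmdscale(D);            % coordinates that reproduce those distances
e                               % only the first two eigenvalues are substantial
```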

Classical Multidimensional Scaling Applied to Nonspatial Distances

This example shows how to perform classical multidimensional scaling using the cmdscale function in Statistics and Machine Learning Toolbox™.

Nonclassical Multidimensional Scaling

This example shows how to visualize dissimilarity data using nonclassical forms of multidimensional scaling (MDS).

Nonclassical and Nonmetric Multidimensional Scaling

Perform nonclassical multidimensional scaling using mdscale.

Procrustes Analysis

Procrustes Analysis

Procrustes analysis minimizes the differences in location between compared landmark data using the best shape-preserving Euclidean transformations.
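
A minimal sketch on assumed toy landmarks: a rotated, scaled, shifted copy of a shape is aligned back onto the original, and the returned dissimilarity is near zero because the two shapes match exactly under a similarity transformation.

```matlab
% Sketch: align a rotated, scaled, shifted copy of a shape onto the original.
rng(7);
X = rand(6,2);                                % landmark coordinates of a shape
theta = pi/4;
R = [cos(theta) -sin(theta); sin(theta) cos(theta)];
Y = 0.5*(X*R) + 0.1;                          % rotated, shrunk, shifted copy
[d,Z] = procrustes(X,Y);                      % Z: Y transformed to best match X
d                                             % dissimilarity; near 0 here
```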

Compare Handwritten Shapes Using Procrustes Analysis

Use Procrustes analysis to compare two handwritten numerals.