Creating Discriminant Analysis Model
The model for discriminant analysis is:
Each class (Y) generates data (X) using a multivariate normal distribution. In other words, the model assumes X has a Gaussian mixture distribution (gmdistribution).
For linear discriminant analysis, the model has the same covariance matrix for each class; only the means vary.
For quadratic discriminant analysis, both means and covariances of each class vary.
Under this modeling assumption, fitcdiscr infers the mean and covariance parameters of each class.
For linear discriminant analysis, it computes the sample mean of each class. Then it computes a single pooled sample covariance by subtracting the sample mean of each class from the observations of that class and taking the empirical covariance matrix of the combined result.
For quadratic discriminant analysis, it computes the sample mean of each class. Then it computes one sample covariance per class by subtracting the sample mean of each class from the observations of that class and taking the empirical covariance matrix of each class separately.
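The two estimation procedures above can be sketched by hand in base MATLAB. This is a minimal illustration, not the internal implementation of fitcdiscr: the toy data and the variable names (mu, SigmaL, SigmaQ, R) are assumptions made for the example, and the pooled covariance is assumed to use the unbiased N - K normalization.

```matlab
% Toy data (made up for illustration): 6 observations, 2 predictors, 2 classes.
X = [1 2; 2 3; 3 4; 6 5; 7 7; 8 6];
y = [1; 1; 1; 2; 2; 2];

classes = unique(y);
K = numel(classes);
[N, p] = size(X);

mu     = zeros(K, p);        % row k holds the sample mean of class k
SigmaQ = zeros(p, p, K);     % one covariance per class (quadratic case)
R      = zeros(N, p);        % observations centered by their own class mean

for k = 1:K
    idx = (y == classes(k));
    mu(k, :) = mean(X(idx, :), 1);
    R(idx, :) = X(idx, :) - mu(k, :);   % implicit expansion (R2016b or later)
    SigmaQ(:, :, k) = cov(X(idx, :));   % cov normalizes by (n_k - 1)
end

% Linear case: a single pooled covariance from all centered observations.
SigmaL = (R' * R) / (N - K);
```

With these toy values, mu is [2 3; 7 6] and SigmaL is [1 0.75; 0.75 1]. If Statistics and Machine Learning Toolbox is available, the results can be compared against the Mu and Sigma properties of the model that fitcdiscr returns for the same data.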
The fit method does not use prior probabilities or costs for fitting.
Weighted Observations
fitcdiscr constructs weighted classifiers using the following scheme. Suppose M is an N-by-K class membership matrix:
$M_{nk} = 1$ if observation n is from class k
$M_{nk} = 0$ otherwise.
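The estimate of the class mean for unweighted data is
$$\hat{\mu}_k = \frac{\sum_{n=1}^{N} M_{nk}\, x_n}{\sum_{n=1}^{N} M_{nk}},$$
where $x_n$ is the nth observation.
For weighted data with positive weights $w_n$, the natural generalization is
$$\hat{\mu}_k = \frac{\sum_{n=1}^{N} M_{nk}\, w_n\, x_n}{\sum_{n=1}^{N} M_{nk}\, w_n}.$$
The unbiased estimate of the pooled-in covariance matrix for unweighted data is
$$\hat{\Sigma} = \frac{\sum_{n=1}^{N} \sum_{k=1}^{K} M_{nk}\, (x_n - \hat{\mu}_k)(x_n - \hat{\mu}_k)^T}{N - K}.$$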
For quadratic discriminant analysis, fitcdiscr uses K = 1.
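For weighted data, assuming the weights sum to 1, the unbiased estimate of the pooled-in covariance matrix is
$$\hat{\Sigma} = \frac{\sum_{n=1}^{N} \sum_{k=1}^{K} M_{nk}\, w_n\, (x_n - \hat{\mu}_k)(x_n - \hat{\mu}_k)^T}{1 - \sum_{k=1}^{K} \frac{W_k^{(2)}}{W_k}},$$
where
$W_k = \sum_{n=1}^{N} M_{nk}\, w_n$ is the sum of the weights for class k.
$W_k^{(2)} = \sum_{n=1}^{N} M_{nk}\, w_n^2$ is the sum of squared weights per class.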
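The weighted scheme can be sketched in base MATLAB as follows. This is an illustration under stated assumptions rather than the code fitcdiscr runs: the toy data, the weights, and the variable names (M, Wk, Wk2, mu, Sigma) are made up for the example, and the denominator is the bias correction 1 minus the sum over classes of (sum of squared weights) / (sum of weights).

```matlab
% Toy data (made up for illustration): 6 observations, 2 predictors, 2 classes.
X = [1 2; 2 3; 3 4; 6 5; 7 7; 8 6];
y = [1; 1; 1; 2; 2; 2];
w = [1; 2; 1; 1; 1; 2];
w = w / sum(w);              % normalize so the weights sum to 1

classes = unique(y);

% N-by-K class membership matrix: M(n,k) = 1 if observation n is in class k.
M = double(y == classes');   % implicit expansion (R2016b or later)

Wk  = M' * w;                % sum of weights per class (K-by-1)
Wk2 = M' * (w .^ 2);         % sum of squared weights per class (K-by-1)

% Weighted class means: row k is sum_n M(n,k)*w(n)*X(n,:) divided by Wk(k).
mu = (M' * (w .* X)) ./ Wk;

% Weighted pooled covariance with the bias-correcting denominator.
R = X - M * mu;              % center each observation by its class mean
Sigma = (R' * (w .* R)) / (1 - sum(Wk2 ./ Wk));
```

With equal weights $w_n = 1/N$, the denominator reduces to $(N - K)/N$ and Sigma reduces to the unweighted estimate with normalization N - K, which gives a quick consistency check on the sketch.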