One-dimensional Kullback-Leibler divergence of two independent data groups to measure class separability
relativeEntropy is a function used in code generated by Diagnostic Feature Designer.

Z = relativeEntropy(X,I) calculates the one-dimensional Kullback-Leibler divergence of two independent subsets of data set X that are grouped according to the logical labels in I. The relative entropy provides a metric for ranking features according to their ability to separate two classes of data, such as healthy and faulty machines. The entropy calculation assumes that the data in X follows a Gaussian distribution.

Code that is generated by Diagnostic Feature Designer uses relativeEntropy when ranking features with this method.
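The following sketch shows one way to call relativeEntropy on a single feature, using synthetic values for illustration. The sample values, group sizes, variable names, and the row orientation of X and I are assumptions made for this sketch, not part of the generated code.

rng('default')                            % make the synthetic values repeatable
healthyVals = 0.5 + 0.1*randn(1,10);      % feature values from 10 healthy machines (assumed)
faultyVals  = 1.5 + 0.1*randn(1,10);      % feature values from 10 faulty machines (assumed)
X = [healthyVals faultyVals];             % 1-by-20 vector of samples for one feature
I = logical([zeros(1,10) ones(1,10)]);    % class labels: 0 = healthy, 1 = faulty
Z = relativeEntropy(X,I)                  % larger Z indicates better class separability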
X — Data samples to group

Data set containing data samples that can be logically classified into two groups, specified as a vector when you have a single set of samples, such as values for one feature, and a matrix when you have multiple sets of samples.

If X contains a single set of n samples, such as the values of one feature extracted from n data sources, X is a 1-by-n vector.
If X contains m sets of samples, X is an m-by-n matrix. Each row in X represents one data source and must correspond to a single logical class.
X must contain at least two rows that correspond to the logical class label 0 and at least two rows that correspond to the label 1 to calculate legitimate relative entropy values.
For example, suppose that you have a set of five features for each of 20 gearboxes and you are computing the relative entropy to assess these features. X is then a 20-by-5 matrix. Each row represents a gearbox that is either healthy or faulty, as indicated by the associated logical class label of 0 or 1. At least two gearboxes must be healthy and at least two gearboxes must be faulty. The relative entropy indicates how well each feature separates the data for the healthy gearboxes from the data for the faulty gearboxes.
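As a sketch of this setup, the following lines assemble such a matrix from synthetic feature values. The offsets, noise levels, and the 9-healthy/11-faulty split (taken from the label example below) are illustrative assumptions.

rng('default')                                     % repeatable synthetic data
healthyFeatures = randn(9,5);                      % 5 features for 9 healthy gearboxes
faultyFeatures  = randn(11,5) + [2 0.5 0 1 0.1];   % shift each feature by an assumed amount for 11 faulty gearboxes
X = [healthyFeatures; faultyFeatures];             % 20-by-5 data matrix, one row per gearbox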
I — Logical classification label

Logical classification label that assigns the rows in X to one of two logical classes, specified as a vector of length m, where m is the number of rows in X.
For example, suppose once more that X is a 20-by-5 matrix corresponding to 20 gearboxes. The first 9 gearboxes are healthy. The remaining 11 gearboxes are faulty. Define the healthy state as 0 and the faulty state as 1. I then has a length of 20. The first 9 labels in I are equal to 0 and the remaining 11 labels are equal to 1.
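One minimal way to construct this label vector is shown below; the column orientation is an assumption, and any length-20 vector with one label per row of X serves the same purpose.

I = logical([zeros(9,1); ones(11,1)]);   % 20-by-1 labels: first 9 healthy (0), remaining 11 faulty (1)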
Z — Relative entropy

Relative entropy of two labeled groups, returned as a scalar or a vector.

If X is a vector, then Z is a scalar. If X is a matrix, then relativeEntropy calculates the distance separately for each feature. Z is then a vector of length n, where n is the number of columns in X.

relativeEntropy treats NaN entries in X as missing values and ignores them.
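Continuing the gearbox sketch from the input argument sections above, the following lines compute one relative entropy value per feature and sort the features from most to least separating. The variable names and the inserted NaN are illustrative assumptions.

Z = relativeEntropy(X,I);                  % one relative entropy value per feature (column)
[Zsorted,featureRank] = sort(Z,'descend'); % rank features, best separator first
featureRank                                % column indices of X, in ranking order
X(3,2) = NaN;                              % NaN entries are treated as missing and ignored
ZwithMissing = relativeEntropy(X,I);       % ranking still works with the missing value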
References

[1] Theodoridis, Sergios, and Konstantinos Koutroumbas. Pattern Recognition, 175–177. 2nd ed. Amsterdam; Boston: Academic Press, 2003.