cophenet
Cophenetic correlation coefficient
Syntax
c = cophenet(Z,Y)
[c,d] = cophenet(Z,Y)
Description
c = cophenet(Z,Y) computes the cophenetic correlation coefficient for the hierarchical cluster tree represented by Z. Z is the output of the linkage function. Y contains the distances or dissimilarities used to construct Z, as output by the pdist function. Z is a matrix of size (m–1)-by-3, with distance information in the third column. Y is a vector of size m*(m–1)/2.
[c,d] = cophenet(Z,Y) returns the cophenetic distances d in the same lower triangular distance vector format as Y.
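The vector format of d can be easier to inspect as a square matrix. The following is a minimal sketch, assuming the Statistics and Machine Learning Toolbox is available; the sample data and variable names are illustrative only. It uses squareform to expand d so that D(i,j) holds the cophenetic distance between observations i and j.

```matlab
rng(2);                      % fix the seed so the run is repeatable
X = rand(5,2);               % 5 illustrative observations (m = 5)
Y = pdist(X);                % 1-by-10 vector, m*(m-1)/2 entries
Z = linkage(Y,'average');    % 4-by-3 cluster tree, (m-1)-by-3

[c,d] = cophenet(Z,Y);       % d has the same size and ordering as Y

D = squareform(d);           % 5-by-5 symmetric matrix, zeros on the diagonal
D(2,4)                       % cophenetic distance between observations 2 and 4
```

Because d follows the pdist ordering, element k of d and element k of Y always refer to the same pair of observations.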
The cophenetic correlation for a cluster tree is defined as the linear correlation coefficient between the cophenetic distances obtained from the tree and the original distances (or dissimilarities) used to construct the tree. Thus, it is a measure of how faithfully the tree represents the dissimilarities among observations.
The cophenetic distance between two observations is represented in a dendrogram by the height of the link at which those two observations are first joined. That height is the distance between the two subclusters that are merged by that link.
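As a small worked illustration of this definition, consider four points on a line clustered with single linkage. This is a sketch with made-up data, assuming the Statistics and Machine Learning Toolbox is available. Points 1 and 2 join at height 1, points 3 and 4 join at height 2, and the two resulting clusters join at height 4 (the gap between points 2 and 3), so every pair that spans the two clusters has cophenetic distance 4.

```matlab
X = [0; 1; 5; 7];            % four illustrative observations on a line
Y = pdist(X);                % pairs (1,2),(1,3),(1,4),(2,3),(2,4),(3,4)
                             % Y = [1 5 7 4 6 2]
Z = linkage(Y,'single');     % merge heights 1, 2, and 4 in Z(:,3)

[c,d] = cophenet(Z,Y);
d                            % d = [1 4 4 4 4 2]
```

Note that the original distances 5, 7, 4, and 6 between the two groups are all replaced by the single merge height 4, which is why c is generally less than 1.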
The output value, c, is the cophenetic correlation coefficient. The magnitude of this value should be very close to 1 for a high-quality solution. This measure can be used to compare alternative cluster solutions obtained using different algorithms.
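One way to carry out such a comparison is to build trees from the same distances with several linkage methods and collect c for each. The following is a sketch only, assuming the Statistics and Machine Learning Toolbox is available; the data, the seed, and the choice of methods are illustrative, and the name linkageMethods is hypothetical.

```matlab
rng(0);                                        % fix the seed for repeatability
X = [rand(10,3); rand(10,3)+1; rand(10,3)+2];  % three loose groups of points
Y = pdist(X);

linkageMethods = {'single','complete','average'};
c = zeros(1,numel(linkageMethods));

for k = 1:numel(linkageMethods)
    Z = linkage(Y,linkageMethods{k});          % tree for this method
    c(k) = cophenet(Z,Y);                      % how well it preserves Y
end

% Report the method whose tree best preserves the original distances
[cBest,kBest] = max(c);
fprintf('%s: c = %.4f\n',linkageMethods{kBest},cBest);
```

A higher c indicates only that the tree reproduces the pairwise distances more faithfully; it does not by itself show that the resulting clusters are meaningful.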
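The cophenetic correlation between Z(:,3) and Y is defined as

c = \frac{\sum_{i<j} (Y_{ij} - y)(Z_{ij} - z)}{\sqrt{\sum_{i<j} (Y_{ij} - y)^2 \sum_{i<j} (Z_{ij} - z)^2}}

where:
Yij is the distance between objects i and j in Y.
Zij is the cophenetic distance between objects i and j, from Z(:,3).
y and z are the averages of the distances Yij and of the cophenetic distances Zij, respectively.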
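The definition can be checked numerically by evaluating the linear correlation by hand and comparing it with the value returned by cophenet. This is a minimal sketch, assuming the Statistics and Machine Learning Toolbox is available; the data and the name cManual are illustrative. It takes z to be the mean of the cophenetic distances d returned by cophenet.

```matlab
rng(1);                      % fix the seed for repeatability
X = rand(8,2);               % 8 illustrative observations
Y = pdist(X);
Z = linkage(Y,'average');

[c,d] = cophenet(Z,Y);       % d holds the cophenetic distance for each pair

y = mean(Y);                 % average original distance
z = mean(d);                 % average cophenetic distance

% Linear (Pearson) correlation between Y and d, written out term by term
cManual = sum((Y - y).*(d - z)) / sqrt(sum((Y - y).^2) * sum((d - z).^2));

abs(c - cManual)             % expected to be at round-off level
```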
Examples
X = [rand(10,3); rand(10,3)+1; rand(10,3)+2];
Y = pdist(X);
Z = linkage(Y,'average');

% Compute Spearman's rank correlation between the
% dissimilarities and the cophenetic distances
[c,D] = cophenet(Z,Y);
r = corr(Y',D','type','spearman')

r =
    0.8279
Version History
Introduced before R2006a
See Also
cluster | dendrogram | inconsistent | linkage | pdist | squareform