How can I color my dendrogram plot such that the colors correspond to clusters generated by the CLUSTER function in the Statistics Toolbox?

I generate a dendrogram plot by running the following code at the MATLAB prompt:
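NumCluster = 3;                                   % number of clusters requested
rand('state', 7)                                  % legacy seeding syntax, kept for reproducibility
data = [rand(10,3); rand(10,3)+1; rand(10,3)+2];  % three groups of 10 points in 3-D
dist = pdist(data, 'euclidean');                  % pairwise Euclidean distances
link = linkage(dist, 'complete');                 % complete-linkage hierarchical tree
clust = cluster(link, 'maxclust', NumCluster);    % cluster index for each observation
[H,T,perm] = dendrogram(link, 0);                 % 0 = show every leaf node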
I would like different sections of the dendrogram plot colored such that they correspond to the clusters returned by the CLUSTER function.

Accepted Answer

MathWorks Support Team
You cannot specify the coloring in DENDROGRAM to match the clusters returned by CLUSTER in the Statistics Toolbox. To work around this limitation, you can use the "colorthreshold" option in the DENDROGRAM function as follows:
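NumCluster = 3;                                   % number of clusters requested
rand('state', 7)                                  % legacy seeding syntax, kept for reproducibility
data = [rand(10,3); rand(10,3)+1; rand(10,3)+2];  % three groups of 10 points in 3-D
dist = pdist(data, 'euclidean');                  % pairwise Euclidean distances
link = linkage(dist, 'complete');                 % complete-linkage hierarchical tree
clust = cluster(link, 'maxclust', NumCluster);    % cluster index for each observation
% Height of the merge that would reduce NumCluster clusters to NumCluster-1, minus eps
color = link(end-NumCluster+2,3)-eps;
% Color only the branches that merge below that height
[H,T,perm] = dendrogram(link, 0, 'colorthreshold', color);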
The above code works for any value of "NumCluster" from 2 up to the number of observations; outside that range the index "end-NumCluster+2" falls outside the rows of "link". The idea is to use the distance information returned by the LINKAGE function to identify a distance cut-off such that coloring the branches of the dendrogram plot below that cut-off produces the desired effect. The merge distances are returned in the third column of the "link" variable in ascending order, so "link(end-NumCluster+2,3)" is the height of the merge that would reduce the tree from "NumCluster" to "NumCluster"-1 clusters. Setting "color" just below that height leaves that merge and everything above it uncolored, while each of the "NumCluster" subtrees below it gets its own color.
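One way to sanity-check the threshold idea is to cut the tree at the same height with CLUSTER and compare the result with the 'maxclust' partition. The following is only a sketch and not part of the original answer. It assumes a release that has "rng" (used here instead of the legacy "rand('state', 7)", so the random data will differ), and it swaps in the midpoint between the two neighboring merge heights in place of subtracting "eps". The names "hBelow", "hAbove", "threshold", "clustCut" and "samePartition" are made up for illustration.

```matlab
% Sketch: check that the color threshold reproduces the CLUSTER partition.
rng(7);                                   % assumption: modern seeding, not rand('state', 7)
NumCluster = 3;
data = [rand(10,3); rand(10,3)+1; rand(10,3)+2];
link = linkage(pdist(data, 'euclidean'), 'complete');
clust = cluster(link, 'maxclust', NumCluster);

% Heights of the last merge that is kept and the first merge that is cut.
hBelow = link(end-NumCluster+1, 3);
hAbove = link(end-NumCluster+2, 3);
threshold = (hBelow + hAbove) / 2;        % midpoint variant of the eps trick

% Cut the tree at the same height and compare the two partitions.
clustCut = cluster(link, 'cutoff', threshold, 'criterion', 'distance');

% Two partitions agree if every pair of points is grouped the same way,
% regardless of how the cluster labels are numbered.
samePartition = isequal(bsxfun(@eq, clust, clust'), ...
                        bsxfun(@eq, clustCut, clustCut'));

figure;
dendrogram(link, 0, 'ColorThreshold', threshold);
```

If "samePartition" is true, the colored groups in the plot should correspond to the clusters in "clust". The midpoint needs "NumCluster" to be smaller than the number of observations, because it also reads the row below the cut.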
NOTE: The above code might not work well in situations with many repeated distance values returned in the "link" variable. This code is only provided as a guideline, and you should modify it as necessary to fit a given problem.
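The repeated-distance caveat can be tested before plotting. The sketch below is an illustration under assumptions, not part of the original answer: the toy data set (two tight pairs of points far apart) is chosen only because it forces two merges at the same height, and the names "hBelow", "hAbove" and "tieAtCut" are hypothetical.

```matlab
% Sketch: detect a tie at the cut height before trusting the coloring.
% Toy data (assumption): two pairs of points, each pair 1 apart, pairs far apart.
data = [0; 1; 10; 11];
link = linkage(pdist(data), 'complete');
NumCluster = 3;

hBelow = link(end-NumCluster+1, 3);       % last merge that should be colored
hAbove = link(end-NumCluster+2, 3);       % first merge that should stay uncolored

% If both merges sit at the same height, no threshold can separate them.
tieAtCut = (hAbove - hBelow) <= eps(hAbove);
if tieAtCut
    warning('Merges %d and %d share height %g; no threshold separates them.', ...
        size(link,1)-NumCluster+1, size(link,1)-NumCluster+2, hAbove);
end
```

Here both pairs merge at distance 1, so asking for three clusters places the cut between two merges of equal height and "tieAtCut" is true. In that situation the threshold coloring cannot show exactly three colored groups.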
1 Comment
Cam Salzberger 2016-4-15
Hello Denise,
That's a good suggestion. I do not believe that there is currently an easy way to do this. I have submitted an enhancement request for this functionality, so we may see it in a future release of MATLAB.
You can check the 'Color' property of the lines in the first output of "dendrogram". This would at least tell you which colors are in use. The lines appear to have been drawn from the top down on the plot, so the last entry in "H" is the topmost line segment. If you can order the clusters by which branch off first, you may be able to work out which lines correspond to which nodes. It's a tricky proposition though.
-Cam
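The inspection suggested in the comment can be sketched as follows. This is only an illustration under assumptions: it uses "rng" instead of the legacy seeding, a midpoint threshold instead of the "eps" version, and the names "lineColors", "usedColors" and "leafClusters" are hypothetical. It also uses the third output "perm" to read off which cluster sits under each leaf, which is an addition to what the comment describes.

```matlab
% Sketch: inspect the line colors and the leaf order of a colored dendrogram.
rng(7);                                   % assumption: modern seeding
NumCluster = 3;
data = [rand(10,3); rand(10,3)+1; rand(10,3)+2];
link = linkage(pdist(data, 'euclidean'), 'complete');
clust = cluster(link, 'maxclust', NumCluster);
threshold = (link(end-NumCluster+1,3) + link(end-NumCluster+2,3)) / 2;

figure;
[H, T, perm] = dendrogram(link, 0, 'ColorThreshold', threshold);

% One RGB row per line object in H; the unique rows are the colors in use.
lineColors = cell2mat(get(H, 'Color'));
usedColors = unique(lineColors, 'rows');

% With P = 0 every observation is its own leaf, so perm lists the observations
% from left to right and clust(perm) is the CLUSTER label under each leaf.
leafClusters = clust(perm);
disp(usedColors);
disp(leafClusters(:)');
```

Because every subtree occupies a contiguous run of leaves, "leafClusters" changes value only at cluster boundaries, which gives a direct way to match each colored block on the x-axis to a label in "clust".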
