Scatter-plot of data in which the cluster membership is coded by colors.

Hello, I have created a dendrogram of my given data.
NumCluster = 1566;
dist = pdist(alldata, 'euclidean');
GroupsMatrix = linkage(dist, 'complete');
clust = cluster(GroupsMatrix, 'maxclust', NumCluster);
E = evalclusters(alldata,clust,'CalinskiHarabasz')
[H,T,perm] = dendrogram(GroupsMatrix, 1566, 'colorthreshold', 'default');
I now want to create a scatter plot of the data in which cluster membership is coded by color. I tried to implement it like this:
gscatter (alldata(:,1),alldata(:,2), E.OptimalY,'rbgk','xod')
Update:
The error is now gone, but every point in the scatter plot has the same color. How can I show cluster membership with different colors? And is my E chosen correctly for the number of clusters? My E has 1566 clusters, and I do not know whether that is okay.
2 Comments
the cyclist 2021-5-31
It can be difficult to diagnose issues without the data. Specifically, we don't see how E is defined. Can you upload the data, such that we can actually run your code and reproduce the error?


Answers (1)

KSSV 2021-6-1
Edited: KSSV 2021-6-1
gscatter (alldata(:,1),alldata(:,2),clust,'rbgk','xod')
Check E.OptimalY: it is empty ([]), so all the points are drawn with the same color and marker. Pass the cluster indices in clust as the grouping variable instead.
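The check described above can be sketched as follows. This is a minimal illustration rather than the poster's exact workflow: the synthetic `alldata`, the cluster count of 3, and the variable name `groups` are assumptions made so the snippet runs on its own.

```matlab
% Synthetic stand-in for alldata (assumption: the real Data.csv is not available here).
rng(0);
alldata = [randn(20,2); randn(20,2) + 5; randn(20,2) - 5];

% Same pipeline as in the question, with a small cluster count.
Z     = linkage(pdist(alldata, 'euclidean'), 'complete');
clust = cluster(Z, 'maxclust', 3);
E     = evalclusters(alldata, clust, 'CalinskiHarabasz');

% Use E.OptimalY only if it actually holds labels; otherwise fall back to clust.
if isempty(E.OptimalY)
    groups = clust;
else
    groups = E.OptimalY;
end

% One color per group, chosen by gscatter's defaults.
gscatter(alldata(:,1), alldata(:,2), groups);
```

Either branch leaves `groups` with one label per row of `alldata`, so the plot no longer depends on whether `evalclusters` filled in `OptimalY`.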
3 Comments
KSSV 2021-6-1
Edited: KSSV 2021-6-1
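alldata = csvread('Data.csv') ;                          % load the observations, one per row
NumCluster = 10; % <-----change cluster number here
dist = pdist(alldata, 'euclidean');                      % pairwise Euclidean distances
GroupsMatrix = linkage(dist, 'complete');                % complete-linkage hierarchical tree
clust = cluster(GroupsMatrix, 'maxclust', NumCluster);   % cluster index for each observation
E = evalclusters(alldata,clust,'CalinskiHarabasz') ;     % score this clustering solution
gscatter (alldata(:,1),alldata(:,2),clust)               % color the points by cluster index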
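When the number of clusters is larger than the handful of color letters in a string such as 'rbgk', one option is to pass `gscatter` a matrix of RGB triplets with one row per cluster. The following is a hedged sketch on synthetic data; the data, the cluster count of 12, and the choice of the `hsv` colormap are assumptions for illustration only.

```matlab
% Synthetic stand-in for alldata (assumption: not the poster's data).
rng(0);
alldata = rand(200, 2);

NumCluster = 12;  % illustrative value; change as needed
Z     = linkage(pdist(alldata, 'euclidean'), 'complete');
clust = cluster(Z, 'maxclust', NumCluster);

% One RGB row per cluster, so every cluster gets its own hue.
clr = hsv(NumCluster);

% Dots of size 12, legend switched off because it becomes long with many groups.
gscatter(alldata(:,1), alldata(:,2), clust, clr, '.', 12, 'off');
```

With hundreds of clusters, neighboring hues become hard to tell apart, so this mainly helps for moderate cluster counts.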
Mark S 2021-6-1
Edited: Mark S 2021-6-1
Thanks, it works fine. One additional question: how can I find an optimal number of clusters? Is it best to vary NumCluster myself, or is there another method? I found this on MATLAB Answers:
https://it.mathworks.com/matlabcentral/answers/76879-determining-the-optimal-number-of-clusters-in-kmeans-technique
klist = 2:500; % the numbers of clusters you want to try
myfunc = @(X,K)(kmeans(X, K)); % clustering function called once per candidate K
eva = evalclusters(alldata,myfunc,'CalinskiHarabasz','klist',klist)
classes = kmeans(alldata,eva.OptimalK); % re-cluster with the selected K
With this I get OptimalK = 3, but I am not sure whether that is right. Is this a valid way to calculate the optimal number of clusters?
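The same `evalclusters` sweep can also be run with the complete-linkage hierarchical clustering used earlier in this thread, instead of `kmeans`, by passing a function handle built on `clusterdata`. The sketch below is an illustration under stated assumptions: the synthetic three-blob data, the candidate range 2:8, and the handle name `hierfunc` are all made up for the example, and the selected K depends on the data.

```matlab
% Synthetic stand-in for alldata: three separated blobs (assumption).
rng(1);
alldata = [randn(50,2); randn(50,2) + 6; randn(50,2) - 6];

klist = 2:8;  % candidate numbers of clusters (illustrative range)

% Clustering function: complete-linkage hierarchical clustering cut into K clusters.
hierfunc = @(X, K) clusterdata(X, 'Linkage', 'complete', 'MaxClust', K);

% Score every K in klist with the Calinski-Harabasz criterion.
eva = evalclusters(alldata, hierfunc, 'CalinskiHarabasz', 'KList', klist);

% Because a clustering function was supplied, eva.OptimalY holds the labels
% of the best-scoring solution and can be used directly for plotting.
disp(eva.OptimalK);
gscatter(alldata(:,1), alldata(:,2), eva.OptimalY);
```

Plotting `eva.CriterionValues` against `klist` (for example with `plot(eva)`) shows how sharply the criterion peaks, which is a quick sanity check on whether the reported `OptimalK` is a clear winner or one of several near-ties.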
