How to determine the optimum number of clusters using K-Means clustering

Question

lina 2012-4-5

Direct link to this question:

https://ww2.mathworks.cn/matlabcentral/answers/34623-how-to-determine-the-optimum-number-of-cluster-use-k-mean-clustering

Answered: Umar 2024-8-8

Hi everyone,

Does anyone know how to determine the optimum number of clusters in K-Means clustering? If there is any MATLAB code for it, I would very much appreciate it.

Thanks in advance, Lina


Answer 1

Walter Roberson 2012-4-5

Direct link to this answer:

https://ww2.mathworks.cn/matlabcentral/answers/34623-how-to-determine-the-optimum-number-of-cluster-use-k-mean-clustering#answer_43432

Basically, there isn't a way, not really.

There are papers on the topic that show algorithms that have been developed. The algorithms mostly involve running K-Means with a fixed number of clusters, running it again with 1 more cluster, then again with 1 more yet, and so on, and trying to figure out the "best" point in the downturn curve of classification effectiveness.

Since, after all, you can always get a better fit by running with as many clusters as you have points (every point becomes its own cluster, with zero error), the algorithms try to find a "reasonable" stopping place where the error rate "isn't too bad" and increasing the number of clusters "doesn't help much". There is of course a lot of subjectivity in that, and it depends upon having a good measure of "error rate", which you often don't have (and which tends to vary with context when you do have it).
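As a minimal sketch of the procedure described above (not taken from any particular paper), the loop below runs kmeans for k = 1 up to an assumed upper bound maxK and records the total within-cluster sum of squared distances as the "error". The two-blob data, maxK, and the variable names are made up for illustration; kmeans requires the Statistics and Machine Learning Toolbox.

```matlab
rng(0);                                          % fix the seed so the run is repeatable
X = [randn(50,2); randn(50,2) + 6];              % assumed toy data: two well-separated blobs
maxK = 6;                                        % assumed largest number of clusters to try
wcss = zeros(maxK, 1);                           % total within-cluster sum of squares per k
for k = 1:maxK
    % Several replicates reduce the chance of a poor local optimum
    [~, ~, sumd] = kmeans(X, k, 'Replicates', 5);
    wcss(k) = sum(sumd);                         % default distance is squared Euclidean
end
plot(1:maxK, wcss, '-o');
xlabel('Number of clusters k');
ylabel('Total within-cluster sum of squares');
```

The curve generally keeps falling as k grows, so the "best" k is wherever you judge the bend to be, which is exactly the subjective part.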


Answer 2

kira 2019-5-2

Direct link to this answer:

https://ww2.mathworks.cn/matlabcentral/answers/34623-how-to-determine-the-optimum-number-of-cluster-use-k-mean-clustering#answer_373360

Old question, but I just found a way myself by looking at the MATLAB documentation:

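X = net.IW{1};                 % data to cluster (here: the input weights of a network called net)
klist = 2:n;                   % numbers of clusters to try; n is an upper bound you choose
myfunc = @(X,K) kmeans(X, K);  % clustering function handle passed to evalclusters
eva = evalclusters(X, myfunc, 'CalinskiHarabasz', 'KList', klist)  % no semicolon, so the evaluation is displayed
classes = kmeans(X, eva.OptimalK);  % final clustering with the selected number of clusters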
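For a version that runs without a trained network, here is a small sketch on synthetic data. It assumes three made-up, well-separated blobs and uses the built-in 'kmeans' option of evalclusters instead of a function handle; the data and the range 2:6 are illustrative choices only.

```matlab
rng(0);                                          % repeatable random data
X = [randn(40,2);                                % assumed toy data: blob around (0,0)
     randn(40,2) + repmat([10 0], 40, 1);        % blob around (10,0)
     randn(40,2) + repmat([0 10], 40, 1)];       % blob around (0,10)
% Evaluate k = 2..6 with the Calinski-Harabasz criterion
eva = evalclusters(X, 'kmeans', 'CalinskiHarabasz', 'KList', 2:6);
disp(eva.OptimalK);                              % k with the largest criterion value
disp(eva.CriterionValues);                       % one criterion value per inspected k
plot(eva);                                       % criterion value against k
```

On data this cleanly separated the criterion should peak at 3; on real data the peak is often much less clear.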


Answer 3

Umar 2024-8-8

Direct link to this answer:

https://ww2.mathworks.cn/matlabcentral/answers/34623-how-to-determine-the-optimum-number-of-cluster-use-k-mean-clustering#answer_1496759

Hi @lina,

To determine the optimal number of clusters in K-Means clustering, you can use the Elbow Method or the Silhouette Method. Both help identify an appropriate number of clusters from the data distribution. The example code snippet below uses kmeans_opt, a File Exchange function that implements the Elbow Method. You can download it from the following link:

https://www.mathworks.com/matlabcentral/fileexchange/65823-kmeans_opt

By following this example, you can adapt and apply the method to your own datasets and visualize the resulting clusters.

% Example code snippet
% Load or generate sample data
X = rand(100, 2); % Example data: 100 points in 2D

% Run k-means optimization (kmeans_opt is the File Exchange function linked above)
[IDX,C,SUMD,K] = kmeans_opt(X);

% Print results
fprintf('Optimal number of clusters: %d\n', K);
fprintf('Centroids:\n');
disp(C);
fprintf('Sum of distances: %.4f\n', SUMD); % prints one line per element of SUMD

% Plotting results
figure;
hold on;
gscatter(X(:,1), X(:,2), IDX);
plot(C(:,1), C(:,2), 'kx', 'MarkerSize', 15, 'LineWidth', 3);
title('K-Means Clustering Results');
xlabel('Feature 1');
ylabel('Feature 2');
% Build the legend from K rather than hard-coding three clusters
legend([compose('Cluster %d', (1:K)'); {'Centroids'}]);
hold off;

Please see attached plot

The code snippet above shows how to perform K-Means clustering: it randomly generates 100 data points in 2D and calls the custom function kmeans_opt to perform the clustering. It then displays the optimal number of clusters, the centroids, and the sum of distances, and visualizes the results with data points colored by cluster and centroids marked. Please let me know if you have any further questions.
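The Silhouette Method mentioned at the start of this answer can be sketched in a similar way using only toolbox functions. This is a rough illustration, not part of kmeans_opt: the three-blob data, the range kList, and the variable names are assumptions, and kmeans and silhouette come from the Statistics and Machine Learning Toolbox.

```matlab
rng(1);                                          % repeatable random data
X = [randn(40,2);                                % assumed toy data: three separated blobs
     randn(40,2) + repmat([8 0], 40, 1);
     randn(40,2) + repmat([0 8], 40, 1)];
kList = 2:6;                                     % assumed candidate numbers of clusters
meanSil = zeros(size(kList));                    % mean silhouette value for each k
for i = 1:numel(kList)
    idx = kmeans(X, kList(i), 'Replicates', 5);  % cluster labels for this k
    meanSil(i) = mean(silhouette(X, idx));       % average over all points
end
[~, best] = max(meanSil);                        % position of the largest mean silhouette
bestK = kList(best);
fprintf('Largest mean silhouette at k = %d\n', bestK);
```

A higher mean silhouette means points sit closer to their own cluster than to neighbouring ones, so the k with the largest value is a common, though still heuristic, choice.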
