Why does kmeans give different results each time?

I have a square binary similarity matrix that shows the social relations among users, where 0 means there is no relation between two users and 1 means there is a relation between them.
I used kmeans to do the clustering:
f1=dlmread('d:\matlab\r2011a\bin\paper_comm\link_flixster_bin1.txt');
c=kmeans(f1,3);
When I run kmeans more than once, the results are different.
For example, the first time: cluster 1 = 4448 users, cluster 2 = 434, and cluster 3 = 118.
But the second time: cluster 1 = 4880 users, cluster 2 = 119, and cluster 3 = 1.
Why are the results different?

Accepted Answer

John D'Errico 2014-12-18
kmeans uses random starting values. (READ THE HELP. I just did to verify this.) So why would you expect that the solution will be identical if the start points are not?
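To illustrate the point about random starting values, here is a minimal sketch, assuming kmeans draws its starting points from MATLAB's global random stream (the default when no 'Options' stream is supplied). The matrix X is synthetic stand-in data, not the poster's file, and the seed value 42 is arbitrary.

```matlab
% Synthetic stand-in data (hypothetical); replace X with your own matrix, e.g. f1.
rng(0);
X = [randn(50,2); randn(50,2) + 5; randn(50,2) + 10];

rng(42);                 % fix the seed of the global random stream
c1 = kmeans(X, 3);

rng(42);                 % reset to the same seed before the second run
c2 = kmeans(X, 3);

sameResult = isequal(c1, c2);   % same seed, same starting points, same labels
```

Without the two rng(42) calls, each run would draw different starting points, and the cluster labels and sizes could change from run to run.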

More Answers (1)

Chetan Rawal 2014-12-18
As John mentioned, the clustering starts from random points that the algorithm selects automatically. That is why, in optimization and machine learning problems like this, you should try multiple iterations and use a validation data set if possible. To make the results more consistent between runs, you can try to:
  • Increase number of iterations by increasing 'MaxIter'
  • Use your own starting points with the 'start' name-value pair
Starting with your own seeds, instead of seeds randomly selected by MATLAB, will ensure a consistent answer.
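The two suggestions above can be sketched as follows. This is an illustration under stated assumptions, not a recommended configuration: X is synthetic stand-in data, the starting rows 1, 51 and 101 are an arbitrary choice, and the values 10 and 200 are placeholders. The 'Replicates' option is an addition not mentioned in the answer; it reruns kmeans from several random starts and keeps the solution with the lowest total within-cluster distance.

```matlab
% Synthetic stand-in data (hypothetical); replace X with your own matrix, e.g. f1.
rng(0);
X = [randn(50,2); randn(50,2) + 5; randn(50,2) + 10];
k = 3;

% Option 1: supply your own starting centroids as a k-by-p matrix.
% Rows 1, 51 and 101 of X are an arbitrary, illustrative choice.
startCentroids = X([1 51 101], :);
cA = kmeans(X, k, 'Start', startCentroids);
cB = kmeans(X, k, 'Start', startCentroids);
sameStartSameResult = isequal(cA, cB);   % no random initialization involved

% Option 2: allow more iterations and several random restarts,
% keeping the best of the 10 replicates.
[cBest, ~, sumd] = kmeans(X, k, 'Replicates', 10, 'MaxIter', 200);
totalWithinDist = sum(sumd);             % objective value of the kept solution
```

Option 1 trades randomness for a dependence on the chosen starting points, so a poor choice can still give a poor clustering, just a repeatable one. Option 2 remains random, but its results tend to vary less because poor local minima are discarded.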
