The "cluster" command with big data: it used to run fast but now runs slowly
Hello!
I have a matrix in the variable "dat".
Number of rows = 564372
Number of columns = 11
Each row represents an observation, and I need to cluster this data. The "kmeans" command runs fast, and now I'm trying agglomerative clustering. I computed the linkage (it took about 8 hours) with the command:
Z=linkage(dat,'centroid','euclidean','savememory','on');
Then I came home and computed a few clusterings with different thresholds:
T=cluster(Z,'cutoff',1.4);
I was extremely surprised when I saw that the cluster computation took only 10-15 seconds and the result was fine. Then I saved my linkage data:
dlmwrite('Z-linkage.txt',Z);
The next day I launched MATLAB, imported Z-linkage.txt and tried to compute the clusters again. This time it runs very slowly: it may take hours, and I have no idea what the problem is.
Please help!
Thank you for any suggestions.
Answers (1)
John D'Errico
2016-11-30
Since we have absolutely nothing to go on about the actual data, I can only guess.
Clustering tools usually use random starts. That means you may get lucky sometimes and see rapid convergence.
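Another thing that might be worth checking, offered only as a guess since nothing in the thread confirms it: whether the linkage matrix survived the trip through the text file unchanged. As far as I know, dlmwrite writes only about five significant digits unless a 'precision' option is given, while the cluster indices stored in Z for 564372 observations run past one million and need seven digits. The sketch below uses a small random matrix as a stand-in for dat and a hypothetical file name Z-linkage.mat; it saves Z to a binary MAT-file, reloads it, and checks that the reloaded copy is identical before calling cluster.

```matlab
% Small synthetic stand-in for dat (the real matrix is 564372-by-11).
rng(0);
dat = rand(200, 11);

% Same linkage settings as in the question, minus 'savememory',
% which is not needed for a matrix this small.
Z = linkage(dat, 'centroid', 'euclidean');

% A binary MAT-file stores Z in full double precision.
% The file name is an assumption made for this example.
save('Z-linkage.mat', 'Z');
S = load('Z-linkage.mat');

% Confirm that the reloaded linkage matches the original exactly.
sameAfterReload = isequal(S.Z, Z);

% Cut the tree with the same call as in the question.
T = cluster(S.Z, 'cutoff', 1.4);
```

If a text file is required, passing an explicit format such as dlmwrite('Z-linkage.txt', Z, 'precision', '%.17g') should keep more digits, and comparing the imported matrix against the original with isequal would show whether anything was lost.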