Kmeans failed to converge after 10 million iterations! (BIG DATA)

Hello everyone,
I'm working with a matrix X (11 000 000 x 27, single precision).
I want to cluster my data into k clusters, where k can be any integer from 4 to 20. I have standardised my data because the variables have very different units, and I'm using the kmeans() function in MATLAB R2018a.
X = bsxfun(@minus, X, mean(X));
X = bsxfun(@rdivide,X,std(X));
rng(0) % for repeatability
[km_ind,~,sumd] = kmeans(X,k,'MaxIter',10000000,'Replicates',5);
I have tried 'MaxIter' values of up to 10 million, but I still don't get convergence. I have tried different values of k and the warning message does not change. Sometimes the warning appears within a matter of seconds, and I doubt that 10 million iterations were done in a couple of seconds.
Warning: Failed to converge in 10000000 iterations during replicate 1.
In kmeans/loopBody (line 476)
In internal.stats.parallel.smartForReduce (line 136)
In kmeans (line 343)
What am I missing? What am I doing wrong? Any suggestions?
Thanks very much
EDIT 1: I have uploaded the first 60 000 observations of my data (already standardised). I have the same problem when clustering this subset: it does not converge after 10 million iterations.
EDIT 2: New information: I've compared the clustering results using:
- 10^9 iterations (a thousand million iterations!)
- 10^8 iterations
- 10^7 iterations
- 10^5 iterations
- 10^4 iterations
- 10^3 iterations
- 10^2 iterations
- 10 iterations
and some more values between 10 and 50 iterations. Although I always receive the non-convergence warning, the result actually stops changing somewhere between the 15th and 20th iterations. What could make MATLAB issue the non-convergence warning even when the results appear to have converged?
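One way to narrow this down, offered as a sketch rather than a confirmed fix: have kmeans print its per-iteration progress, so the number of iterations it really performs is visible, and repeat the run on a double-precision copy of the data to see whether the single type plays any role. The synthetic X below is only a stand-in for the real data, and the idea that single precision matters is an assumption to test, not an established cause.

```matlab
% Stand-in data (assumption): replace with the real standardised matrix.
rng(0)                                  % for repeatability
X = single(randn(6000, 27));

% Sanity check: a constant column has std = 0, and dividing by it during
% standardisation would leave NaN or Inf values in X.
assert(all(isfinite(X(:))), 'X contains NaN or Inf values');

% 'Display','iter' prints one line per iteration, which shows how many
% iterations actually ran before any warning appears.
k = 4;
rng(1)
[idxS, ~, sumdS] = kmeans(X, k, 'MaxIter', 1000, 'Display', 'iter');

% Same call on a double-precision copy, with the same seed, for comparison.
rng(1)
Xd = double(X);
[idxD, ~, sumdD] = kmeans(Xd, k, 'MaxIter', 1000, 'Display', 'iter');

fprintf('Total within-cluster distance: single %.6g, double %.6g\n', ...
        sum(sumdS), sum(sumdD));
```

If the double-precision run stops without a warning while the single-precision run does not, that points at numerical precision rather than at the data or the choice of k.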
  3 comments
Ame ZL 2018-8-3
Edited: Ame ZL 2018-8-3
Hi,
I've just uploaded the first 60 000 observations (called X in the code below).
Here is the exact code that I just ran on this subset. In less than a second it returned results, together with the warning that it failed to converge after 10 million iterations.
rng(7) % for repeatability
[km_ind,~,sumd] = kmeans(X,4,'MaxIter',10000000);
Thanks for your help
Ame ZL 2018-8-3
New information:
I've compared the clustering results using:
- 10^9 iterations (a thousand million iterations!)
- 10^8 iterations
- 10^7 iterations
- 10^5 iterations
- 10^4 iterations
- 10^3 iterations
- 10^2 iterations
- 10 iterations
and some more values between 10 and 50 iterations. Although I always receive the non-convergence warning, the result actually stops changing somewhere between the 15th and 20th iterations.
What could make MATLAB issue the non-convergence warning even when the results appear to have converged?

Answers (1)

Image Analyst 2018-8-2
What makes you think there are 4 to 20 clusters? Any basis to justify that belief?
If you have some that you think are in different clusters, then use them as training points and try k nearest neighbors. I believe, from the nature of KNN, it must converge. Or try random forest, which is kind of like an ad hoc big if-then-else statement.
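As a rough illustration of the k-nearest-neighbours suggestion: kNN is a supervised classifier, so it needs observations whose group is already known, and it is not iterative, so there is no convergence criterion to fail. The training points, labels and the choice of 5 neighbours below are all hypothetical stand-ins, not values from this thread.

```matlab
% Hypothetical labelled training points: two groups in 27 dimensions.
rng(0)
Xtrain = [randn(50, 27) - 2; randn(50, 27) + 2];
ytrain = [ones(50, 1); 2 * ones(50, 1)];

% k-nearest-neighbour classifier; 5 neighbours is an arbitrary choice here.
mdl = fitcknn(Xtrain, ytrain, 'NumNeighbors', 5);

% Assign unlabelled observations to the existing groups.
Xnew = [randn(10, 27) - 2; randn(10, 27) + 2];
labels = predict(mdl, Xnew);
disp(labels')
```

This only helps when some observations can be labelled reliably beforehand. Without labels, the problem remains an unsupervised clustering problem.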
  5 comments
xiaoyu Guo 2020-12-30
Hi, friend. Thank you for your response. This is my code and the warning message ^..^
locations = [-50.5495 -50.5495 50.5495 -50.5495 -50.5495
-50.5495 -50.5495 -50.5495 50.5495 39.3427
50.5495 -50.5495 -50.5495 50.5495 -61.1174
-50.5495 50.5495 -50.5495 -50.5495 -50.5495
-50.5495 50.5495 50.5495 50.5495 -50.5495
-50.5495 -50.5495 0.4623 -50.5495 50.5495
-51.5586 -51.3403 18.0984 51.6833 51.3682
-48.6515 50.5495 -50.5495 50.5495 -50.5495
-50.5495 50.5495 50.5495 -50.5495 -50.5495
50.6331 50.9448 -50.8686 51.1881 50.7382
50.5495 -50.5495 -50.5495 -50.5495 50.5495
50.5495 -50.5495 -50.5495 50.5495 50.5495
-50.5495 -50.5495 -50.5495 -50.5495 -50.5495
-50.5495 50.5495 -50.5495 50.5495 50.5495
50.5495 50.5495 50.5495 50.5495 50.5495
50.5495 -50.5495 50.5495 -50.5495 -50.5495
-50.5495 -50.5495 50.5495 -50.5495 50.5495
-48.3667 -48.3661 -50.5495 48.3704 -50.5495
-50.5495 50.5495 50.5495 -50.5495 -50.5495
-50.5495 50.5495 50.5495 50.5495 50.5495
-49.8511 50.5495 50.4808 -49.9866 50.5495
50.5495 -50.5495 -50.5495 -50.5495 -50.5495
50.5495 50.5495 50.5495 50.5495 50.5495
50.5495 -50.5495 50.5495 50.5495 -50.5495
-50.5495 50.5495 50.5495 50.5495 -50.5495
-50.7609 50.7658 -50.9170 -50.6149 -50.7334
-50.5495 50.5495 -50.5495 50.5495 50.5495
50.5495 50.5495 50.5495 50.5495 -50.5495
-50.5495 50.5495 50.5495 -50.5495 -50.5495
50.5495 -50.5495 50.5495 -50.5495 -50.5495
-50.5495 -18.2567 50.5495 50.5495 -50.5495
50.5495 50.5495 50.5495 -50.5495 -50.5495
50.5495 50.5495 50.5495 50.5495 -50.5495
-50.5495 -50.5495 -50.5495 -50.5495 50.5495
-50.5495 -50.5495 50.5495 50.5495 -5.5231
50.5495 -50.5495 -50.5495 -50.5495 -50.5495
50.5495 50.5495 50.5495 -50.5495 50.5495
-48.4194 -48.4162 -48.4178 -48.4253 48.4252
50.5495 50.5495 50.5495 -50.5495 50.5495
-50.5495 50.5495 -50.5495 -50.5495 50.5495
-50.5495 -50.5495 -50.5495 -50.5495 50.5495
-50.5495 50.5495 50.5495 50.5495 -50.5495
49.7824 -50.5495 49.7925 50.5495 50.5495
-50.5495 50.5495 50.5495 -50.5495 -50.5495
50.6832 -50.6784 -50.9326 -50.6356 -50.7521
50.5495 -50.5495 50.5495 -50.5495 -50.5495
50.5495 -50.5495 50.5495 50.5495 50.5495
-50.5495 -50.5495 -50.5495 50.5495 -50.5495
-50.5495 50.5495 50.5495 50.5495 50.5495
-50.5495 -50.5495 -50.5495 50.5495 -50.5495
-50.5495 -50.5495 -50.5495 -50.5495 -50.5495
-50.6518 -50.5495 -50.5495 50.8647 50.5495
-50.5495 -50.5495 -50.5495 -50.5495 -50.5495
50.5495 -50.5495 50.5495 50.5495 50.5495
50.5495 -50.5495 -50.5495 -50.5495 -50.5495
-50.5495 -50.5495 -50.5495 50.5495 -50.5495
50.5495 -50.5495 50.5495 -50.5495 50.5495
-50.5495 -50.5495 -50.5495 -50.5495 -50.5495
-50.3267 -50.5073 -50.3652 50.5495 50.5495
50.5495 -53.1569 -53.0801 -53.0489 -49.2770
50.5495 50.5495 -50.5495 50.5495 -50.5495
-50.5495 50.5495 50.5495 -50.5495 -50.5495
50.5495 -50.5495 -50.5495 -50.5495 50.5495
-50.5495 50.5495 50.5495 -50.5495 -50.5495
-51.3052 -51.3003 51.2401 -51.2463 51.1772
-50.5495 50.5495 50.5495 -50.5495 -50.5495
-22.3064 -50.5495 -50.5495 -50.5495 50.5495
-50.5495 50.5495 -50.5495 -50.5495 -50.5495
-50.5495 50.5495 -50.5495 50.5495 -50.5495
-50.5495 50.5495 -50.5495 -50.5495 50.5495
50.5495 50.5495 50.5495 50.5495 -50.5495
-50.5495 -50.5495 50.5495 -50.5495 -50.5495
-50.5495 -50.5495 -50.5495 -50.5495 -50.5495
-50.5495 -50.5495 -50.5495 -50.5495 -50.5495
-50.5495 50.5495 -50.5495 -50.5495 -50.5495
-50.5495 50.5495 -50.5495 -50.5495 -50.5495
-50.5495 50.5495 -50.5495 -50.5495 -50.5495
-51.4373 5.0256 -50.5495 -50.5495 -50.5495
-50.5495 50.5495 -50.5495 50.5495 50.5495
-50.5495 -50.5495 50.5495 -50.5495 -50.5495
50.5495 50.5495 50.5495 50.5495 50.5495
50.5495 -50.5495 50.5495 -50.5495 50.5495
50.5495 50.5495 -50.5495 -50.5495 -50.5495
53.1362 -53.3025 -53.7331 53.1591 53.4506
-50.5495 -50.5495 -50.5495 -50.5495 -50.5495
-50.5495 -50.5495 -50.5495 -50.5495 50.5495
9.5550 50.0806 50.0519 -50.5495 -49.9163
50.5495 -50.5495 50.5495 50.5495 -50.5495
-50.5495 -50.5495 50.5495 -50.5495 -50.5495
50.5495 50.5495 50.5495 50.5495 -50.5495
-50.5495 50.5495 -15.4506 -50.5495 50.5495
-51.1604 -51.0961 -51.2585 51.1909 -51.1432
-50.5495 50.5495 -50.5495 50.5495 -50.5495
50.5495 50.5495 -50.5495 50.5495 -50.5495
-50.5358 50.5495 -50.5495 50.5495 -50.5495
-50.5495 50.5495 50.5495 50.5495 -50.5495
50.5495 -50.5495 50.5495 -50.5495 -50.5495
-50.5495 10.8325 -50.5495 26.5040 -50.5495
-50.5495 -50.5495 -50.5495 -50.5495 50.5495
-50.5495 -50.5495 50.5495 -50.5495 -50.5495
-50.5495 -50.5495 -50.5495 -50.5495 -50.5495
-50.5495 50.5495 -50.5495 -50.5495 -50.5495
50.5495 -50.5495 -50.5495 50.5495 -50.5495
50.5495 -50.5495 -50.5495 50.5495 50.5495
-50.5495 50.5495 50.5495 50.5495 -50.5495
50.5495 -50.5495 -50.5495 50.5495 -50.5495
-50.5495 20.8873 -50.5495 -50.5495 50.5495
-49.4051 49.8725 49.9978 -50.5495 49.4987
-50.5495 -50.5495 -50.5495 -50.5495 50.5495
50.5495 50.5495 -50.5495 50.5495 -50.5495
-50.5495 50.5495 -50.5495 -50.5495 -50.5495
-50.5495 -50.5495 -50.5495 50.5495 -50.5495
50.5495 50.5495 50.5495 50.5495 50.5495
44.4379 50.5495 -49.1841 50.5495 50.5495
-50.5495 -50.5495 -50.5495 -50.5495 -50.5495
-50.5495 -50.5495 -50.5495 50.5495 -50.5495
50.7957 3.4591 50.9267 17.5614 -51.0476
-50.5495 -50.5495 -50.5495 -50.5495 50.5495
-50.5495 -50.5495 -50.5495 -50.5495 50.5495
50.5495 -50.5495 50.5495 -50.5495 50.5495
-50.5495 50.5495 -50.5495 -50.5495 50.5495
-50.5495 -50.5495 50.5495 -50.5495 -50.5495
51.6287 -51.6654 27.4835 -80.6303 -51.7033
-50.5495 -50.5495 -50.5495 -50.5495 -50.5495
-50.5495 -50.5495 -50.5495 -50.5495 50.5495
50.5495 34.3298 -50.5495 50.5495 -50.5495
-50.5495 -50.5495 50.5495 -50.5495 -41.0029
-52.1724 52.0889 -52.0813 -1.3367 -52.0970
-50.5296 -50.5495 -50.4764 50.5495 -50.5495
-50.5495 50.5495 50.5495 -50.5495 50.5495
50.5495 -50.5495 -50.5495 50.5495 -50.5495
-50.5495 -50.5495 -50.5495 -50.5495 -50.5495
-50.5495 -50.5495 -50.5495 50.5495 50.5495
-50.5495 50.5495 -50.5495 -50.5495 -50.5495
-50.5495 50.5495 50.5495 -50.5495 50.5495
50.5495 50.5495 -50.5495 -50.5495 50.5495
-51.6331 -51.6138 51.5852 -51.6056 51.6199
50.5495 -50.5495 -50.5495 50.5495 50.5495
-50.5495 50.5495 -50.5495 -50.5495 -50.5495
-50.5495 -50.5495 -50.5495 -50.5495 50.5495
-50.5495 -50.5495 -50.5495 -50.5495 -50.5495
-50.5495 -50.5495 -50.5495 50.5495 50.5495
-47.7350 50.5495 -50.5495 50.5495 50.5495
-51.5267 -51.3029 -51.4251 51.6544 51.3315
-50.5495 50.5495 -6.4264 50.5495 50.5495
50.5495 -50.5495 -49.2439 49.6435 -50.5495
-50.5495 -50.5495 -50.5495 -50.5495 -50.5495
-51.0637 -50.9902 -51.1759 51.0985 -51.0440
50.5495 -50.5495 -50.5495 50.5495 -50.5495
50.5495 -50.5495 -50.5495 -50.5495 50.5495
-50.5495 -50.5495 50.5495 50.5495 50.5495
-50.5495 -50.5495 50.5495 -50.5495 50.5495
-50.5495 50.5495 50.5495 50.5495 -50.5495
-50.5495 -50.5495 -50.5495 50.5495 -50.5495
-50.7550 -50.6520 -50.9120 50.8037 -50.7274
50.5495 -52.7135 50.5495 50.5495 50.5495
-50.5495 50.5495 -50.5495 50.5495 -50.5495
-50.5495 50.5495 50.5495 -50.5495 -50.5495
-54.0862 50.5495 5.1698 50.5495 21.2564
-50.5495 -50.5495 -50.5495 50.5495 -50.5495
-50.5495 -50.5495 -50.5495 -50.5495 -1.6591
50.5495 -48.4415 -50.5495 50.5495 -50.5495
-50.5495 50.5495 50.5495 50.5495 50.5495
50.5495 -50.5495 -50.5495 50.5495 -50.5495
-50.5495 50.5495 50.5495 -50.5495 50.5495
50.5495 50.5495 -50.5495 50.5495 50.5495
-50.5495 50.5495 -50.5495 -50.5495 -50.5495
-50.5495 -50.5495 50.5495 -50.5495 -50.5495
50.5495 -50.5495 50.5495 50.5495 -50.5495
-50.5495 -50.5495 -50.5495 50.5495 -50.5495
52.7802 52.7886 52.7840 -52.7754 52.7830
51.6766 -51.6754 -50.5495 50.5495 51.6828
-50.5495 50.5495 -50.5495 -50.5495 -50.5495
50.5495 50.5495 -50.5495 -50.5495 -50.5495
-50.5495 50.5495 -50.5495 50.5495 -50.5495
-50.5495 50.5495 50.5495 -50.5495 50.5495
-50.5495 -50.5495 -50.5495 -50.5495 50.5495
-50.5495 50.5495 -50.5495 -50.5495 -50.5495
-50.5495 50.5495 50.5495 50.5495 50.5495
-50.5495 -50.5495 -50.5495 -50.5495 50.5495
-50.5495 50.5495 -50.5495 -50.5495 -50.5495
50.5495 -50.5495 50.5495 50.5495 -50.5495
-50.5495 -50.5495 -50.5495 -50.5495 50.5495
50.5495 -47.4320 -50.5495 50.5495 50.5495
-50.5495 50.5495 -50.5495 -50.5495 -50.5495
-50.5495 -50.5495 -50.5495 -50.5495 -50.5495
50.5495 -50.5495 -50.5495 -50.5495 -50.5495
-50.5495 -50.5495 50.5495 -50.5495 -50.5495
-50.5495 50.5495 -50.5495 -50.5495 50.5495
-50.5495 50.5495 -50.5495 -50.5495 50.5495
50.5495 50.5495 50.5495 50.5495 50.5495
50.5495 -50.5495 50.5495 50.5495 50.5495
50.5495 50.5495 -50.5495 50.5495 -50.5495
-50.5495 50.5495 -50.5495 -50.5495 -50.5495
-34.6435 -51.7562 -51.8423 -36.6610 51.7763
50.5495 50.5495 -50.2185 50.5495 -50.5495
-50.5495 -50.5495 -50.5495 50.5495 50.5495
50.5495 -50.5495 -50.5495 -50.5495 50.5495
-52.2000 -52.1884 -50.5495 -50.5495 -52.2461
50.5495 50.5495 -50.5495 -50.5495 50.5495
-50.5495 50.5495 -50.5495 -50.5495 -50.5495
50.5495 -50.5495 -50.5495 -50.5495 -50.5495
-50.5495 50.5495 -50.5495 -50.5495 -50.5495
50.5495 50.5495 -50.5495 50.5495 -50.5495
50.5495 50.5495 50.5495 -50.5495 -10.9572
-50.5495 50.5495 50.5495 -50.5495 -50.5495
50.5495 -50.5495 -50.5495 -50.5495 -50.5495
-50.5495 50.5495 50.5495 50.5495 50.5495
-50.5495 50.5495 -50.5495 -50.5495 -50.5495
-50.5495 -50.5495 -50.5495 -50.5495 -50.5495
50.5495 -50.5495 50.5495 -50.5495 50.5495
-50.5495 -50.5495 50.5495 50.5495 -50.5495
50.5495 -49.9224 50.5495 -50.1515 -50.5086
-50.5495 -50.5495 -50.5495 -50.5495 50.5495
50.5495 -50.5495 -50.5495 50.5495 -50.5495
50.5495 50.5495 -50.5495 -50.5495 -50.5495
-52.9637 50.5495 -50.5495 -50.5495 50.5495
50.5495 50.5495 -50.5495 -50.5495 50.5495
-50.5495 -52.7976 -52.7115 50.5495 -50.5495
-50.5495 -50.5495 -50.5495 50.5495 50.5495
50.5495 50.5495 -50.5495 50.5495 -50.5495
-50.5495 50.5495 -50.5495 -50.5495 -50.5495
50.5495 50.5495 -50.5495 50.5495 -50.5495
50.5495 -50.5495 50.5495 -50.5495 -50.5495
50.5495 -50.5495 -6.7704 -50.5495 50.5495
50.5495 -50.5495 50.5495 50.5495 50.5495
50.5495 -50.5495 50.5495 50.5495 50.5495
50.5495 -50.5495 -50.5495 -50.5495 -50.5495
50.5495 -50.5495 -50.5495 50.5495 50.5495
-50.5495 50.5495 -50.5495 -50.5495 50.5495
50.5495 -50.5495 50.5495 -50.5495 -50.5495
50.5495 -50.5495 50.5495 50.5495 50.5495
50.5495 50.5495 -50.5495 50.5495 -50.5495
-50.5495 50.5495 -50.5495 50.5495 50.5495
50.5495 -50.5495 -50.5495 50.5495 50.5495
50.5495 -50.5495 50.5495 -50.5495 -50.5495
-50.5495 50.5495 50.5495 -50.5495 -50.5495
50.5495 -50.5495 50.5495 -50.5495 -50.5495
50.5495 -50.5495 -50.5495 -50.5495 50.5495
-50.6719 50.5495 -50.5495 -50.5495 -50.7624
-30.3484 -50.5495 -50.5495 31.3433 -50.5495
50.5495 50.5495 50.5495 -50.5495 -50.5495
50.9496 -50.9455 -50.5495 -50.5495 -51.0070
-50.5495 -50.5495 -50.5495 -50.5495 -50.5514
-50.9228 -50.5961 50.6760 -50.5495 50.6378
50.5495 -50.5495 50.5495 -50.5495 -50.5495
-50.5495 -50.5495 -50.5495 -50.5495 -50.5495
50.5495 50.5495 50.5495 50.5495 -50.5495
50.5495 50.5495 50.5495 -50.5495 50.5495
50.5495 50.5495 50.5495 -50.5495 -50.5495
50.5495 -50.5495 -50.5495 50.5495 -50.5495
-51.8893 -51.8876 51.8676 -51.8697 51.8467
50.6673 -50.7316 -50.8992 51.2138 50.7708
50.5495 50.5495 15.7765 11.8970 50.5495
-50.5495 -50.5495 -50.5495 50.5495 -50.5495
-50.5495 -50.5495 -50.5495 -50.5495 -50.5495
50.5495 -8.2574 2.4745 50.5495 50.5495
50.1603 50.0487 -50.0276 -50.0394 -50.0278
-50.5495 50.5495 -50.5495 -50.5495 -50.5495
-50.5495 -50.5495 -50.5495 -50.5495 -50.5495
-50.5495 50.5495 -50.5495 50.5495 50.5495
50.5495 -50.5495 -50.5495 -50.5495 -50.5495
-50.5495 -50.4651 50.5495 -50.3473 -50.3293
23.2047 -53.5870 50.5495 -56.2688 50.5495
-50.5495 50.5495 -50.5495 -50.5495 50.5495
-50.5495 -50.5495 -50.5495 -50.5495 50.5495
50.5495 -52.4671 -50.5495 50.5495 -50.5495
50.5495 -50.5495 50.5495 50.5495 50.5495
-50.5495 -50.5495 -50.5495 -50.5495 50.5495
-52.0829 52.0714 52.0757 -52.0764 -52.0851
-51.1122 -51.0331 -51.2328 -50.9995 -50.5495
50.5495 50.7897 50.6055 -50.5495 -50.7466
-28.3830 -50.5495 -50.5495 -12.3907 -50.5495
-50.5495 -50.5495 50.5495 50.5495 -50.5495
50.5495 -50.5495 50.5495 50.5495 50.5495
-50.7739 50.7787 50.5495 -50.6297 -50.7468
-50.5495 -50.5495 50.5495 -50.5495 -50.5495
52.0654 52.1130 52.0709 52.0657 52.0963
-50.5495 -50.5495 -50.5495 50.5495 -50.5495
-8.1983 -50.5495 -50.5495 -18.5050 50.5495
-50.5495 50.5495 50.5495 50.5495 50.5495
-50.5495 50.5495 50.5495 -50.5495 50.5495
-50.5495 -50.5495 -50.5495 50.5495 -50.5495
50.5495 -50.5495 -50.5495 50.5495 50.5495
-50.5495 -50.5495 -50.5495 -50.5495 -50.5495
-19.2620 50.5495 -50.5495 19.6786 50.5495
-50.5495 50.5495 -50.5495 50.5495 -50.5495
50.5495 50.5495 50.5495 50.5495 -50.5495
-50.5495 50.5495 -50.5495 50.5495 50.5495
-50.5495 -50.5495 -50.5495 -50.5495 50.5495
-50.5495 -50.5495 50.5495 50.5495 -50.5495
50.5495 -50.5495 -50.5495 -50.5495 -50.5495
50.5495 -50.5495 -50.5495 -50.5495 -50.5495
50.5495 -50.5495 -50.5495 -50.5495 50.5495
50.5495 -50.5495 -50.5495 -50.5495 -50.5495
-50.9010 50.8318 -50.7511 -50.5495 -50.7897
51.1295 -51.2401 -51.1991 -51.1824 51.1085
-50.5495 50.5495 -50.5495 50.5495 -50.5495
-50.5495 -50.5495 50.5495 50.5495 -50.5495
-50.5495 -50.5495 -50.5495 -50.5495 -50.5495
];
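value = nan(1,16); % mean silhouette value for each k (entry 1 stays unused)
for i=2:16 % I want to find the best k for k-means
    cidx = kmeans(locations,i,'distance','cityblock','MaxIter',200000000);
    disp('number1') % progress marker: clustering finished for this k
    value(1,i) = mean(silhouette(locations,cidx,'cityblock'));
    disp('number2') % progress marker: silhouette computed for this k
end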
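A possible alternative to the hand-written loop, as a sketch only: evalclusters can compute the mean silhouette value for a list of candidate k values in one call, and passing kmeans as a function handle keeps the cityblock distance. The small synthetic data set and the 'Replicates' value of 5 are assumptions for illustration, and this does not by itself explain the convergence warning.

```matlab
% Stand-in data (assumption): three well-separated groups in 5 dimensions.
rng(0)
data = [randn(40, 5); randn(40, 5) + 10; randn(40, 5) - 10];

% Clustering function with the signature evalclusters expects:
% (data, k) -> vector of cluster labels.
% 'Replicates' reruns kmeans from several starting points and keeps the best run.
clustFcn = @(X, k) kmeans(X, k, 'Distance', 'cityblock', 'Replicates', 5);

% Mean silhouette value for each candidate k, using the same distance.
ev = evalclusters(data, clustFcn, 'silhouette', ...
                  'KList', 2:6, 'Distance', 'cityblock');

disp(ev.CriterionValues)   % one mean silhouette value per k in KList
disp(ev.OptimalK)          % k with the largest mean silhouette value
```

For the data posted above, data would be replaced by locations and 'KList' by 2:16.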
Image Analyst 2020-12-30
xiaoyu, this is not an Answer to Ame's question. Please start your own, completely separate discussion thread, then edit your comment above to remove the code and data and replace them with a link to the new question. That way we don't keep sending emails to Ame about new activity on this thread.
