Optimize clustering with specific number of elements per cluster
3 次查看(过去 30 天)
显示 更早的评论
Okay, so I've been at this problem for a couple weeks now and I seem to have hit a snag.
The idea is this to take a group of 250 x and y coordinates and cluster them into different groups, but each group can only have a maximum of 15 points with a tolerance of +- 2.
I've tried using a few variations of Kmeans, but unfortunately it doesn't quite do the trick since kmeans can't take into account a limit to points in a group.
here are the initial points (axes removes for security of information):
and here is an attempt where I used multiple iterations of kmeans to divide the data:
You can see that it is pretty close, but there are some groups that have too few points.
Attached you'll find 2 files. The Optimization Test file is the one where i generated those figures using multiple iterations of kmeans, and the kmeansModified is a piece of code I wrote (sorry for all of the comments, its a work in progress) which tries to solve the number of elements per group problem, but I don't think I did a good job at it:
Anyone have any ideas on this? I understand that this is a pretty complex problem, and from what I have researched, there isn't a good way of trying to solve this problem, but I'd still like to see what your opinions are!
thanks,
David
1 个评论
Image Analyst
2016-6-7
编辑:Image Analyst
2016-6-7
No files are attached. Can you supply more background (the "use case") on why this is needed. It's an interesting problem and is very tough. I'm no expert in optimization so the only thing I could think of was to iteratively use pdist2() to find the group of 15 points that had the lowest average distance between then and classify them as one group. Then remove those points and repeat with the remaining points. I think it might work good at the start but when you start to get to where there are few remaining points, those points might be very widely separated. So the points in the early clusters would be close together but in later clusters, they'd be far apart.
回答(0 个)
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Cluster Analysis and Anomaly Detection 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!