Best set of elements to find closest mean.

Hello,
The title might be unclear, but I didn't find a better one.
Let's have an array A with two columns of coordinate values, Ax and Ay. Those coordinates define points in a scatterplot. I run kmeans with two clusters, which gives me the coordinates of the two centroids, Acx and Acy. Now let's have an array B, similarly with two columns of values, Bx and By.
I want to split the points in B into two sets so that the mean of each set is as close as possible to the corresponding centroid (Acx, Acy). I started with an approach based on minimum distance, but it seems the result can be wrong... The result may even be a single set if the points are all too far from the other centroid. Since I need to repeat the process a huge number of times, I'd like it to be as fast as possible.
Any help or clue is welcome. KeFop
[EDIT] Here to explain my point:
close all
clear
%points with obvious two clusters
Apts = rand(10,2);
Bpts = rand(10,2);
Bpts(:,1) = Bpts(:,1)+2;
Total = cat(1,Apts, Bpts);
%adding one point at the edge of the two clusters
Sp = mean(Total)*1.2;
Total = cat(1, Total, Sp);
%let's add the centroids, imported from a previous classification, near the means
%of the clusters
[~, ClustMean] = kmeans([Total(:,1), Total(:,2)], 2, 'MaxIter', 100, 'Display', 'off','Replicates', 3);
centroids = ClustMean;
centroids(:,1) = centroids(:,1)+0.2;
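%Assign each point to the nearest imported centroid (first attempt at finding
%the sets whose means are as close as possible to the imported centroids)
nPts = size(Total,1);          %number of points (rows), safer than length()
EuclDist = zeros(nPts,2);      %preallocate distances to the two centroids
ClustSelect = zeros(1,nPts);   %preallocate cluster index of each point
for t = 1:nPts
for c = 1:2
EuclDist(t,c) = sqrt((Total(t,1)- centroids(c,1))^2 + (Total(t,2)- centroids(c,2))^2);
end
[~, ClustSelect(t)] = min(EuclDist(t,:));
end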
figure
hold all
%points with colors depending on cluster
scatter(Total(ClustSelect==1,1), Total(ClustSelect==1,2), 'r');
scatter(Total(ClustSelect==2,1), Total(ClustSelect==2,2), 'b');
%mean of the two clusters
scatter(ClustMean(1,1), ClustMean(1,2), 60, 'r', 'filled');
scatter(ClustMean(2,1), ClustMean(2,2), 60, 'b', 'filled');
%centroids
scatter(centroids(1,1), centroids(1,2), 'g', 'd', 'LineWidth', 10);
scatter(centroids(2,1), centroids(2,2), 'g', 'd', 'LineWidth', 10);
legend('Points of cluster A', 'Points of cluster B','Mean of cluster A','Mean of cluster B', 'Imported centroid', 'Imported centroid', 'Location', 'northeastoutside');
%Now if manually switching the point in the middle from one cluster to the
%other
ClustSelect2 = ClustSelect;
if ClustSelect2(end) == 1
ClustSelect2(end) = 2;
elseif ClustSelect2(end) == 2
ClustSelect2(end) = 1;
end
%recalculate the new means
MeanClustA = mean(Total(ClustSelect2==1,:));
MeanClustB = mean(Total(ClustSelect2==2,:));
figure
hold all
%points with colors depending on cluster
scatter(Total(ClustSelect2==1,1), Total(ClustSelect2==1,2), 'r');
scatter(Total(ClustSelect2==2,1), Total(ClustSelect2==2,2), 'b');
%mean of the two modified clusters
scatter(MeanClustA(1,1), MeanClustA(1,2), 60, 'r', 'filled');
scatter(MeanClustB(1,1), MeanClustB(1,2), 60, 'b', 'filled');
%centroids
scatter(centroids(1,1), centroids(1,2), 'g', 'd', 'LineWidth', 10);
scatter(centroids(2,1), centroids(2,2), 'g', 'd', 'LineWidth', 10);
legend('Points of cluster A', 'Points of cluster B','Mean of cluster A','Mean of cluster B', 'Imported centroid', 'Imported centroid', 'Location', 'northeastoutside');
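The two figures compare the assignments visually. One possible way to put a number on the comparison is to sum, over both sets, the distance between the set's mean and its imported centroid. The following is only a sketch under assumptions: the points and centroids are made up for illustration (they are not the random data from the script above), and `meanGap` is a hypothetical helper name. In this made-up layout, the in-between point is nearest to centroid 1, yet moving it to set 2 gives the smaller score.

```matlab
% Illustrative data only (not from the script above): two groups of four
% points plus one point lying between them.
pts = [0 0; 0 1; 1 0; 1 1; 3 0; 3 1; 4 0; 4 1; 1.8 0.5];
centroids = [0.5 0.5; 3.2 0.5];   % assumed "imported" centroids

% Nearest-centroid labels, as in the script above.
d = [hypot(pts(:,1)-centroids(1,1), pts(:,2)-centroids(1,2)), ...
     hypot(pts(:,1)-centroids(2,1), pts(:,2)-centroids(2,2))];
[~, labels] = min(d, [], 2);

% Same labels with the last (in-between) point moved to the other set.
labelsSwitched = labels;
labelsSwitched(end) = 3 - labelsSwitched(end);

% meanGap (hypothetical name): summed distance between each set's mean and
% the matching centroid. Smaller means the set means sit closer.
meanGap = @(lab) norm(mean(pts(lab==1,:),1) - centroids(1,:)) + ...
                 norm(mean(pts(lab==2,:),1) - centroids(2,:));

fprintf('nearest-centroid assignment: %.4f\n', meanGap(labels));
fprintf('after switching the point:   %.4f\n', meanGap(labelsSwitched));
```

A score of this kind makes it possible to compare candidate assignments automatically instead of by eye.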

Accepted Answer

Image Analyst 2017-3-31
Simply use sqrt(). So you have two clusters: one has its centroid at (acx1, acy1), and the other is centered at (acx2, acy2). To find the two sets of B that will have the closest means, simply assign each point in B to whichever centroid of A it is closest to. Let's say you have two arrays, bx and by, which hold the x and y coordinates of the points in set B. Try this untested code:
distancesToACluster1 = sqrt((bx-acx1).^2 + (by-acy1).^2);
distancesToACluster2 = sqrt((bx-acx2).^2 + (by-acy2).^2);
% Find out which elements are closest to A centroid #1:
closestTo1 = distancesToACluster1 < distancesToACluster2;
% Find out which elements are closest to A centroid #2:
closestTo2 = ~closestTo1;
% Extract points from b into set 1
bx1 = bx(closestTo1);
by1 = by(closestTo1);
% Extract points from b into set 2
bx2 = bx(closestTo2);
by2 = by(closestTo2);
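If the nearest-centroid split does not bring the set means close enough, one option is to refine it with a simple greedy search: flip one point at a time and keep the flip only when the set means move closer to the centroids. This is a heuristic sketch, not part of the code above, and it is not guaranteed to find the best possible split. The input values are made up for illustration, and `cost`, `inSet1`, `startCost` and `best` are hypothetical names.

```matlab
% Illustrative inputs (assumed, not from the thread).
bx = [0 0 1 1 3 3 4 4 1.8]';
by = [0 1 0 1 0 1 0 1 0.5]';
acx1 = 0.5; acy1 = 0.5;   % centroid 1 of A
acx2 = 3.2; acy2 = 0.5;   % centroid 2 of A

% Start from the nearest-centroid split described above.
inSet1 = hypot(bx-acx1, by-acy1) < hypot(bx-acx2, by-acy2);

% Hypothetical cost: summed distance from each set's mean to its centroid.
% If a set is empty its mean is NaN, so the cost is NaN as well.
cost = @(s) hypot(mean(bx(s))  - acx1, mean(by(s))  - acy1) + ...
            hypot(mean(bx(~s)) - acx2, mean(by(~s)) - acy2);

startCost = cost(inSet1);
best = startCost;
improved = true;
while improved
    improved = false;
    for k = 1:numel(bx)
        trial = inSet1;
        trial(k) = ~trial(k);      % move point k to the other set
        c = cost(trial);
        if c < best                % false when c is NaN (a set became empty)
            inSet1 = trial;
            best = c;
            improved = true;
        end
    end
end

fprintf('cost of nearest-centroid split: %.4f\n', startCost);
fprintf('cost after greedy refinement:   %.4f\n', best);
```

Each pass costs one mean evaluation per point, so for large B or many repetitions it may be worth updating the set means incrementally rather than recomputing them for every trial. Note that if the starting split already leaves one set empty, the starting cost is NaN and no flip is accepted, so that case needs separate handling.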
4 Comments
KeFop 2017-4-1
I edited my message to add some figures.
Image Analyst 2017-4-2
I don't have any data files to work with so I can't do anything. I'm not going to hand type in all those values from your figure.
