Deleting X-Y points that are not near other points on a field of data points

I have a set of data of around 900 rows and two columns. Each row holds an X and a Y value specifying a point on the X-Y plane, with both X and Y ranging from 0 to 100. The points are randomly scattered throughout the plane. My problem is that too many X-Y points clutter up the scatter plot. So I want MATLAB to look at each point and ask: is this point within a distance of 10 of another point? If it is, keep it. If it isn't, delete the row containing that X, Y value. A shortened example of my data:
X=[1 2 3 4 20];
Y=[1 3 4 3 59];
Since (20,59) is more than a distance of 10 away from the other points, delete it and return the following:
X2=[1 2 3 4];
Y2=[1 3 4 3];
If anyone knows how I could do this, it would be a great help.

Accepted Answer

per isakson 2012-6-6
See Doug's video "Advanced: making a 2d or 3d histogram to visualize data density", and search the FEX for "hist2".
I failed to find a solution in the FEX. Here is some naive code in which the distance threshold of 10 is hard-coded as the magic number 100 (that is, 10 squared, so no square root is needed).
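X = [1,2,3,4,20];
Y = [1,3,4,3,59];
to_be_removed = false(size(X));
for ii = 1 : length(X)
    % Squared distance from point ii to every point, compared with 10^2 = 100
    is_near = (X-X(ii)).^2 + (Y-Y(ii)).^2 <= 100;
    is_near(ii) = false;    % a point does not count as its own neighbour
    if not( any( is_near ) )
        to_be_removed(ii) = true;
    end
end
X(to_be_removed) = [];
Y(to_be_removed) = [];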
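The same test can also be written without a loop. The following is only a sketch, not part of the original answer: it assumes MATLAB R2016b or later (for implicit expansion), and the names maxDist, D2 and keep are made up for illustration. It builds the full N-by-N matrix of squared distances, which is fine for about 900 points but grows quickly for larger sets.

```matlab
% Sketch: vectorized variant of the brute-force test (assumes R2016b+).
X = [1 2 3 4 20];
Y = [1 3 4 3 59];
maxDist = 10;                        % hypothetical name for the distance threshold

% N-by-N matrix of squared distances between every pair of points
D2 = (X(:) - X(:).').^2 + (Y(:) - Y(:).').^2;
D2(1:numel(X)+1:end) = Inf;          % ignore each point's distance to itself

keep = any(D2 <= maxDist^2, 2).';    % true where some other point is within maxDist
X2 = X(keep);
Y2 = Y(keep);
```

For the sample data this keeps the first four points and drops (20,59).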

More Answers (2)

Geoff 2012-6-6
The naive (brute-force) implementation given by per isakson looks sufficient for this problem. O(N^2) is okay for 900 rows. For larger sets, I'd consider partitioning the points into a quadtree.
However, without making things complicated, I would say that the number of candidates for removal will be small due to your X and Y range. You could easily speed up the naive algorithm by first approximating the local point density in a 21x21 array (cell size of 5, with one extra cell for the ends) and then only doing a search on points that are alone in their cell.
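One way the grid idea might look in code is sketched below. This is an interpretation, not Geoff's own implementation: it assumes the data lie in 0 to 100, and the names cellSize, counts, occupancy and keep are hypothetical. The reasoning is that two points in the same 5-by-5 cell are at most 5*sqrt(2) (about 7.07) apart, which is under 10, so any point that shares its cell is kept without a search. Only points alone in their cell get the brute-force check.

```matlab
% Sketch: grid prefilter followed by brute force on the remaining candidates.
X = [1 2 3 4 20];
Y = [1 3 4 3 59];
maxDist  = 10;
cellSize = 5;                          % cell diagonal is about 7.07 < maxDist

% Cell address of each point (1..21 in each direction for data in 0..100)
cx = floor(X / cellSize) + 1;
cy = floor(Y / cellSize) + 1;
nCells = floor(100 / cellSize) + 1;    % 21
counts = accumarray([cx(:) cy(:)], 1, [nCells nCells]);

% A point that shares its cell already has a neighbour within maxDist
occupancy = counts(sub2ind(size(counts), cx, cy));
keep = occupancy > 1;

% Brute-force check only for points that are alone in their cell
for ii = find(~keep)
    d2 = (X - X(ii)).^2 + (Y - Y(ii)).^2;
    d2(ii) = Inf;                      % skip the point itself
    keep(ii) = any(d2 <= maxDist^2);
end
X2 = X(keep);
Y2 = Y(keep);
```

The shortcut is only valid while the cell diagonal stays below the distance threshold, so cellSize has to shrink if maxDist does.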
  2 Comments
charles atlas 2012-6-7
Sorry, I haven't been able to get into the office and test the code until today.
The real data are latitudes and longitudes, but for simplicity's sake I said it was 0 to 100 on the X and Y axes (which would actually be the longitude and latitude axes, respectively).
The code did what it was supposed to do when I tested it, but it neglected half the values that were jumbled together (that is, at a distance of about 600 yards, aka <= .005 when the differences in lat and long are squared, added, and then square-rooted).
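For reference, here is a sketch of how the distance test described in this comment could be parameterized. It is not a diagnosis of the reported problem. The coordinates are made-up values for illustration, the names lon, lat and maxDeg are hypothetical, and treating degree differences as planar distances is the simplification used in the comment itself. Because the loop compares squared distances, the threshold has to be squared too (0.005^2 rather than 100).

```matlab
% Sketch: the same brute-force test with the threshold as a variable.
lon = [-71.1000 -71.1020 -71.1030 -71.5000];   % hypothetical longitudes
lat = [ 42.3000  42.3010  42.3040  42.9000];   % hypothetical latitudes
maxDeg = 0.005;                                % threshold in degrees, from the comment

keep = false(size(lon));
for ii = 1:numel(lon)
    % Squared planar distance in degrees from point ii to every point
    d2 = (lon - lon(ii)).^2 + (lat - lat(ii)).^2;
    d2(ii) = Inf;                              % skip the point itself
    keep(ii) = any(d2 <= maxDeg^2);            % compare against the squared threshold
end
lon2 = lon(keep);
lat2 = lat(keep);
```

With these sample values the first three points are kept and the isolated fourth point is dropped.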



Image Analyst 2012-6-7
If you displayed it as an image instead of a scatterplot, you wouldn't have that problem. Why not give it a try?
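A minimal sketch of one way to do this is to bin the points into a 2-D histogram and show the counts as an image. This is an assumption about what is meant, not Image Analyst's own code: it relies on histcounts2 (available from R2015b), and the 5-unit bin width and the names edges, centers and counts are chosen only for illustration.

```matlab
% Sketch: show point density as an image instead of a scatter plot.
X = [1 2 3 4 20];
Y = [1 3 4 3 59];
edges   = 0:5:100;                          % hypothetical 5-unit bins
centers = edges(1:end-1) + 2.5;             % bin centres for the axes

% counts(i,j) = number of points in X bin i and Y bin j
counts = histcounts2(X, Y, edges, edges);

% imagesc draws rows along the vertical axis, so transpose to put X horizontal
imagesc(centers, centers, counts.');
axis xy;                                    % origin at the bottom left
colorbar;
xlabel('X');
ylabel('Y');
```

Dense regions then appear as brighter cells, and isolated points as single faint cells, without any points being deleted.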
