How to remove rows in a nested for loop?

I have a matrix Detected_center and a matrix Original_center with in each row a x and y coordinate. I want to compare each time one row of the Detected_center matrix with all the rows in the Original_center matrix and calculate the distance. If the row of the Detected_center is paired with a row of the Original_center as the minimal distance is within a threshold, I want that this row is removed from the Original_center matrix for the sake of computational time.
Can any help me out how to do this in a save way? The beneath way is not save as the nested for loop should change in length.
Threshold = 0.2
for i = 1:length(Detected_center)
for j = 1:length(Original_center)
Distance(j,1) = ...
end
[Min_Distance, position] = min(Distance);
if Min_Distance < Threshold
TP = TP + 1;
Original_center(position,:) = []; %Remove row from matrix, as it is coupled
else
FP = FP + 1;
end
end
Blue center is from the matrix Detected_center, and the red centers are the centers of the matrix Original_center. The distances are calculated from the blue center to all red centers with pythagoras. If the distance between the red center which is the closest to the blue center falls within the threshold, it should be removed from the matrix Original_center.

8 个评论

@S. - pleaser clarify what you mean by the nested for loop should change in length. How will this matter if a row is removed from Original_center?
S.
S. 2022-1-14
编辑:S. 2022-1-14
Well at the start the ''for j = 1:length(Original_center)" is for example j=1:5, with length(Original_center) is 5.
However, if a row is removed from Original_center, this length(Original_center) should be changed to save computational time. Do you understand it?
Physically removing the record from the array during the loop will make for the runtime to go up dramatically if there are many at all of these -- that will cause a reallocation of the full array and involves a copy.
Just mark the rows for deletion and wait until the end to remove them all in one swell foop...which can be done w/o the explicit loop via pdist2 using the optional input named parameter, 'Smallest',N and the optional second return variable which is the index of the computed distances which can be used with the subsequent test of whether are/aren't within the threshold limit.
While the option to control a numeric threshold isn't implemented in pdist2 directly, I would presume you would be able to know how many might possibly be so for your input data to set the number to return. OTOH, you could just return the whole array and do the test afterwards; it probably isn't any slower.
Unfortunately, I can't follow your solution. What do you mean with: ''Just mark the rows for deletion and wait until the end to remove them all in one swell foop'' and ''I would presume you would be able to know how many might possibly be so for your input data to set the number to return''. Moreover, could you provide some code, so I can follow it better.
If the distance between the red center which is the closest to the blue center falls within the threshold, it should be removed from the matrix Original_center.
If so, the order in which you iterate over the Detected_center points will affect the result. Suppose your data is,
Threshold=5;
Detected_center=[4 0;
8 0];
Original_center=[0 0;
6 0;
12 0];
No matter which Detected_center point you start the iterations with, the first point to be eliminated is [6,0]. However, if [4,0] is the last point to be processed then [0,0] will be the final point eliminated and the final value of Original_center will be [12 0], whereas if [8,0] is the last to be processed, then the final eliminated point will be [12 0] and the final value for Original_center will be [0,0].
Did you intend for the process to be sensitive to order in this way? What is the correct outcome in this example?
S.
S. 2022-1-18
编辑:S. 2022-1-18
What you are explaining is completely correct.
For the example above, the matrix of total distances is:
Distance_tot =
[ 4 8;
2 2;
8 4]
Then it depends on the order of the centers in the matrix Detected_center, what outcome you will get.
I didn't take this situation into account, however if you check the coordinates of the centers of Detected_center and Original_center, the chance that a situation like you outline happens, is very very small (I think). This is because, the coordinates in Detected_center are given in four decimals precisely. And therefore, I didn't intend the process to be sensitive to order in this way.
An example of how the matrices Detected_center and Original_center look like:
Detected_center =
[ 223.2095 162.9642;
224.1991 278.4803;
245.2765 209.0459;
254.9793 434.9617]
Original_center =
[ 214.0000 269.0000;
226.0000 153.0000;
256.0000 27.0000;
249.0000 433.0000;
81.0000 328.0000]
This is because, the coordinates in Detected_center are given in four decimals precisely.
Not sure why that matters. My example data had infinite precision (they were integers). But if you think the issue can be ignored, see my answer below.
Well, it decreases the change that you get the same distances, if you have integers it's more likely you will get the exact same distances.

请先登录,再进行评论。

 采纳的回答

It should start with the column (and therefore the Detected_center point) with the overall shortest distance to a Original_center point and then move on to the next Detected_center
This new processing rule can be implemented as follows,
Distances=pdist2(Original_center,Detected_center);
[M,N]=size(Distances);
discard=false(M,1);
while any(isfinite(Distances),'all')
[d,Rows]=min(Distances,[],1);
[dmin,col]=min(d);
if dmin<Threshold
row=Rows(col);
discard(row)=1;
Distances(row,:)=inf;
end
Distances(:,col)=inf;
end
Original_center(discard,:)=[];
TP=nnz(discard);
FP=M-TP;

4 个评论

S.
S. 2022-1-25
编辑:S. 2022-1-25
Thanks! This is exactly what I needed. I think now it's not really sensitive anymore (can't think of any other cases). The only downside is that the speed has drastically decreased.
The speed has to decrease because it's amore complicated algorithm than what you first specified.
A small question about this. How would you write this is pseudocode to make it easy to understand for other readers? Do you have any ideas about that? I have some idea in mind aswell.

请先登录,再进行评论。

更多回答(2 个)

discard=ismembertol(Original_center,Detected_center,Threshold,'ByRows',1,'DataScale',1);
Original_center(discard,:)=[];

7 个评论

Neat idea if OP can recast the distance into a difference in the coordinates...
Yes, this would be the L-inf distance, not the L2 difference.
S.
S. 2022-1-17
编辑:S. 2022-1-17
What do you exacly mean with OP? And indeed where is Min_Distance taken into acount for comparison with the threshold? Original_center and Detected_center are by the way x by 2 matrices, with x a number, which doesn't need to be the same for Original_center and Detected_center. I have added an image for more clarification.
I see that L-Inf distance is defined as: max(abs(x-y)) and the L2 distance is the Euclidean distance. I indeed need the Euclidean distance.
What do you exacly mean with OP?
The OP is you. Notice the upper right corner of your posts.
And indeed where is Min_Distance taken into acount for comparison with the threshold?
Inside ismembertol().
Detected_center are by the way x by 2 matrices, with x a number
That was clear, but what is the typical size of x?
I indeed need the Euclidean distance.
If you need the Euclidean distance, ismembertol won't work, but you should consider carefully whether the Euclidean distance is essential, since that requires slower alternatives.
S.
S. 2022-1-18
编辑:S. 2022-1-18
That was clear, but what is the typical size of x?
x is typically between 4 and 12.
If you need the Euclidean distance, ismembertol won't work, but you should consider carefully whether the Euclidean distance is essential, since that requires slower alternatives.
Well, I need to select the point of the matrix Original_center with the shortest distance to the point of the matrix Detected_center. See image in my original question to clarify this. So it's quite essential.
Well, I need to select the point of the matrix Original_center with the shortest distance to the point of the matrix Detected_center. See image in my original question to clarify this. So it's quite essential.
That doesn't explain why it needs to be the shortest Euclidean distance and not the shortest L-inf distance.
Well, the centers in Original_center are the actual coordinates of oranges in a picture and the Detected_center are the predicted coordinates of the oranges in the picture of the oranges. So for a robot arm to grab the orange, I think the Euclidean distance is necessary and most suitable. If you think something else would be more appropriate, feel free to mention.

请先登录,再进行评论。

For Euclidean distance,
discard=any( pdist2(Original_center,Detected_center)<Threshold ,2);
Original_center(discard,:)=[];

8 个评论

S.
S. 2022-1-18
编辑:S. 2022-1-18
I am thinking about how to implement this.
For pdist2(Original_center,Detected_center)<Threshold, it will give a true value for all distances smaller than the Threshold, however, I want only that the smallest distance that meets the threshold is true.
I think I have fixed it (I think its the most computational efficient way, let me know if you think it's not and what should be different then):
for i = 1:length(Detected_center)
Distance = pdist2(Original_center,Detected_center(i,:))
[Min_Distance, position] = min(Distance);
if Min_Distance < Threshold
TP = TP + 1;
Original_center(position,:)=[] %Remove paired pair
else
FP = FP + 1;
end
end
however, I want only that the smallest distance that meets the threshold is true.
Does it matter if there are ties? If not, this is a more vectorized alternative, but note my comment above.
%Example data
Threshold=5;
Detected_center=[4 0;
8 0];
Original_center=[0 0;
6 0;
12 0];
D = pdist2(Original_center,Detected_center);
[minval,loc]=min(D,[],1);
discard=loc(minval<Threshold);
Original_center(discard,:)=[];
Original_center
Original_center = 2×2
0 0 12 0
This is similair as how I did, right? The If-statement is necessary, as I need to count the false positives (FP) and true positives (TP).
I simplified the smallest distance by this line:
[Min_Distance, position] = pdist2(Original_center,Detected_center(i,:),'euclidean','smallest',1)
No, you don't need loops or if statements:
[Min_Distance, loc] = pdist2(Original_center,Detected_center,'euclidean','smallest',1)
pos=Min_Distance<Threshold;
TP=sum(pos);
FP=numel(pos)-TP;
Original_center(loc(pos),:)=[];
S.
S. 2022-1-18
编辑:S. 2022-1-18
Well, in this case you calculate all distances, which is not necessary and therefore computationally inefficient. Moreover Original_center(loc(pos),:)=[]; is then not necessary. In the for loop I use, the one centerpoint that is coupled is removed from the Original_center matrix, so that the distance doesn't need to be calculated again.
Well, in this case you calculate all distances, which is not necessary and therefore computationally inefficient.
I don't know which version of my solution you're looking at. I return the shortest distance, just as you did. Regardless, I don't think using the 'smallest' flag makes pdist2 compute fewer distances. Output memory allocation is certainly more efficient when the flag is used, but pdist2 still needs to compute all the distances in order to know which one is the minimum.
tic
X=rand(1e6,2); Y=rand(1,2);
toc
Elapsed time is 0.026885 seconds.
tic;
D=pdist2(X,Y);
toc
Elapsed time is 0.468526 seconds.
tic
[D,I]=pdist2(X,Y,'euc','smallest',1);
toc
Elapsed time is 0.010583 seconds.
You can see that the difference in compute time above is basically the amount of time required to allocate memory for the result.
Moreover Original_center(loc(pos),:)=[]; is then not necessary. In the for loop I use, the one centerpoint that is coupled is removed from the Original_center matrix, so that the distance doesn't need to be calculated again.
It is much more efficient to remove the undesired points in one step, as I did, than to do it one at a time in a loop.
S.
S. 2022-1-20
编辑:S. 2022-1-20
The problem is, however, you calculate all the distances, while I calculate less distances, as if a pair is coupled the row is deleted. This is actually not the main issue, because there is now the following issue with the code. If you for example have the following distances with a threshold of 15:
If you use this code (my comment), the first Detected_center point will be coupled to the last row (first column), as that is the point with the minimal distance (8 pixels), while the last Detected_center point will NOT be coupled to any point as the first Detected_center point is already coupled to the last Detected_center point: 1 TP and 1 FP. --> So coupling seems not to be smart.
If you use this code (your comment), the first and last Detected_center point will be coupled to the last row (last Original center point), as it is for both the minimal distance: 2 TP
However, the correct way, is that the first Detected_center point is coupled to the second Original_center point, and the last Detected_center point will be coupled to the last Original_center point: 2 TP. How can I deal with/incorporate this? This is the only thing the software is sensitive to (I think). --> It should start with the column (and therefore the Detected_center point) with the overall shortest distance to a Original_center point and then move on to the next Detected_center point with the second overall shortest distance to a Original_center point etc.

请先登录,再进行评论。

类别

帮助中心File Exchange 中查找有关 Creating and Concatenating Matrices 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by