Strange output with function sortrows

Hello. I have a 4 column matrix with many (1000s) rows. I want the data sorted by the first column (which basically has multiple identical entries for each unique entry) preserving the rows. So naturally, I use sortrows. The problem is, the output variable has just one row messed up every time the value in the first column changes.
For clarity, this is the .mat file of the output variable. For example, in the second column, rows 74 and 75, after -0.010, 0.010 is shown instead of -0.09. All the other entries are fine. This happens at every unique entry of the first column except the first one. Can anyone explain this to me or suggest a possible solution?
EDIT - .mat file attached now. It has the parent cell array A and the resulting matrix after using sortrows on the matrix contained in the first cell of A.
  1 Comment
Stephen23 2017-3-9
@Digvijay Rawat: please upload files here and do not put links to third-party websites. To upload click the paperclip button above the textbox.


Answers (1)

dpb 2017-3-9
Edited: dpb 2017-3-10
No .mat file attached, try again.
I'd venture that what's going on is floating point roundoff, so that all those "identical" values aren't actually precisely identical and the ordering is what it is because
x-2*eps < x-eps < x < x+eps < x+2*eps < ... < x+N*eps
where eps is the rounding difference for the values x in the column. Try subtracting the base value of the first column (the value you're sorting on) from the entries around row 74, where the anomaly occurs, and report the result. I suspect you'll then understand the reason. Perhaps you need to round, or use a tolerance when determining the unique values in the first column, classify by those, and then sort by the subsequent columns to get the ordering you expect. Or perhaps this really is the expected ordering, if the precise values are needed! :)
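To illustrate the suggestion above, here is a minimal sketch on made-up data (not the poster's file). It assumes two rows carry a first-column value that is larger than 0.02 by 7.9e-10, and it prints the offset from the smallest first-column value so the hidden difference becomes visible.

```matlab
% Synthetic data (an assumption for illustration, not the poster's file):
% column 1 is nominally 0.02 everywhere, but two rows carry a 7.9e-10 offset.
M = [0.02+7.9e-10  -0.0109
     0.02           0.0110
     0.02+7.9e-10  -0.0108
     0.02          -0.0110];
s = sortrows(M);          % sorts on column 1 first, column 2 only breaks ties
d = s(:,1) - s(1,1);      % offset of each row from the smallest column-1 value
fprintf('%12.3e  %8.4f\n', [d s(:,2)].');
```

The rows whose offset is exactly zero come first and are ordered by column 2 among themselves; the rows with the small positive offset follow, which is why a positive column-2 value can sit between two negative ones.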
ADDENDUM
Using your data file
>> fprintf(['%15.12f' repmat('%8.4f',1,3) '\n'],sorted(74:80,:).')
0.020000000000 -0.0110 0.0000 0.0000
0.020000000000 0.0110 0.0000 0.0000
0.020000000790 -0.0109 0.0000 -0.0539
0.020000000790 -0.0108 0.0000 -0.0929
0.020000000790 -0.0106 0.0000 -0.1183
0.020000000790 -0.0105 0.0000 -0.1321
0.020000000790 -0.0104 0.0000 -0.1357
>>
makes it easier to see the "why". As the other checks in the Comment show, there are two distinct values that are nominally 0.02 in the dataset, and the data are sorted on those values correctly.
CONCLUSION
To fix the problem,
>> length(unique(A1(:,1))) % review...how many in first column unique?
ans =
31
>> A1(:,1)=round(A1(:,1),3); % clean up the first column to 3 decimal places
>> length(unique(A1(:,1))) % and after that, only half as many
ans =
16
>> s=sortrows(A1); % now sort and see what "bad" range looks like
>> s(74:80,:)
ans =
0.0200 -0.0110 0 0
0.0200 -0.0109 0 -0.0539
0.0200 -0.0108 0 -0.0929
0.0200 -0.0106 0 -0.1183
0.0200 -0.0105 0 -0.1321
0.0200 -0.0104 0 -0.1357
0.0200 -0.0102 0 -0.1310
>>
Voila! What you were expecting to see at first...
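If rounding to a fixed number of decimals doesn't suit the data, the tolerance idea mentioned earlier can be sketched with uniquetol. This is only a sketch under assumptions: it needs a release that has uniquetol (R2015a or later), the matrix below is a synthetic stand-in for A1, and the absolute tolerance of 1e-6 is a guess that would have to be chosen for the real data.

```matlab
% Hypothetical stand-in for A1 (synthetic values, not the poster's file)
A1 = [0.02+7.9e-10  -0.0109  0  -0.0539
      0.02           0.0110  0   0
      0.04          -0.0110  0   0
      0.02          -0.0110  0   0];
tol = 1e-6;                                   % assumed absolute tolerance
% 'DataScale',1 makes tol absolute instead of relative to max(abs(A))
[c, ~, ic] = uniquetol(A1(:,1), tol, 'DataScale', 1);
A1(:,1) = c(ic);                              % replace each value by its group representative
s = sortrows(A1);                             % ties in column 1 now fall through to column 2
```

After the replacement, every nominal 0.02 entry holds the same value, so sortrows orders those rows by the second column as originally expected.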
  2 Comments
Digvijay Rawat 2017-3-9
Hey, file attached now. eps should not be an issue here since the values that are being mixed up are of different sign altogether.
dpb 2017-3-9
Edited: dpb 2017-3-9
The order for those is controlled by the order for the first column, though..
>> load problemo.mat
>> A1=A{1};
>> u=unique(A1(:,1))
u =
0
0.0200
0.0200
0.0400
0.0400
0.0600
...
0.2400
0.2400
0.2600
0.2600
0.2800
0.2800
0.3000
0.3000
>> min(diff(u))
ans =
7.9000e-10
>>
Note that each of the nominal values appears twice. The minimum difference between neighbors is shown above; the rest are probably about the same.
Alternatively,
>> sorted(74:80,:) % your "problem" area...
ans =
0.0200 -0.0110 0 0
0.0200 0.0110 0 0
0.0200 -0.0109 0 -0.0539
0.0200 -0.0108 0 -0.0929
0.0200 -0.0106 0 -0.1183
0.0200 -0.0105 0 -0.1321
0.0200 -0.0104 0 -0.1357
>> sorted(75,1)<sorted(76,1) % what's the relationship between these 2?
ans =
1
>> diff(sorted(74:80,1))>0
ans =
0
1
0
0
0
0
>>
What the above means is that the 2nd value is the same as the 1st, both of those are less than (albeit only slightly) the 3rd, and the rest are identical to the 3rd over this subset. The same effect will occur at every one of the matched pairs returned by unique above. Within each exactly-equal subgroup the rows are ordered by column 2, but whether column 2 comes out ordered across the whole nominal group depends solely on the luck of the draw, namely which rows happen to carry the slightly smaller first-column value; for this set that didn't work out.
BUT the row that appears out of order by its column 2 value is in the correct order based on the higher-priority sort on column 1. As noted in the Answer, if you want this to go away, you'll have to fix up the first column so that it no longer contains the small discrepancies that determine the natural ordering.
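If the exact first-column values need to be kept, one possible variation (not something used in this thread, just a sketch) is to sort on a rounded copy and apply the resulting row permutation to the untouched data. The matrix below is a synthetic stand-in for A1, and rounding to 3 decimals is the same assumption as in the Answer.

```matlab
% Hypothetical stand-in for A1 (synthetic values, not the poster's file)
A1 = [0.02+7.9e-10  -0.0109  0  -0.0539
      0.02           0.0110  0   0
      0.02          -0.0110  0   0];
key = [round(A1(:,1),3) A1(:,2:end)];   % rounded copy, used only to decide the order
[~, idx] = sortrows(key);               % second output is the row permutation
s = A1(idx,:);                          % original, unrounded rows in the new order
```

Here the rows come out ordered by column 2 within the nominal 0.02 group, while s(:,1) still holds the original unrounded values.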
