sum results depending on array orientation

4 次查看(过去 30 天)
The issue of sum depending to array orientation is reported by @Matt J in this thread
To me the flaw is serious enough that it needs attention so I create a separate thread.
As you see in this demo code the sum of an array depends how the array is oriented, and one of the result is way off when n is large
n = 2^25;
x =single(128);
A = x+zeros(2,n,'single');
B = A.';
sAt = sum(B,1).' % correct sum result
sAt = 2×1
1.0e+09 * 4.2950 4.2950
s = x*n % expected value
s = single 4.2950e+09
sAt == s % fine
ans = 2×1 logical array
1 1
sA = sum(A,2) % big roundoff error
sA = 2×1
1.0e+09 * 2.1475 2.1475
sA == s
ans = 2×1 logical array
0 0
  4 个评论
Matt J
Matt J 2024-1-16
编辑:Matt J 2024-1-16
I think it's also instructive to show how breaking up the sum(A,2) into smaller block sums can mitigate the issue,
n = 2^25;
x =single(128);
A = x+zeros(2,n,'single');
Ar=reshape(A,2,2^10,[]);
sum(A,2) %running sum
ans = 2×1
1.0e+09 * 2.1475 2.1475
sum(sum( Ar , 2) ,3) %sum of partial sums
ans = 2×1
1.0e+09 * 4.2950 4.2950
However,
sum(Ar,[2,3]) %running sum!
ans = 2×1
1.0e+09 * 2.1475 2.1475
Xiaoxing Zhang
Xiaoxing Zhang 2024-6-6
移动:Matt J 2024-6-6
In my case
a=rand(10000,1);
sum(a)==sum(sort(a))
always returns false. I opend a ticket and the replay basically says these are expected round-off errors and one should use
(sum(a)-sum(sort(a)))<(10*eps)
instead.

请先登录,再进行评论。

回答(2 个)

Matt J
Matt J 2024-1-16
编辑:Matt J 2024-1-16
Now the inconsistent results is considered by TMW as a bug
Here is what they said,
After some investigation, I found out that the inconsistency you are seeing is actually a bug inside MATLAB, and I sincerely apologize for any trouble this bug has caused you....
Right now, the issue is that MATLAB loses a lot of precision, and we believe the intended behavior is "mean()" [Ed. sum()] returning the answer much closer to ...the correct result under this case.
Notice that they say that the answer should be much closer to correct, but do not committ that it should be precisely correct, even though it is an integer-valued operation. So, I assume that the intended behavior is only to reduce the chances of precision underflow, but to do so equally along all dimensions.

Hassaan
Hassaan 2024-1-16
编辑:Hassaan 2024-1-16
This issue is related to the limitations of floating-point precision and the way MATLAB handles summation along different dimensions. It is commonly known as the "loss of precision" problem when summing floating-point numbers.
When you sum along different dimensions, especially when dealing with large arrays, the order of summation can affect the result due to the limited precision of floating-point arithmetic. This is not specific to MATLAB but is a general concern in numerical computing.
In the case of your code, the issue arises when summing along the second dimension (sA = sum(A,2)). The large number of elements in the array and the order of summation can lead to significant roundoff errors.
Here's a modified version of your code with additional explanations:
n = 2^25;
x = single(128);
A = x + zeros(2, n, 'single');
B = A.';
sAt = sum(B, 1).'; % Sum along the first dimension
s = x * n;
% Check if the sum along the first dimension is close to the expected value
disp('Check for sum along the first dimension:');
disp(all(abs(sAt - s) < 1e-6)); % You might need to adjust the tolerance
sA = sum(A, 2); % Sum along the second dimension
disp('Check for sum along the second dimension:');
disp(all(abs(sA - s) < 1e-6)); % You might need to adjust the tolerance
Note:
The all(abs(sAt - s) < 1e-6) and all(abs(sA - s) < 1e-6) checks if the computed sum is close enough to the expected value, considering a tolerance (1e-6 in this case). You may need to adjust the tolerance based on your specific requirements.
In summary, this behavior is a consequence of the limited precision of floating-point arithmetic. It's important to be aware of such issues and carefully choose the order of operations when dealing with large arrays and floating-point numbers.
------------------------------------------------------------------------------------------------------------------------------------------------
If you find the solution helpful and it resolves your issue, it would be greatly appreciated if you could accept the answer. Also, leaving an upvote and a comment are also wonderful ways to provide feedback.
Professional Interests
  • Technical Services and Consulting
  • Embedded Systems | Firmware Developement | Simulations
  • Electrical and Electronics Engineering
Feel free to contact me.
  1 个评论
Bruno Luong
Bruno Luong 2024-1-16
编辑:Bruno Luong 2024-1-16
"When you sum along different dimensions, especially when dealing with large arrays, the order of summation can affect the result due to the limited precision of floating-point arithmetic."
I don't see clearly where the order plays here, the order of A along the second dimension is the same as the order of the B along the first dimension. Furthermore all elements of both arrays have identical value of 128.
It seems it makes more sense to me that the associativity (and not cummutativity, order) of the internal sum implementation, possibly linked with multi-threading handling when suming along the first and second dimension is different.
"This is not specific to MATLAB but is a general concern in numerical computing."
If by "This" you mean floating point round-off and lost of commutativity, then yes I agree. But here we are alting about consistentcy of the result with respect to array orientation. Then no this is MATLAB specific.
To me I suspect that a single thread summing can only handle contiguous memory one at the time, and thus cannot handle the case of summing the array A along the second dimenstion with a step of 2*sizeof(single) between two adjadcent elements. The current MATLAB design choice seems to be privilege speed but run into this consistency (admitly the edge case) and precision when handle sum with step > 1. It could be fixed by different sum implementation design, for instant make the thread able to handle different step than 1. An optional flag of sum de select the internal method is to me a best way to remove such flaw, and let user to decide what to do if desired.
disp(all(abs(sAt - s) < 1e-6)); % You might need to adjust the tolerance
In my example all are power of 2, so I don't need compare with tolerance for the case of sum(B,1) and sum(A,2) is way off that you need to use a tolerance larger than 2e9 (!) to make the result match.

请先登录,再进行评论。

类别

Help CenterFile Exchange 中查找有关 Logical 的更多信息

产品


版本

R2023b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by