Possible bug? Or how to check if overflow happened?
I have a matrix of size 6472908 x 67, all single values. Different columns have different max/min values (they are different variables).
So I calculate the mean of each column using
avgData=mean(Data);
I am expecting the first value in avgData to be the mean of the first column. However, when I issue
avgData(1) - mean(Data(:,1))
ans =
-100.9785
As you can see, the output is not zero. So what has changed? The same thing happens if I convert everything to double.
If I do this with sum() the difference is even larger. So I was wondering whether overflow is happening, and how I should check whether overflow has occurred. lastwarn() returns nothing.
I am afraid the data is about 800 MB and I can't upload it here.
0 Comments
Accepted Answer
John D'Errico
2015-3-31
No. It is not a bug, but an artifact of operations that may be done in a different order due to the BLAS, or whatever scheme is used internally. NEVER assume that two distinct operations will do a given computation in the same order. It might conceivably reflect an issue of whether a double precision accumulator might be employed for vector input to sum (again, a choice probably made in the BLAS), but not for array input.
If the min and max values vary by such a large amount, that difference is trivial, essentially down in the least significant bits of the result, especially when you are summing millions of such elements.
You have not yet said what the total mean was either, so we cannot know how significant the difference is.
As for this being an overflow, that is not at all reasonable to assume. The numbers you have described are simply not large enough to cause overflow, AND if they did overflow, overflows in floating point result in inf, NOT a loss of precision.
realmax('single')
ans =
3.4028e+38
realmax('single')*2
ans =
Inf
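Given that, a direct check for overflow (a minimal sketch, assuming your matrix is still named Data as in the question) is simply to look for Inf in the column sums:
% Overflow would surface as Inf, so isinf finds it directly.
any(isinf(sum(Data)))
If that returns false, nothing overflowed.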
The problem here is clearly an issue of bits lost at the low end, due to variation in the sequence of adds in these numbers.
You can test that claim by computing the mean in different sequences of your vector. For example, try this test several times:
mean(Data(randperm(6472908),1))
then look at the differences.
As well, compare those differences to the size of the actual mean. How does that difference compare to eps for that same number?
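A sketch of that test, assuming the matrix is named Data as in the question:
% Compare means over a few random orderings against the original ordering,
% and measure the spread against eps at the magnitude of the mean.
m0 = mean(Data(:,1));
for k = 1:5
    m = mean(Data(randperm(size(Data,1)),1));
    fprintf('diff = %g,  eps at the mean = %g\n', m - m0, eps(m0))
end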
2 Comments
per isakson
2015-3-31
Edited: per isakson
2015-3-31
R2013b, 64-bit, Win7
With 'single' I get identical mean values, i.e. the order doesn't affect the value:
>> R = randn([6472908,67],'single') + 12054;
>> format hex
>> mean(R(randperm(6472908),1))
ans = 463f2105
>> mean(R(randperm(6472908),1))
ans = 463f2105
>> mean(R(randperm(6472908),1))
ans = 463f2105
>> mean(R(randperm(6472908),1))
ans = 463f2105
and with 'double' there is a "rounding error" in the last two hex positions, i.e. the order does affect the value:
>> R = randn([6472908,67],'double') + 12054;
>> mean(R(randperm(6472908),1))
ans = 40c78afff205ede5
>> mean(R(randperm(6472908),1))
ans = 40c78afff205edd3
>> mean(R(randperm(6472908),1))
ans = 40c78afff205eda7
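To gauge that spread, the hex strings can be converted back to doubles with hex2num (a sketch; the values are taken from the runs above):
>> a = hex2num('40c78afff205ede5');
>> b = hex2num('40c78afff205eda7');
>> a - b      % on the order of 1e-10, for a mean near 12054
>> eps(a)     % spacing of doubles at that magnitude, about 1.8e-12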
More Answers (2)
James Tursa
2015-3-31
Edited: James Tursa
2015-3-31
How large is the average value? Is 100 close to eps of this average value? E.g., is the average value near 1e17? If so, this could just be rounding differences in the methods used. I.e., maybe the problem is split up differently for multi-threading if there are several columns of a matrix involved vs just one column.
"...The same thing happened if I convert everything to double..."
Do you get the exact same result? Or a similar result (i.e., something not "close" to zero)? It would not surprise me if, in the background, MATLAB uses double accumulators even when the input is single, which might explain getting the same result in single vs double.
EDIT:
Probably the better comparison would be eps of the sum, not eps of the average.
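A sketch of that comparison, again assuming Data from the question:
s = sum(double(Data(:,1)));  % the column sum, accumulated in double
eps(s)                       % rounding granularity at the size of the sum
eps(single(s))               % much coarser if the accumulation is done in single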
EDIT:
You might also consider using this FEX contribution from Jan Simon:
Image Analyst
2015-3-31
Yes, this is a known issue. We've converted arrays from double to single to save on memory, yet when calling mean(), the means may not be correct. Essentially the later elements get ignored as you add them: once the running sum is huge, a tiny value added onto a gigantic value does not actually get added because it is so small by comparison (loosely speaking, underflow). We contacted The MathWorks to ask them about it. The MathWorks knows about this and does not consider it a bug, just a normal precision issue comparable to this issue.
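A minimal illustration of that swamping effect in single precision:
% single has a 24-bit significand, so at 2^24 the spacing between adjacent
% representable values is already 2, and adding 1 changes nothing.
s = single(2^24);      % 16777216
s + single(1) == s     % true: the 1 is absorbed into the big sum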
3 Comments
Roger Stafford
2015-4-1
"I expect the same results" <-- This is an assumption you should not make. Strictly speaking, even the associative and distributive laws of arithmetic are violated when computation is subject to round-off errors. When a series of numbers is added, the results can depend on how they are grouped together in the addition process. I like to give the following as an example which will usually give differing results even on a decimal calculator:
3/14+(3/14+15/14)
(3/14+3/14)+15/14
In computing mean(Data(:,1)) one cannot assume that the algorithm used is identical to that used for the first column of mean(Data). It all depends on just how the programmers who did the coding decided to handle the two different situations. Perhaps it depends on what they ate for breakfast? The basic assumption is that if different results agree within round-off error with perfect results, then either is acceptable.
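A two-line MATLAB version of the same point:
a = 0.1; b = 0.2; c = 0.3;
(a + b) + c == a + (b + c)    % false: grouping changes the rounded result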