huge differences in single vs double precision math

Question

Jonathan 2014-8-7

https://ww2.mathworks.cn/matlabcentral/answers/146603-huge-differences-in-single-vs-double-precision-math

I am calculating a sum of squares in 32-bit FP precision (for comparison with a GPU algorithm, which isn't relevant here).

Here is the code:

Y=single((0:499).^2);
sum(Y)
ans =
   41541684
sum(double(Y))
ans = 
   41541750

The single-precision result differs from the (correct) double answer by 66! The largest value, 499^2 = 249001, is nowhere near any FP limits.

This is R2013a on OS X 10.9.

Answer 1

John D'Errico 2014-8-7

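What you don't understand is that single precision stores only a 23-bit mantissa, which together with the implicit leading 1 gives 24 bits of precision. While there are 32 total bits stored in a single, don't forget that one of those bits is a sign bit and 8 bits store the exponent in a biased form, which leaves 23 for the mantissa. So a single can represent every INTEGER exactly only up to 2^24 = 16777216. Above that, the gap between adjacent singles grows to 2, then 4, and so on, so most larger integers cannot be stored without error.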
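A quick way to see the limit for yourself is sketched below. It only uses the built-in functions flintmax and eps; the variable names are just for illustration.

```matlab
% Largest value up to which every integer is exactly representable in single
limit = flintmax('single');          % 2^24 = 16777216, returned as a single

% Adding 1 at the limit is lost to rounding, so the result equals the limit
lostOne = isequal(limit + 1, limit);

% Spacing between adjacent singles at the magnitude of the sum in the question
gap = eps(single(41541750));

fprintf('limit = %d, lostOne = %d, gap = %g\n', limit, lostOne, gap);
```

At the magnitude of your sum the spacing comes out as 4, so single values there can only be multiples of 4.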

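The sum you formed is larger than that limit, so once the running total passes 2^24 each further addition gets rounded, and those rounding errors accumulate. In fact the correct total, 41541750, is not a multiple of 4, which is the spacing between singles at that magnitude, so a single cannot even hold it exactly. You should expect an error.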
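If the data must stay in single but you need the exact total, one option is to accumulate in double, as in the question's own second call. Below is a minimal sketch that checks the result against the closed form n(n+1)(2n+1)/6 for the sum of squares; the variable names are made up for the example. I believe sum(Y,'double') does the same job as converting first, but check the sum documentation for your release.

```matlab
Y = single((0:499).^2);        % same data as in the question; each element is exact

sSingle = sum(Y);              % accumulates in single, so rounding error builds up
sDouble = sum(double(Y));      % converts first, then accumulates in double

% Closed form for 0^2 + 1^2 + ... + n^2, evaluated in double
n = 499;
exact = n*(n+1)*(2*n+1)/6;

fprintf('single: %d, double: %d, exact: %d\n', sSingle, sDouble, exact);
```

The double sum matches the closed form, while the single sum depends on the order in which the elements are accumulated, so the size of its error can vary between releases and hardware.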