floating-point arithmetic

Clearly,
|0.1234-0.123| = 0.0004
However, the following MATLAB result is slightly different:
>> 0.1234-0.123
ans =
3.999999999999976e-04
I think this happens because the IEEE754 conversion of 0.1234 and 0.123 have infinite digits.
Does anyone have more explanation?

Answers (2)

"I think this happens because the IEEE754 conversion of 0.1234 and 0.123 have infinite digits."
Correct.
"Does anyone have more explanation?"
I doubt it. Your explanation about covers it. (The relevant MATLAB documentation is in Floating-Point Numbers.)
EDIT (12 Nov 2022 at 11:44)
In double precision, we can rely on about 15 significant decimal digits.
See the eps function for details:
format long
FPP = eps
FPP =
2.220446049250313e-16
FPP = 1/eps
FPP =
4.503599627370496e+15
FPPlog2 = log2(eps)
FPPlog2 =
-52
FPPlog2 = log2(1/eps)
FPPlog2 =
52
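In practice, this means comparing floating-point results with a tolerance instead of ==. The following is only a minimal sketch: the choice of tol as the spacing of doubles near the larger operand is an assumption made for illustration, not a general rule.

```matlab
% Compare the computed difference with the expected value.
d = 0.1234 - 0.123;

% Exact comparison fails because d is not the double nearest to 0.0004.
isExact = (d == 0.0004);

% Assumed tolerance: the spacing of doubles near the larger operand.
tol = eps(0.1234);

% Tolerance-based comparison succeeds.
isClose = abs(d - 0.0004) <= tol;
```

Here isExact is false and isClose is true. A tolerance scaled to the operands is used because the rounding happened when 0.1234 and 0.123 were stored, not when the small result was formed.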
"... the IEEE754 conversion of 0.1234 and 0.123 have infinite digits ..."
Not sure what you mean by this statement. Neither of these numbers can be represented exactly in IEEE double-precision floating point, which of course uses a finite number of binary bits. Maybe you mean that an exact conversion of these specific decimal numbers to binary would require an infinite number of binary bits?
The exact binary-to-decimal conversions of the numbers that are stored (the closest representable values to the decimal numbers above) are:
fprintf('%.56f',0.1234)
0.12339999999999999580335696691690827719867229461669921875
fprintf('%.52f',0.123)
0.1229999999999999982236431605997495353221893310546875
fprintf('%.56f',0.1234 - 0.123)
0.00039999999999999757971380631715874187648296356201171875
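The three values above suggest that the subtraction itself adds no rounding: the third value is exactly the difference of the first two. The sketch below checks this with integer arithmetic. The scale factor 2^56 is an assumption that holds for these two inputs because both lie in [2^-4, 2^-3), and the variable names are only illustrative.

```matlab
a = 0.1234;
b = 0.123;

% Scaling by a power of 2 is exact, so ma and mb are the stored
% values written as integers (both below 2^53).
ma = a * 2^56;
mb = b * 2^56;
bothInt = (ma == floor(ma)) && (mb == floor(mb));

% Integer subtraction below 2^53 is exact in double precision.
dInt = ma - mb;

% If the floating-point subtraction is exact, scaling back matches it.
subExact = (dInt / 2^56 == a - b);
```

Both bothInt and subExact are true, so the whole discrepancy from 0.0004 comes from storing 0.1234 and 0.123 as the nearest doubles.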

8 Comments

IEEE 754 effectively uses INTEGER/PowerOf2 as the internal representation. It happens that every such value converts to a finite representation if you convert back to base 10. Not an "infinite" number of digits.
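As a rough illustration of the integer/power-of-2 form, the two-output form of log2 can split a double into a significand and an exponent. This is only a sketch; the names m and k are chosen here for illustration.

```matlab
x = 0.1234;

% log2 with two outputs returns x = f * 2^e with 0.5 <= abs(f) < 1.
[f, e] = log2(x);

% Scale the fraction to a 53-bit integer significand (exact, power of 2).
m = f * 2^53;

% Then x = m / 2^k with k = 53 - e.
k = 53 - e;

isInt  = (m == floor(m));   % the significand is an integer
isSame = (m / 2^k == x);    % the integer/power-of-2 form reproduces x
```

Since m / 2^k equals m * 5^k / 10^k, the decimal expansion terminates after at most k digits past the decimal point. For 0.1234 this gives k = 56, which matches the 56 digits printed by fprintf('%.56f',0.1234).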
The maximum number of digits in the base 10 representation is roughly 762, for the case of eps(realmin)
E.g., this smallest denormalized number:
s = sprintf('%750.750e',typecast(uint64(1),'double'))
s = '4.940656458412465441765687928682213723650598026143247644255856825006755072702087518652998363616359923797965646954457177309266567103559397963987747960107818781263007131903114045278458171678489821036887186360569987307230500063874091535649843873124733972731696151400317153853980741262385655911710266585566867681870395603106249319452715914924553293054565444011274801297099995419319894090804165633245247571478690147267801593552386115501348035264934720193790268107107491703332226844753335720832431936092382893458368060106011506169809753078342277318329247904982524730776375927247874656084778203734469699533647017972677717585125660551199131504891101451037862738167250955837389733598993664809941164205702637090279242767544565229087538682506419718265533447265625e-324'
numel(s)
ans = 757
An easier way to compute that number uses eps.
format hex
eps(0)
ans =
0000000000000001
typecast(uint64(1), 'double')
ans =
0000000000000001
format longg
eps(0)
ans =
4.94065645841247e-324
s1 = sprintf('%.760g', eps(realmin) )
s1 = '4.940656458412465441765687928682213723650598026143247644255856825006755072702087518652998363616359923797965646954457177309266567103559397963987747960107818781263007131903114045278458171678489821036887186360569987307230500063874091535649843873124733972731696151400317153853980741262385655911710266585566867681870395603106249319452715914924553293054565444011274801297099995419319894090804165633245247571478690147267801593552386115501348035264934720193790268107107491703332226844753335720832431936092382893458368060106011506169809753078342277318329247904982524730776375927247874656084778203734469699533647017972677717585125660551199131504891101451037862738167250955837389733598993664809941164205702637090279242767544565229087538682506419718265533447265625e-324'
s2 = sprintf('%.760g', eps(0))
s2 = '4.940656458412465441765687928682213723650598026143247644255856825006755072702087518652998363616359923797965646954457177309266567103559397963987747960107818781263007131903114045278458171678489821036887186360569987307230500063874091535649843873124733972731696151400317153853980741262385655911710266585566867681870395603106249319452715914924553293054565444011274801297099995419319894090804165633245247571478690147267801593552386115501348035264934720193790268107107491703332226844753335720832431936092382893458368060106011506169809753078342277318329247904982524730776375927247874656084778203734469699533647017972677717585125660551199131504891101451037862738167250955837389733598993664809941164205702637090279242767544565229087538682506419718265533447265625e-324'
Huh, I didn't know that eps(0) could be used!
Thank you.
The question is about the representation error in the IEEE 754 format. This error comes from the finite number of bits used in this number system.
It seems that we have to use a result like
fprintf('%.52f',0.123)
with care. For example, double precision uses 64 bits, of which 52 are for the mantissa, so the value printed by fprintf() includes the representation error. In double precision we can trust about 15 decimal digits.
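The "about 15 digits" rule can be checked on a stored value directly. The sketch below is illustrative only: it assumes the 15 and 17 significant-digit formats are the ones of interest, and it measures the relative error of the difference from the original question for comparison.

```matlab
x = 0.1234;

% Printing 15 significant digits gives back the decimal literal.
s15 = sprintf('%.15g', x);
recovers15 = strcmp(s15, '0.1234');

% Printing 17 significant digits identifies the stored double uniquely.
s17 = sprintf('%.17g', x);
roundTrip17 = (str2double(s17) == x);

% The difference of two nearby values keeps fewer trustworthy digits,
% because the input errors are large relative to the small result.
d = 0.1234 - 0.123;
relErr = abs(d - 0.0004) / 0.0004;
```

Both recovers15 and roundTrip17 are true, while relErr is on the order of 1e-15 to 1e-14, noticeably larger than eps. So the 15-digit rule describes each stored operand, not necessarily a result formed by subtracting nearly equal numbers.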
On macOS, fprintf gives exact results when asked to display enough digits. Historically, on Windows it displayed 0s after around 16 digits, and on Linux it produced more digits, but only roughly 30.
The background library code for Windows MATLAB changed in R2017b. Prior to that it displayed trailing 0s after about 16 digits, but from R2017b onwards it displays the exact conversion when enough digits are requested. This is what motivated my num2strexact( ) FEX submission many years ago.
