Precision of numbers problem (I think)

Question

Phillip 2021-10-1

0
链接

此问题的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/1464584-precision-of-numbers-problem-i-think

评论： Phillip 2021-10-1

new_data.mat

I have a matrix with 4 columns. I'd like to do something very simple, which is average all numbers in Column 4 given specific numbers in Column 2 but I end up with NaNs for 2 of the 7 means with the code below.

I have a sample "new_data.mat" attached.

This is the Matlab script I'm trying to get to work (rounding alone has not solved the problem):

clear all; clc; close all

alldir = dir(['SS*']);

P = pwd;

for d = 1:size(alldir,1)

PathName = [P, '\', alldir(d).name, '\Behaviour\'];

cd(PathName)

FileName = dir('ID*_PRIM_beh_all.mat');

new_data = cell2mat(struct2cell(load(FileName.name))); % example in my dropbox linked above

phys_diffs = 0:.15:.90;

for k = 1:7

step = phys_diffs(k);

avgdiffRT(k) = mean(new_data(new_data(:, 2) == step, 4));

end

alldiffRT(d, :) = avgdiffRT;

end

If I use the function "round" and take the loop out, i.e. average each of the 7 steps "manually" it works:

rounded = round(new_data(:, 2), 2);

avgRTphysdiff1 = mean(new_data(rounded == 0, 4));

avgRTphysdiff2 = mean(new_data(rounded == .15, 4));

etc

Does somebody have an idea how to get the loop to work?

1 个评论
显示 -1更早的评论隐藏 -1更早的评论

Stephen23 2021-10-1

编辑：Stephen23 2021-10-1

在 MATLAB Online 中打开

"Does somebody have an idea how to get the loop to work?"

Do NOT use ROUND. It is not a robust approach, and should be avoided.

Nor should you compare for exact equivalence of floating point numbers, e.g. using EQ or ISMEMBER.

The simple, robust, recommended approach is to compare the absolute difference against a tolerance:

tol = 1e-5; % pick the tolerance to suit your needs.
abs(A-B)<tol

请先登录，再进行评论。

请先登录，再回答此问题。

Answer 1

Konrad 2021-10-1

0
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/1464584-precision-of-numbers-problem-i-think#answer_799324

在 MATLAB Online 中打开

Hi Phillip,

yes, it has to do with precision.

ismember(phys_diffs,new_data(:,2))
% ans =
%   1×7 logical array
%    1   0   1   1   1   1   0

This shows that the 2nd and the last value in phy_diffs (.15 and .9) do not appear in new_data(:,2). (And the mean of nothing is NaN)

But values very close to these two numbers do appear. So what you could do is to round new_data(:,2), e.g. to two digits:

idx = round(new_data(:, 2),2) == step;
avgdiffRT(k) = mean(new_data(idx, 4));

Regards, Konrad

3 个评论
显示 1更早的评论隐藏 1更早的评论

John D'Errico 2021-10-1

编辑：John D'Errico 2021-10-1

在 MATLAB Online 中打开

Note that rounding a number to 2 significant digits does NOT produce an exact value.

X = 0.15;

Is X exactly 0.15? Of course not. MATLAB CANNOT represent that number, which can be thought of in fractional form as 15/100, or 3/20 in lowest terms.

sprintf('%0.55f',X)
ans = '0.1499999999999999944488848768742172978818416595458984375'

Will rounding to two digits help?

Y = round(X,2)
Y = 0.1500
sprintf('%0.55f',Y)
ans = '0.1499999999999999944488848768742172978818416595458984375'

As you can see, the result is unchanged. This is because no matter how hard you try, MATLAB cannot store many such fractions in their exact form. This happens for the same reason why you cannot represent 2/3 as a decimal. To do so exactly would require storing infinitely many decimal digits, since when represented in decimal form, we have:

2/3 = 0.666666666666666666666666666666666666666...

But why is that a problem with the fraction 3/20? Or even 1/10?

The problem arises because computers use binary storage methods to represent numbers. So if we look at how 3/20 is stored, we would see this expansion

0.0010011001100110011001100110011001100110011001100110011...

In that expansion, the ones represent negative powers of 2.

P = [-3 -6 -7 -10 -11 -14 -15 -18 -19 -22 -23 -26 -27 -30 -31 -34 -35 -38 -39 -42 -43 -46 -47 -50 -51 -54 -55];
format long g
sum(2.^P)
ans = 
                      0.15
sum(2.^P) - X
ans = 
     0

As you can see, that expansion does repreduce 3/20, as close as MATLAB can come. But to be exactly 3/20, the expansion needs to continue for an infinite number of terms. So 3/20 is a number without a finite binary expansion, just as 2/3 cannot be represented in a finite number of decimal digits. The pattern of 0's and 1's will repeat forever, and since MATLAB can use only a finite number of binary bits in the mantissa (52 of them) it can never represent most such fractions.

That is not to say ALL such fractions. Any (within limits) simple power of 2 will be exactly represented. That is, MATLAB will represent some fractions like 1/8, or 3/16 exactly. But these are either simple powers of 2, or they can be written as a sum of negative powers of 2.

sprintf('%0.55f',3/16)
ans = '0.1875000000000000000000000000000000000000000000000000000'

No problems there, as long as the denominator when written in lowest terms as a power of 2 (again within limits) MATLAB will succeed. For example, 2^-70 = 1/2^70.

sprintf('%0.150f',2^-70)
ans = '0.000000000000000000000847032947254300339068322500679641962051391601562500000000000000000000000000000000000000000000000000000000000000000000000000000000'

And in that number, there are now infinitely many trailing ZEROS in decimal form. In a binary form, we could think of that number like this

0.00000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000...

With only a single binary bit turned on.

Phillip 2021-10-1

Thank you very much for taking the time to write such a detailed explanation. Very useful!!!

请先登录，再进行评论。

Precision of numbers problem (I think)

1 个评论
显示 -1更早的评论隐藏 -1更早的评论

回答（1 个）

3 个评论
显示 1更早的评论隐藏 1更早的评论

另请参阅

类别

标签

产品

版本

Community Treasure Hunt

Precision of numbers problem (I think)

1 个评论 显示 -1更早的评论隐藏 -1更早的评论

回答（1 个）

3 个评论 显示 1更早的评论隐藏 1更早的评论

另请参阅

类别

标签

产品

版本

Community Treasure Hunt

1 个评论
显示 -1更早的评论隐藏 -1更早的评论

3 个评论
显示 1更早的评论隐藏 1更早的评论