Precision of numbers problem (I think)

1 次查看(过去 30 天)
Phillip
Phillip 2021-10-1
评论: Phillip 2021-10-1
I have a matrix with 4 columns. I'd like to do something very simple, which is average all numbers in Column 4 given specific numbers in Column 2 but I end up with NaNs for 2 of the 7 means with the code below.
I have a sample "new_data.mat" attached.
This is the Matlab script I'm trying to get to work (rounding alone has not solved the problem):
clear all; clc; close all
alldir = dir(['SS*']);
P = pwd;
for d = 1:size(alldir,1)
PathName = [P, '\', alldir(d).name, '\Behaviour\'];
cd(PathName)
FileName = dir('ID*_PRIM_beh_all.mat');
new_data = cell2mat(struct2cell(load(FileName.name))); % example in my dropbox linked above
phys_diffs = 0:.15:.90;
for k = 1:7
step = phys_diffs(k);
avgdiffRT(k) = mean(new_data(new_data(:, 2) == step, 4));
end
alldiffRT(d, :) = avgdiffRT;
end
If I use the function "round" and take the loop out, i.e. average each of the 7 steps "manually" it works:
rounded = round(new_data(:, 2), 2);
avgRTphysdiff1 = mean(new_data(rounded == 0, 4));
avgRTphysdiff2 = mean(new_data(rounded == .15, 4));
etc
Does somebody have an idea how to get the loop to work?
  1 个评论
Stephen23
Stephen23 2021-10-1
编辑:Stephen23 2021-10-1
"Does somebody have an idea how to get the loop to work?"
Do NOT use ROUND. It is not a robust approach, and should be avoided.
Nor should you compare for exact equivalence of floating point numbers, e.g. using EQ or ISMEMBER.
The simple, robust, recommended approach is to compare the absolute difference against a tolerance:
tol = 1e-5; % pick the tolerance to suit your needs.
abs(A-B)<tol

请先登录,再进行评论。

回答(1 个)

Konrad
Konrad 2021-10-1
Hi Phillip,
yes, it has to do with precision.
ismember(phys_diffs,new_data(:,2))
% ans =
% 1×7 logical array
% 1 0 1 1 1 1 0
This shows that the 2nd and the last value in phy_diffs (.15 and .9) do not appear in new_data(:,2). (And the mean of nothing is NaN)
But values very close to these two numbers do appear. So what you could do is to round new_data(:,2), e.g. to two digits:
idx = round(new_data(:, 2),2) == step;
avgdiffRT(k) = mean(new_data(idx, 4));
Regards, Konrad
  3 个评论
John D'Errico
John D'Errico 2021-10-1
编辑:John D'Errico 2021-10-1
Note that rounding a number to 2 significant digits does NOT produce an exact value.
X = 0.15;
Is X exactly 0.15? Of course not. MATLAB CANNOT represent that number, which can be thought of in fractional form as 15/100, or 3/20 in lowest terms.
sprintf('%0.55f',X)
ans = '0.1499999999999999944488848768742172978818416595458984375'
Will rounding to two digits help?
Y = round(X,2)
Y = 0.1500
sprintf('%0.55f',Y)
ans = '0.1499999999999999944488848768742172978818416595458984375'
As you can see, the result is unchanged. This is because no matter how hard you try, MATLAB cannot store many such fractions in their exact form. This happens for the same reason why you cannot represent 2/3 as a decimal. To do so exactly would require storing infinitely many decimal digits, since when represented in decimal form, we have:
2/3 = 0.666666666666666666666666666666666666666...
But why is that a problem with the fraction 3/20? Or even 1/10?
The problem arises because computers use binary storage methods to represent numbers. So if we look at how 3/20 is stored, we would see this expansion
0.0010011001100110011001100110011001100110011001100110011...
In that expansion, the ones represent negative powers of 2.
P = [-3 -6 -7 -10 -11 -14 -15 -18 -19 -22 -23 -26 -27 -30 -31 -34 -35 -38 -39 -42 -43 -46 -47 -50 -51 -54 -55];
format long g
sum(2.^P)
ans =
0.15
sum(2.^P) - X
ans =
0
As you can see, that expansion does repreduce 3/20, as close as MATLAB can come. But to be exactly 3/20, the expansion needs to continue for an infinite number of terms. So 3/20 is a number without a finite binary expansion, just as 2/3 cannot be represented in a finite number of decimal digits. The pattern of 0's and 1's will repeat forever, and since MATLAB can use only a finite number of binary bits in the mantissa (52 of them) it can never represent most such fractions.
That is not to say ALL such fractions. Any (within limits) simple power of 2 will be exactly represented. That is, MATLAB will represent some fractions like 1/8, or 3/16 exactly. But these are either simple powers of 2, or they can be written as a sum of negative powers of 2.
sprintf('%0.55f',3/16)
ans = '0.1875000000000000000000000000000000000000000000000000000'
No problems there, as long as the denominator when written in lowest terms as a power of 2 (again within limits) MATLAB will succeed. For example, 2^-70 = 1/2^70.
sprintf('%0.150f',2^-70)
ans = '0.000000000000000000000847032947254300339068322500679641962051391601562500000000000000000000000000000000000000000000000000000000000000000000000000000000'
And in that number, there are now infinitely many trailing ZEROS in decimal form. In a binary form, we could think of that number like this
0.00000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000...
With only a single binary bit turned on.
Phillip
Phillip 2021-10-1
Thank you very much for taking the time to write such a detailed explanation. Very useful!!!

请先登录,再进行评论。

类别

Help CenterFile Exchange 中查找有关 Loops and Conditional Statements 的更多信息

产品


版本

R2021b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by