Replacing certain values in table with NaN based on different conditions

95 次查看(过去 30 天)
Hello everyone,
I'm basically new to Matlab and seem to be at a loss with my current problem - so any help would be greatly appreciated.
I have some data stored in a table called T that is 724 rows x 365 variables in size.
Now for some variables, I need to exclude extreme values based on different conditions before continuing on with my statistical analysis. But since these don't warrant excluding the whole dataset (row), I would like to replace those values with NaN in my table T.
As of now, this concerns only the variable in column 129, were values should be replaced with NaN if they are equal to or less than 3.1006 as well as column 131 were values should be replaced with NaN if they are equal to or less than 4.8673.
I have tried to get this to work for hours with several variations of logical indexing and am now stuck at this:
extremevalues = (T{:,129} <= 3.1006) | (T{:,131} <= 4.8673);
T(extremevalues) = NaN;
This does not give me any errors - which is the farthest I've come so far - however, it does not replace those extreme value in the respective columns with NaNs (there are no NaNs to be seen in either one of the specified columns in T, extremevalues however gives me several "1"s when quickly scrolling through it).
I've also tried substituting T{:, variablenr} with T.variablename - that didn't change anything either though.
I feel like the solution should be pretty easy, but I can't seem to find it (or am simply not experienced enough to find it).
Thanks very much in advance!

回答(2 个)

darova
darova 2021-8-16
Maybe you need to convert your data into matrix first
A = randi(10,5,4)
A = 5×4 table
Var1 Var2 Var3 Var4 ____ ____ ____ ____ 7 8 8 2 1 2 1 5 3 7 1 7 1 5 5 7 2 10 8 4
ind = A(:,1) < 5; % have this error when using 'table'
Operator '<' is not supported for operands of type 'table'.
A(ind,:) = nan
  1 个评论
Liza M
Liza M 2021-8-17
编辑:Liza M 2021-8-17
I think that would've worked if my table weren't of a mixed data type (there are other variables besides numeric ones in there as well).
Anyways, I just discovered the rmoutliers function, which does the thing I need with the specified variables..in just a lot less complicated way than what I've tried.
Thanks again regardless for your quick help!

请先登录,再进行评论。


Ive J
Ive J 2021-8-17
For table, you need to directly access the variable names/column indices. so, T(extremevalues) = NaN; won't work.
This may help:
t = splitvars(table(reshape(1:20, 5, 4)))
t = 5×4 table
Var1_1 Var1_2 Var1_3 Var1_4 ______ ______ ______ ______ 1 6 11 16 2 7 12 17 3 8 13 18 4 9 14 19 5 10 15 20
% set values below 8 in column 2 to nan
colN = 2;
cutoff = 8;
t.(colN)(t.(colN) < cutoff) = nan
t = 5×4 table
Var1_1 Var1_2 Var1_3 Var1_4 ______ ______ ______ ______ 1 NaN 11 16 2 NaN 12 17 3 8 13 18 4 9 14 19 5 10 15 20

类别

Help CenterFile Exchange 中查找有关 NaNs 的更多信息

标签

产品


版本

R2021a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by