Converting Categorical Array/Table to Numerical

Hello!
I have a 23000x4 table of data (called temp). It is all numbers, except that wherever data was missing it was filled in with 'NA', so the table is categorical. I would like to convert it to numeric and have every 'NA' changed to NaN so I can run max, min, and similar functions on it without running into issues. Thank you for the help!
  12 comments
Claire Hollow 2020-6-10
The data I need to use came from a climate station in a CSV file. When I imported it, it came in as categorical because of its contents.
Adam Danz 2020-6-10
What function did you use to import it? You can control the class of the data you're importing. A csv file doesn't control that for numeric values.
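As a rough sketch of what controlling the class at import could look like, the snippet below reads a CSV with 'NA' placeholders directly as double. The file contents and the column names tmax and tmin are made up for illustration, and it assumes the file is read with readtable; the real climate-station file may need different options.

% Write a small hypothetical CSV that mimics a file with 'NA' placeholders
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'tmax,tmin\n12.5,3.1\nNA,2.0\n14.0,NA\n');
fclose(fid);

% Import every column as double and treat the text 'NA' as missing (NaN)
opts = detectImportOptions(fname);
opts = setvartype(opts, 'double');
opts = setvaropts(opts, 'TreatAsMissing', 'NA');
temp = readtable(fname, opts);

% max and min now work column-wise and skip the NaN entries
colMax = max(temp{:,:}, [], 1, 'omitnan');
colMin = min(temp{:,:}, [], 1, 'omitnan');
delete(fname);

With an import like this, no categorical columns are created, so no conversion step is needed afterwards.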


Accepted Answer

Adam Danz 2020-6-10
Edited: Adam Danz 2022-9-23
By far the best solution is to avoid representing numeric values as categorical values in the first place. If you can undo that at the import step, do so.
If that cannot be done, here's how to convert categorical values that represent numbers in a table.
Table T can contain mixed classes (some classes may cause errors).
This demo detects which columns of T are categorical. It then creates an output table T_converted that contains the non-categorical columns of T together with the categorical-number columns converted to numbers.
% Create demo table with a mix of strings, categorical numerals, and numbers
T = table(["A";"B";"C";"D";"E"], ...
categorical(randi(10,5,1)), ...
randi(10,5,1), ...
categorical(randi(10,5,1)), ...
'VariableNames', {'A','B','C','D'})
T = 5×4 table
     A      B     C    D
    ___    __    _    _
    "A"    2     3    4
    "B"    3     4    8
    "C"    9     6    8
    "D"    9     4    5
    "E"    10    5    6
varfun(@class, T)
ans = 1×4 table
    class_A      class_B      class_C      class_D
    _______    ___________    _______    ___________
    string     categorical    double     categorical
% Determine which columns are categoricals
% NOTE: This assumes you want to convert all categorical table variables
% to numeric. Otherwise, additional column indexing will be needed.
iscat = varfun(@iscategorical, T,'OutputFormat','Uniform');
% Convert the categorical table variables to numeric
Tnum = array2table(str2double(string(T{:,iscat})), ...
'VariableNames', T.Properties.VariableNames(iscat));
% Create an updated table with the converted data and maintain
% original column order
T_converted = [T(:, ~iscat), Tnum];
% For each original column, find its position in T_converted
[~,colorder] = ismember(T.Properties.VariableNames, T_converted.Properties.VariableNames);
T_converted(:,colorder)
ans = 5×4 table
     A      B     C    D
    ___    __    _    _
    "A"    2     3    4
    "B"    3     4    8
    "C"    9     6    8
    "D"    9     4    5
    "E"    10    5    6
varfun(@class, T_converted)
ans = 1×4 table
    class_A    class_C    class_B    class_D
    _______    _______    _______    _______
    string     double     double     double
This answer was corrected on 9/23/22; thanks to Ahmed Rady for pointing out the problem.
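For the case in the question, where every column is categorical and missing entries are the text 'NA', the same conversion might look like the sketch below. The small table and its column names tmax and tmin are hypothetical stand-ins for the asker's 23000x4 table.

% Hypothetical stand-in for the asker's table: categorical columns with 'NA'
temp = table(categorical({'12.5';'NA';'14'}), categorical({'3.1';'2';'NA'}), ...
    'VariableNames', {'tmax','tmin'});

% Same conversion as above; 'NA' is not a valid number, so str2double returns NaN
tempNum = array2table(str2double(string(temp{:,:})), ...
    'VariableNames', temp.Properties.VariableNames);

% max and min skip NaN values by default
colMax = max(tempNum{:,:}, [], 1);
colMin = min(tempNum{:,:}, [], 1);

Because str2double returns NaN for any text it cannot parse, other non-numeric placeholders would also become NaN, which may or may not be what you want.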
  2 comments
Ahmed Rady 2022-9-23
Edited: Ahmed Rady 2022-9-23
Hi Adam
Thanks for the code
But it seems that the numeric values in the original table were transformed.
I thought the aim was to change only the categorical variables.
Adam Danz 2022-9-23
Thanks @Ahmed Rady, I've updated the answer to correct the mistake.
My previous answer applied double() to the categorical numbers, which converts the categoricals into their grouping numbers (category indices) rather than the numbers represented by the category names.
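A small illustration of the difference, using made-up values:

% Categories are the sorted unique values: 10, 20, 30
c = categorical([10; 30; 20; 30]);

% double() returns the position of each element's category, not its value
asCodes = double(c);                  % [1; 3; 2; 3]

% Going through string recovers the numbers the category names represent
asValues = str2double(string(c));     % [10; 30; 20; 30]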
