How to convert categorical data to numeric in separate columns?
% Hi! I have a dataset 'data5' with a column 'Location' which contains values Asia, US and Africa.
% I'm wanting to convert it to 3 separate columns, one for each location, which contains a 1 if the row is from that location and 0 otherwise
% This is the function I have created:
function data = categorical_values(data, var)
uniques = unique(var);
for i = 1:length(uniques)
values(:, i) = double(ismember(var, uniques(i)));
end
t = table;
[rows, cols] = size(values);
for i = 1:cols
t1 = table(values(:, i));
t1.Properties.VariableNames = uniques(i);
t = [t t1];
end
data = [t data];
end
% And this is the code I have been running, in a file called prep.m:
new = categorical_values(data5, data5.Location);
new.Location = []; % delete the old Location column
% I have been getting this error:
Error using categorical_values (line 11)
The VariableNames property is a cell array of character vectors. To
assign multiple variable names, specify names in a string array or a cell
array of character vectors.
Error in prep (line 16)
new = categorical_values(data5, data5.Location);
% Can anyone help??????? Thanks!
Answers (1)
Adam Danz
2020-8-10
Edited: Adam Danz
2020-10-26
Here's a more efficient solution.
% Create demo data
location = categorical({'Asia','US','Asia','Africa','Africa','US','US','Asia'}');
unqCountries = unique(location(:)')
% Create matrix of 1s and 0s.
% Columns are identified by "unqCountries"
countryIdx = location(:) == unqCountries
% If you want to turn it into a table
T = array2table(countryIdx, 'VariableNames', string(unqCountries))
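To get from this demo to what the question asks for (indicator columns attached to the original table, with the old column removed), something like the following should work. This is only a sketch: the table data5 built here is a made-up stand-in with a hypothetical ID column, and the indicators are converted to double so they hold 1 and 0 rather than true and false.

```matlab
% Hypothetical stand-in for the asker's table 'data5' (the ID column is invented)
data5 = table((1:8)', ...
    categorical({'Asia','US','Asia','Africa','Africa','US','US','Asia'}'), ...
    'VariableNames', {'ID','Location'});

% One row vector of unique categories; comparing column == row expands to n-by-k
cats = unique(data5.Location(:)');
indicator = double(data5.Location(:) == cats);   % 1 where the row matches the category

% Name each indicator column after its category, then prepend to the original table
T = array2table(indicator, 'VariableNames', string(cats));
new = [T data5];
new.Location = [];   % delete the old Location column
```

Each row of the three new columns contains exactly one 1, because every row belongs to exactly one location.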
The error you're getting occurs because you're assigning a categorical value as a table variable name, which must be a character vector or string. Convert it to a string:
t1.Properties.VariableNames = string(uniques(i));
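If you would rather keep your own function, here is one possible way to restructure it. It is a sketch, not a drop-in replacement: it appends the new columns to the end of the table instead of prepending them, and it assumes every category name is a valid table variable name and does not clash with an existing column.

```matlab
function data = categorical_values(data, var)
% Append one 0/1 indicator column per unique value of var to the table data.
uniques = unique(var);
for i = 1:numel(uniques)
    % char() converts the categorical value to text usable as a variable name
    name = char(uniques(i));
    % ismember returns a logical column; double() turns it into 1s and 0s
    data.(name) = double(ismember(var, uniques(i)));
end
end
```

It is called the same way as before, e.g. new = categorical_values(data5, data5.Location); followed by new.Location = []; to drop the original column.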
4 Comments
Adam Danz
2020-10-26
"Is this same as dummy coding or One Hot Encoding?"
The table T can be used as dummy variables: each column holds logical values (true|false) indicating whether the row belongs to that category, which is similar to how dummy variables are used in regression.
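For comparison, the dummyvar function produces the same kind of indicator matrix directly. The sketch below assumes the Statistics and Machine Learning Toolbox is installed; the variable names D, T and Dref are just illustrative.

```matlab
% Same demo data as in the answer
location = categorical({'Asia','US','Asia','Africa','Africa','US','US','Asia'}');

% dummyvar (Statistics and Machine Learning Toolbox) returns an n-by-k matrix of 0s and 1s,
% one column per category, in the order given by categories(location)
D = dummyvar(location);
T = array2table(D, 'VariableNames', categories(location)');

% For a regression model with an intercept, one column is usually dropped
% so that the remaining columns are not collinear with the constant term
Dref = D(:, 2:end);   % treats the first category as the reference level
```

Keeping all k columns is what is usually called one-hot encoding; dropping one reference column gives the k-1 dummy variables typically used in regression.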