Convert categorical column to numeric in a matrix?

I have a 147×26 matrix, which I have attached here. Some columns are categorical data and I want to convert them to numeric. What should I do? Thanks for your help.

Answers (3)

Yuvaraj Venkataswamy
  3 comments
dpb 2018-8-17
Well, that's entirely different than being categorical.



dpb 2018-8-17
>> Stations = {'S1';'S2';'S1';'S3';'S2'};
>> Stations = categorical(Stations)
Stations =
5×1 categorical array
S1
S2
S1
S3
S2
>> double(Stations)
ans =
1
2
1
3
2
>>
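To spell out what the session above shows: double() on a categorical array returns each element's position in the array's category list, not a value parsed from the label. A minimal sketch of that round trip follows; the variable names names, codes and recovered are chosen here for illustration only.

```matlab
% Build the same categorical array as in the session above.
Stations = categorical({'S1';'S2';'S1';'S3';'S2'});

% Category names, in the order that defines the numeric codes.
names = categories(Stations);

% double() returns each element's index into that category list.
codes = double(Stations);

% The codes map back to the original labels through the category list.
recovered = categorical(names(codes));
sameAsOriginal = isequal(recovered, Stations);
```

Keep the category list alongside the codes if the labels matter later, since the numbers alone do not say which station is which.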

dpb 2018-8-17
The follow-up is a totally different question from the original.
You don't show how you "imported" the file, but, for example:
>> [~,~,r]=xlsread('tra2.xls');
>> cell2mat(r(2:10,2))
ans =
3
1
14
13
1
9
5
17
1
>>
With the raw cell array you have to skip the header row before converting, hence 2 as the starting row index.
Alternatively, and much easier to deal with in the end, use readtable:
>> t=readtable('tra2.xls');
Warning: Variable names were modified to make them valid MATLAB identifiers. The original names are saved in the VariableDescriptions property.
>> t(1:10,:)
ans =
10×26 table
Normalized_losses Make Fuel_type Aspiration Num_of_doors Body_style Drive_wheels Engine_location Wheel_base Length Width Height Curb_weight Engine_type Num_of_cylinders Engine_size Fuel_system Bore Stroke Compression_ratio Horsepower Peak_rpm City_mpg Highway_mpg Price Symboling
_________________ ____ _________ __________ ____________ ___________ ____________ _______________ __________ ______ _____ ______ ___________ ___________ ________________ ___________ ___________ ____ ______ _________________ __________ ________ ________ ___________ _____ _________
110 3 'gas' 'std' 'four' 'sedan' 'fwd' 'front' 96.5 163.4 64 54.5 2010 'ohc' 'four' 92 '1bbl' 2.91 3.41 9.2 76 6000 30 34 7295 '0'
134 1 'gas' 'std' 'two' 'hatchback' 'rwd' 'front' 98.4 176.2 65.6 52 2551 'ohc' 'four' 146 'mpfi' 3.62 3.5 9.3 116 4800 24 30 9989 '2'
125 14 'gas' 'std' 'four' 'sedan' 'fwd' 'front' 96.3 172.4 65.4 51.6 2405 'ohc' 'four' 122 '2bbl' 3.35 3.46 8.5 88 5000 25 32 8189 '1'
104 13 'gas' 'turbo' 'four' 'sedan' 'fwd' 'front' 99.1 186.6 66.5 56.1 2847 'dohc' 'four' 121 'mpfi' 3.54 3.07 9 160 5500 19 26 18620 '2'
91 1 'gas' 'std' 'four' 'sedan' 'fwd' 'front' 95.7 166.3 64.4 52.8 2140 'ohc' 'four' 98 '2bbl' 3.19 3.03 9 70 4800 28 34 9258 '0'
115 9 'gas' 'std' 'four' 'hatchback' 'fwd' 'front' 98.8 177.8 66.5 55.5 2425 'ohc' 'four' 122 '2bbl' 3.39 3.39 8.6 84 4800 26 32 11245 '0'
122 5 'gas' 'std' 'four' 'sedan' 'fwd' 'front' 94.5 165.3 63.8 54.5 1938 'ohc' 'four' 97 '2bbl' 3.15 3.29 9.4 69 5200 31 37 6849 '1'
121 17 'gas' 'std' 'two' 'hatchback' 'fwd' 'front' 88.4 141.1 60.3 53.2 1488 'l' 'three' 61 '2bbl' 2.91 3.03 9.5 48 5100 47 53 5151 '2'
134 1 'gas' 'std' 'two' 'hardtop' 'rwd' 'front' 98.4 176.2 65.6 52 2540 'ohc' 'four' 146 'mpfi' 3.62 3.5 9.3 116 4800 24 30 8449 '2'
93 15 'diesel' 'turbo' 'four' 'wagon' 'rwd' 'front' 110 190.9 70.3 58.7 3750 'ohc' 'five' 183 'idi' 3.58 3.64 21.5 123 4350 22 25 28248 '-1'
>>
>> t.Make(1:10)
ans =
3
1
14
13
1
9
5
17
1
15
>>
and Make will be double by default. I'd suggest, though, that converting it to categorical rather than keeping it numeric would probably be beneficial.
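As a rough sketch of that suggestion, the snippet below converts a numeric code column and a text column to categorical. The small table is a hypothetical stand-in for a few rows of the one read from 'tra2.xls', and makeIndex and makeValues are names made up for this example.

```matlab
% Hypothetical stand-in for a few rows of the table read from 'tra2.xls'.
t = table([3;1;14;13;1], {'gas';'gas';'gas';'gas';'diesel'}, ...
    'VariableNames', {'Make','Fuel_type'});

% Treat the numeric make codes as labels rather than quantities.
t.Make = categorical(t.Make);

% Text columns convert the same way.
t.Fuel_type = categorical(t.Fuel_type);

% double() now gives the category index, not the original code ...
makeIndex = double(t.Make);

% ... so go through the category names to get the original numbers back.
makeValues = str2double(cellstr(t.Make));
```

Note the difference between the last two lines: once Make is categorical, double() no longer returns the codes stored in the spreadsheet.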
  4 comments
z donyavi 2018-8-17
I use this command to import the data:
[~,~,raw] = xlsread('tra2.xls')
dpb 2018-8-17
Edited: dpb 2018-8-18
Well, that was the first response I showed you. I just did a few elements instead of the whole column. Replace (2:10,2) with (2:end,2). Again, note that with this approach you must manually skip the header row, which can't be converted, and you will have to keep doing that every time you access anything.
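A minimal sketch of the (2:end,2) form, using a small hand-built cell array as a hypothetical stand-in for the raw output of xlsread (the names raw, make and fuel are just for this example):

```matlab
% Hypothetical stand-in for the raw cell array returned by xlsread:
% row 1 holds the headers, the remaining rows hold the data.
raw = {'Normalized_losses', 'Make', 'Fuel_type'; ...
       110,                 3,      'gas';       ...
       134,                 1,      'gas';       ...
       125,                 14,     'gas'};

% Skip the header row, then convert the whole numeric column at once.
make = cell2mat(raw(2:end, 2));

% Text columns from the same cell array can go straight to categorical.
fuel = categorical(raw(2:end, 3));

% Including row 1 would mix char and double in one column, which
% cell2mat rejects, so the header row has to be excluded every time.
```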
If you drop xlsread and the cell array in favor of readtable and a table, you'll get along a lot faster going forward.
And, that kind of a variable really ought to be categorical, not numeric, as should a significant number of the others.
Here is a subset: the first 10 rows of the first six variables. Of those, four should be categorical, assuming the losses are actually a measured or calculated response. One that is stored as text would make a lot of sense as numeric, although if it is only used for classification, as is quite probable, it could well be left as categorical, too.
>> t(1:10,1:6)
ans =
10×6 table
Normalized_losses    Make    Fuel_type    Aspiration    Num_of_doors    Body_style
_________________    ____    _________    __________    ____________    ___________
110                  3       'gas'        'std'         'four'          'sedan'
134                  1       'gas'        'std'         'two'           'hatchback'
125                  14      'gas'        'std'         'four'          'sedan'
104                  13      'gas'        'turbo'       'four'          'sedan'
91                   1       'gas'        'std'         'four'          'sedan'
115                  9       'gas'        'std'         'four'          'hatchback'
122                  5       'gas'        'std'         'four'          'sedan'
121                  17      'gas'        'std'         'two'           'hatchback'
134                  1       'gas'        'std'         'two'           'hardtop'
93                   15      'diesel'     'turbo'       'four'          'wagon'
>>
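For the column that is stored as text but really holds a count, one possible approach is a lookup from the spelled-out word to its number. The sketch below assumes that column is Num_of_doors (the thread does not name it) and that 'two' and 'four' are the only spellings present; doors, words, values and numDoors are hypothetical names.

```matlab
% Hypothetical sample of the Num_of_doors column from the table above.
doors = {'four';'two';'four';'four';'two'};

% Assumed complete list of spellings and the numbers they stand for.
words  = {'two';'four'};
values = [2; 4];

% Look up each entry; anything not in the list is left as NaN.
[found, idx] = ismember(doors, words);
numDoors = nan(size(doors));
numDoors(found) = values(idx(found));

% Purely descriptive text columns are better kept as categorical.
bodyStyle = categorical({'sedan';'hatchback';'sedan';'sedan';'hatchback'});
```

Leaving unmatched entries as NaN makes any unexpected spelling easy to spot afterwards with isnan.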
