SVM with Dummy Variables

Context: I have a cell array with 19 categorical (nominal) features as columns and ~1500 observations as rows. I looped through the columns and used double(dummyvar(nominal(featureVector))) to convert each feature into dummy variables (columns of 1s and 0s), and everything looks right.
Problem: When I try to pass this cell array as the input data X to fitcsvm(), I get an error because it expects X to be a floating-point matrix:
Error using ClassificationSVM.prepareData (line 602)
You can pass only floating-point data for X to SVM.
If I convert the cell array into a matrix, each dummy variable vector becomes its own column. The columns then lose their identity as dummy variables, because fitcsvm() treats every column as a separate predictor and sees (number of features) * (number of categories in each feature) predictors. So I don't see how I can use dummy variables with an SVM in MATLAB, which is mind-boggling, since this must be a basic problem that many people run into.
Thanks so much for your help!

Accepted Answer

Ilya 2015-7-29
Just convert your cell array into a matrix. Yes, dummy variables will lose their identity in the sense that different levels of a categorical predictor will be treated as different predictors. This is common practice though.
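
For illustration, here is a minimal sketch of that conversion on toy data. The names `features`, `blocks`, and `groupIdx` are hypothetical and not from the original post; `groupIdx` is just one possible way to remember which dummy columns came from which original feature. It assumes a binary label vector `Y`, since fitcsvm() trains two-class models.

```matlab
% Toy data: 6 observations, 2 nominal features (hypothetical values).
features = {'red',   'small';
            'blue',  'large';
            'red',   'large';
            'green', 'small';
            'blue',  'small';
            'green', 'large'};
Y = [1; 0; 1; 0; 0; 1];              % binary class labels

numFeatures = size(features, 2);
blocks   = cell(1, numFeatures);     % one dummy block per original feature
groupIdx = [];                       % original feature index of each dummy column

for j = 1:numFeatures
    % n-by-(number of levels) matrix of 0s and 1s, same call as in the question
    D = double(dummyvar(nominal(features(:, j))));
    blocks{j} = D;
    groupIdx  = [groupIdx, repmat(j, 1, size(D, 2))]; %#ok<AGROW>
end

X = [blocks{:}];                     % concatenate blocks into one double matrix

svmModel = fitcsvm(X, Y);            % X is now floating-point, as fitcsvm requires
```

Here the first feature has three levels and the second has two, so X has five columns. The SVM treats them as five predictors, while `groupIdx` keeps the mapping back to the two original features for later use.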
  1 Comment
Melissa McCoy 2015-8-13
Many thanks for your answer earlier!
Quick follow-up question: how then does sequential feature selection work? I've tried to implement it with sequentialfs(), but obviously it doesn't realize that, for example, the first 3 columns actually refer to one feature, and it just takes the first column. I've posted my question here if helpful: http://www.mathworks.com/matlabcentral/answers/233803-sequentialfs-with-dummified-input-feature-matrix
Many thanks again!
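
One possible workaround, offered only as a sketch and not as something this thread confirms, is to skip sequentialfs() and run a greedy forward selection over whole groups of dummy columns. The example below uses synthetic data; `groupIdx` (a hypothetical vector mapping each dummy column to its original feature), the 5-fold setup, and the stopping rule are all assumptions made for illustration.

```matlab
% Synthetic data: 3 categorical features coded as positive integers.
rng(1);                                   % reproducible data and folds
n  = 200;
f1 = randi(3, n, 1);
f2 = randi(2, n, 1);
f3 = randi(4, n, 1);
Y  = double(f1 == 1);                     % label depends only on feature 1

X        = [dummyvar(f1), dummyvar(f2), dummyvar(f3)];
groupIdx = [1 1 1, 2 2, 3 3 3 3];         % original feature of each dummy column

cvp = cvpartition(Y, 'KFold', 5);         % same folds for every candidate set

numGroups = max(groupIdx);
selected  = [];                           % original features chosen so far
bestLoss  = Inf;

while numel(selected) < numGroups
    candidates = setdiff(1:numGroups, selected);
    losses = zeros(size(candidates));
    for k = 1:numel(candidates)
        % Add ALL dummy columns of the candidate feature at once
        cols      = ismember(groupIdx, [selected, candidates(k)]);
        cvModel   = fitcsvm(X(:, cols), Y, 'CVPartition', cvp);
        losses(k) = kfoldLoss(cvModel);   % cross-validated misclassification rate
    end
    [minLoss, kBest] = min(losses);
    if minLoss >= bestLoss
        break;                            % stop when no candidate improves the loss
    end
    bestLoss = minLoss;
    selected = [selected, candidates(kBest)]; %#ok<AGROW>
end
```

After the loop, `selected` holds indices of original features rather than individual dummy columns, so a feature is always included or excluded as a whole.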
