Combining text and numeric matrices

My dataset has 6 predictor columns (all ordinal text values, e.g. good, better, best) and 1 response column (ordinal numeric values, e.g. 1, 2, 3). When I try to combine these into 7 columns for a classification study, I get the following error: 'Error using horzcat. Dimensions of matrices being concatenated are not consistent.' Any suggestions?

10 Comments

Can you share your data in a MAT-file, along with your code?
When you say ordinal, do you mean you are using categorical variables?
The matrix looks like the attached picture. I can convert the categorical variables (which are also ordinal) into numeric values, but assigning arbitrary numeric values to the text values seems unjustified to me. At the same time, logistic regression won't classify the matrix in its original form.
It appears you would have to convert the inputs to numeric, but not the response variable.
Walter Roberson,
It isn't working. The logistic regression app in MATLAB won't even recognize the matrix (XYO) on which I want to run logistic regression. I did exactly what you suggested.
First I converted the predictor text to numeric values (e.g. 1, 2, 3). Then:
XY=[Predictor1 Predictor2 Predictor3];
XY=num2cell(XY);
XYO=[XY Outcome];
If you use the mnrfit() routine, you would not convert XY to a cell array, and you would pass the outcomes as the second parameter rather than building a single XYO matrix.
I tried this as well. It still won't accept the response variable as a cell array (text entries). It displays the following error:
Error using mnrfit (line 142) Inputs must be floats, namely single or double.
The R2017b documentation says that the Y may be categorical.
Oh! I have R2015b. Walter Roberson, is there any other method you know of?
Response values, specified as a column vector or a matrix. Y can be one of the following:
  • An n-by-k matrix, where Y(i,j) is the number of outcomes of the multinomial category j for the predictor combinations given by X(i,:). In this case, the number of observations are made at each predictor combination.
  • An n-by-1 column vector of scalar integers from 1 to k indicating the value of the response for each observation. In this case, all sample sizes are 1.
  • An n-by-1 categorical array indicating the nominal or ordinal value of the response for each observation. In this case, all sample sizes are 1.
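The second form listed above (an n-by-1 vector of integers from 1 to k) does not rely on a categorical response, so it may be the easiest route on an older release. A minimal sketch, assuming the predictors have already been coded as numbers; the data here is random and purely hypothetical, and the choice of the 'ordinal' model is an assumption based on the response being ordered:

```matlab
% Hypothetical data: 200 observations, 3 predictors already coded as 1..3
rng(0);                          % reproducible random numbers
X = randi(3, 200, 3);            % numeric predictors (class double)
Y = randi(3, 200, 1);            % response as integers 1..k, here k = 3

% Fit an ordinal multinomial logistic regression.
% X and Y are passed separately; nothing is wrapped in a cell array.
B = mnrfit(X, Y, 'model', 'ordinal');

% Predicted probability of each response category, one row per observation
P = mnrval(B, X, 'model', 'ordinal');
```

Both X and Y stay plain double arrays here, which is what the "Inputs must be floats" error message is asking for.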


Answers (1)

I am going to assume that your predictors matrix is an m-by-6 cell array.
temp = randi(3,10,6);
predictors = cell(10,6);
predictors(temp==1) = {'good'};
predictors(temp==2) = {'better'};
predictors(temp==3) = {'best'};
response = randi(3,10,1);
This results in:
predictors =
{'good' } {'good' } ...
{'better'} {'best' } ...
... ...
and
response =
1
2
...
When you want to combine them, you need to convert your numeric array 'response' into a cell array to match the type of 'predictors':
combined = [predictors, num2cell(response)];
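As a rough illustration of why the original concatenation fails: when a cell array is concatenated with a non-cell array, MATLAB places the whole non-cell array into a single 1-by-1 cell, so the row counts no longer match. The small arrays below are made up for the demonstration:

```matlab
predictors = repmat({'good', 'better', 'best'}, 4, 2);  % 4-by-6 cell of text
response   = [1; 2; 3; 1];                               % 4-by-1 double

try
    % 'response' is wrapped into one 1-by-1 cell, so 4 rows meet 1 row
    combined = [predictors, response];
catch err
    disp(err.message);           % horzcat reports the dimension mismatch
end

% num2cell puts each number in its own cell, giving a 4-by-1 cell
combined = [predictors, num2cell(response)];
disp(size(combined));            % size of the combined cell array
```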

1 Comment

Hey Kai Domhardt! Thank you. This is helpful.
However, I'm not able to perform logistic regression on the dataset. Can logistic regression be performed on the 'combined' matrix that you've just generated?
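Since mnrfit reports that its inputs must be single or double, a mixed cell array like 'combined' is unlikely to be accepted as is. One possible approach, sketched here under the assumption that the level order is good < better < best, is to map each text value to its position in an ordered list of levels and pass predictors and response separately. The variable name 'levels' and the random data are hypothetical:

```matlab
rng(0);                                   % reproducible random numbers
levels = {'good', 'better', 'best'};      % assumed order, lowest to highest
temp = randi(3, 200, 6);
predictors = levels(temp);                % 200-by-6 cell array of text
response = randi(3, 200, 1);              % integers 1..3

% The position of each text value in 'levels' is its ordinal code (0 if absent)
[found, X] = ismember(predictors, levels);
X = reshape(X, size(predictors));         % keep the 200-by-6 shape explicit
assert(all(found(:)), 'Unexpected predictor label');

% Predictors and response go in as separate numeric arguments
B = mnrfit(X, response, 'model', 'ordinal');
```

Integer codes treat the levels as evenly spaced. If that seems unjustified, indicator (dummy) coding of each predictor, for example with dummyvar, is an alternative worth considering.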
