Creating numerical variables from categorical variables in an unbalanced dataset
2 次查看(过去 30 天)
显示 更早的评论
Hello there,
I would like to apply Random Forrest method in a highly unbalanced dataset that includes both numerical and categoorical variables.In order to improve my classification results, before applying the method for classification I thought to create synthtic datasets using the SMOTE and the ADASYN algorithm. However, both methods work only with numerical variables, therefore, I would like to ask if you have any suggestion regarding the way to transform my categorical variables into numerical ones.
With many thanks in advance for your help
0 个评论
采纳的回答
Lei Hou
2020-2-14
Hi Grigorios,
You can do something as the following.
catVar = categorical(["a" "b" "c" "b" "a"]);
numValue = [0.1 3 100]; % The order of numbers refers to the order of categories returned by categories(catVar)
numVar = numValue(catVar)
Hoping my solution helpful to you.
更多回答(0 个)
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Probability Distributions 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!