Recode Categorical Variable into New Binary Variables
% OUTPUT
% Returns your dataset with N-1 new binary variables recoded from a
% categorical variable that has N categories.
% The Nth category does not get its own variable; it is represented by all 0's or all -1's across the new variables, depending on recoding_type.
% So list the least important category last in the category_values parameter.
% Also (optionally) drops the original categorical variable.
%
% Works best when the variable contains chars or numeric/logical values
%
%
% INPUTS (** = optional)
% If you plan to omit an optional param, you must also omit all the params that follow it
%
% 1) dataset - (dataset) the actual dataset variable
%
% 2) variable_name - (char or number) name/column number of a categorical variable in dataset
%
% 3) category_values** - (cell vector of chars/numbers/logicals, or a vector
% of numbers/logicals) the category values of the variable (default=unique(dataset.variable))
% MUST ONLY INCLUDE CHARACTERS THAT ARE LEGAL IN VARIABLE NAMES,
% because each value becomes part of a new variable name
% e.g. no '?' or '!' or '%' in any category value
%
% 4) recoding_type** - (char) 'dummy' or 'effect' (case insensitive).
% Dummy creates 0,1 variables, with all 0's representing the Nth category,
% Effect creates -1,1 variables, with all -1's representing the Nth category
% default = 'dummy'
%
% 5) drop_original** - (logical) whether to drop the original un-recoded variable from the dataset (default=false)
%
% 6) separator** - (char) the char string to put inside the new varname,
% between the name of the original variable and the category value
% default = '_'
% e.g., by default, a dummy variable that represents the category 'T' in
% Var1 will be named 'Var1_T'
% MUST ONLY INCLUDE CHARACTERS THAT ARE LEGAL IN VARIABLE NAMES
% e.g. no '?' or '!' or '%' in the separator
%
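The naming rule for category_values and separator can be checked before calling the function. This is a minimal sketch, assuming the new names are formed as [variable_name separator category_value] (as in the 'Var1_T' example) and that numeric values are converted to text in the way num2str does. isvarname is a built-in MATLAB function; the variable names and values below are only illustrative.

```matlab
% Illustrative inputs (hypothetical values, not taken from the function itself)
variable_name   = 'c1';
separator       = '_';
category_values = {'a', 'b?', 3};

% Build each candidate name and test whether MATLAB accepts it as a variable name
is_legal = false(size(category_values));
for k = 1:numel(category_values)
    value = category_values{k};
    if isnumeric(value) || islogical(value)
        value = num2str(value);   % assumed conversion, e.g. 3 -> '3'
    end
    candidate   = [variable_name separator value];
    is_legal(k) = isvarname(candidate);
    fprintf('%-6s legal: %d\n', candidate, is_legal(k));
end
```

Here 'c1_a' and 'c1_3' pass the check, while 'c1_b?' fails because of the '?'.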
%
% EXAMPLES
%if dataset Exam_Data's variable c1 has 2 cats: 'a' and 'b',
%you would type:
% Exam_Data = categorical2bins(Exam_Data,'c1',{'a','b'});
%and the function would do the following:
% Exam_Data.c1_a = zeros(nrows,1);
% Exam_Data.c1_a(strcmp(Exam_Data.c1,'a'))=1;
%
%if 3 cats: 'a', 'b', and 'c' (listed in that order in category_values)
%you would type:
% Exam_Data = categorical2bins(Exam_Data,'c1',{'a','b','c'});
%and the function would do the following:
% Exam_Data.c1_a= zeros(nrows,1);
% Exam_Data.c1_a(strcmp(Exam_Data.c1,'a'))=1;
% Exam_Data.c1_b= zeros(nrows,1);
% Exam_Data.c1_b(strcmp(Exam_Data.c1,'b'))=1;
%
% Also works for numbers
% If 3 cats: 1,2, and 3 (doubles)...
%you would type:
% Exam_Data = categorical2bins(Exam_Data,'c1',{1,2,3});
% OR
% Exam_Data = categorical2bins(Exam_Data,'c1',[1,2,3]);
%and the function would do the following:
% Exam_Data.c1_1= zeros(nrows,1);
% Exam_Data.c1_1(Exam_Data.c1==1)=1;
% Exam_Data.c1_2= zeros(nrows,1);
% Exam_Data.c1_2(Exam_Data.c1==2)=1;
%
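The examples above all use dummy coding. The sketch below contrasts dummy and effect coding on plain arrays, without needing the function or a dataset. It assumes conventional effect coding, in which rows of the last category get -1 in every new variable and rows of the other non-matching categories get 0. The help text does not say how categorical2bins itself fills those non-matching rows, so treat this as an illustration rather than the function's exact output. The names c1, cats, dummy, and effect are illustrative.

```matlab
c1   = {'a'; 'b'; 'c'; 'a'};      % hypothetical categorical column
cats = {'a', 'b', 'c'};           % last entry ('c') is the Nth (reference) category

n_new = numel(cats) - 1;          % N-1 new variables
dummy = zeros(numel(c1), n_new);
for k = 1:n_new
    dummy(strcmp(c1, cats{k}), k) = 1;   % 1 where the row matches category k
end

effect = dummy;
effect(strcmp(c1, cats{end}), :) = -1;   % reference rows become all -1 (assumed convention)

disp(dummy);
disp(effect);
```

In both codings the row for 'c' is identified only by its pattern across the two new columns (all 0's or all -1's), which is why the last category needs no column of its own.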
Cite As
Brian Weidenbaum (2025). Recode Categorical Variable into New Binary Variables (https://ww2.mathworks.cn/matlabcentral/fileexchange/36042-recode-categorical-variable-into-new-binary-variables), MATLAB Central File Exchange.
Platform Compatibility
Windows macOS Linux
Version | Published | Release Notes | |
---|---|---|---|
1.0.0.0 |