Recode Categorical Variable into New Binary Variables

Version 1.0.0.0 (3.3 KB) by Brian Weidenbaum
Adds N-1 binary (effect or dummy coded) variables based on a categorical variable to your dataset.
288 downloads
Updated 2012/4/5


% OUTPUT
% Returns your dataset with N-1 binary variables recoded from a
% categorical variable that has N categories.
% The Nth category is not given its own variable; it is represented by
% all 0's or all -1's across the new variables, depending on recoding_type.
% So list the least important category last in the category_values parameter.
% Optionally drops the original categorical variable as well.
%
% Works best when the variable contains chars or numeric/logical values.
%
%
% INPUTS (** = optional)
% If you plan to omit an optional param, you must also omit all the params that follow it
%
% 1) dataset - (dataset) the actual dataset variable
%
% 2) variable_name - (char or number) name/column number of a categorical variable in dataset
%
% 3) category_values** - (cell vector of chars/numbers/logicals, or a vector
% of numbers/logicals) the categories in the variable (default=unique(dataset.variable))
% MUST ONLY INCLUDE CHARACTERS THAT ARE LEGAL IN VARIABLE NAMES
% e.g. no '?' or '!' or '%' in any category value
%
% 4) recoding_type** - (char) 'dummy' or 'effect' (case insensitive).
% Dummy creates 0,1 variables, with all 0's representing the Nth category.
% Effect creates -1,1 variables, with all -1's representing the Nth category.
% default = 'dummy'
%
% 5) drop_original** - (logical) whether to drop the original un-recoded variable from the dataset (default=false)
%
% 6) separator** - (char) the char string to put inside the new varname,
% between the name of the original variable and the category value
% default = '_'
% e.g., by default, a dummy variable that represents the category 'T' in
% Var1 will be named 'Var1_T'
% MUST ONLY INCLUDE CHARACTERS THAT ARE LEGAL IN VARIABLE NAMES
% e.g. no '?' or '!' or '%' in the separator
%
%
% EXAMPLES
% If dataset Exam_Data's variable c1 has 2 categories, 'a' and 'b',
% you would type:
% Exam_Data = categorical2bins(Exam_Data,'c1',{'a','b'});
% and the function would do the following:
% Exam_Data.c1_a = zeros(nrows,1);
% Exam_Data.c1_a(strcmp(Exam_Data.c1,'a')) = 1;
%
% If c1 has 3 categories, 'a', 'b', and 'c' (listed in that order in category_values),
% you would type:
% Exam_Data = categorical2bins(Exam_Data,'c1',{'a','b','c'});
% and the function would do the following:
% Exam_Data.c1_a = zeros(nrows,1);
% Exam_Data.c1_a(strcmp(Exam_Data.c1,'a')) = 1;
% Exam_Data.c1_b = zeros(nrows,1);
% Exam_Data.c1_b(strcmp(Exam_Data.c1,'b')) = 1;
%
% Also works for numbers.
% If c1 has 3 categories, 1, 2, and 3 (doubles),
% you would type:
% Exam_Data = categorical2bins(Exam_Data,'c1',{1,2,3});
% OR
% Exam_Data = categorical2bins(Exam_Data,'c1',[1,2,3]);
% and the function would do the following:
% Exam_Data.c1_1 = zeros(nrows,1);
% Exam_Data.c1_1(Exam_Data.c1==1) = 1;
% Exam_Data.c1_2 = zeros(nrows,1);
% Exam_Data.c1_2(Exam_Data.c1==2) = 1;
%
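
The recoding described in the help text can be sketched without the function itself. The snippet below is only an illustration and not the author's implementation: it uses a plain cell array (the hypothetical variable `c1`) in place of a dataset column, and the names `dummy` and `effect` are made up for this example. It follows the help text's description of effect coding, where every 0 of the dummy coding becomes -1.

```matlab
% Hypothetical stand-in data: a plain cell array instead of a dataset column.
c1 = {'a'; 'c'; 'b'; 'a'};
category_values = {'a', 'b', 'c'};   % 'c' is listed last, so it gets no column

n_new = numel(category_values) - 1;  % N-1 new variables
dummy = zeros(numel(c1), n_new);     % one column per listed category except the last

for k = 1:n_new
    % Mark the rows that belong to the k-th category with a 1.
    dummy(strcmp(c1, category_values{k}), k) = 1;
end

% Effect coding as the help text describes it: -1 wherever dummy coding has 0.
effect = 2*dummy - 1;

disp(dummy)
disp(effect)
```

Rows belonging to the last category ('c' here) come out as all 0's in `dummy` and all -1's in `effect`, which is why the help text recommends listing the least important category last.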

Cite As

Brian Weidenbaum (2025). Recode Categorical Variable into New Binary Variables (https://ww2.mathworks.cn/matlabcentral/fileexchange/36042-recode-categorical-variable-into-new-binary-variables), MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with R2011b
Compatible with any release
Platform Compatibility
Windows macOS Linux
Categories
Find more on Categorical Arrays in Help Center and MATLAB Answers

Version Published Release Notes
1.0.0.0