Creating Unique ID number for repeated individuals in a dataset (i.e. Panel data id)

Hi,
I'm trying to create an ID number that is unique to each individual's name. Some individuals appear two, three, or four times in the sample, once for each year in which they were observed. In other words, this is panel data that does not yet have an identifier for each panel unit.
So I have N individuals with T time periods, where T ranges from 1 to 4 for each individual. I need a unique ID number for each individual, with that ID simply repeating across all of the individual's T time periods.
The names are string values. For example, see the attached photo.

Answers (1)

Walter Roberson 2017-7-23
You can use the three-output version of unique() on the cell array of character vectors containing the names (or on the array of string objects, if you happen to have one of those instead). The third output is, for each row, the index of that row's name in the first output, so it serves as the corresponding unique ID.
In some cases, someone with a similar need might want to use categorical arrays; with your relatively low number of repetitions, I think unique() is better suited to your particular purpose.
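As a minimal sketch of the unique() approach described above: the sample names below are hypothetical (the original data was only shown in an attachment), and the variable names `names`, `id`, and `idStable` are chosen here for illustration.

```matlab
% Hypothetical sample data (not from the original post): one row per
% individual-year observation, so names repeat across years.
names = {'Smith, John'; 'Doe, Jane'; 'Smith, John'; ...
         'Lee, Ann'; 'Doe, Jane'; 'Smith, John'};

% Three-output form of unique(): uniqueNames is the sorted list of distinct
% names, and id(k) is the position of names{k} within uniqueNames.
[uniqueNames, ~, id] = unique(names);
% id is [3; 1; 3; 2; 1; 3]

% Optional: number individuals in order of first appearance rather than
% alphabetically by passing the 'stable' flag.
[~, ~, idStable] = unique(names, 'stable');
% idStable is [1; 2; 1; 3; 2; 1]

% Sanity check: indexing the unique list by the ID recovers the original names.
assert(isequal(uniqueNames(id), names));
```

Either vector can then be stored alongside the data as the panel identifier. The IDs depend on exact string matches, so names that differ in spelling, case, or whitespace would receive different IDs.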
