How to separate table columns by groups?

If I have a table of data (T1) such as:
Var1 Var2
1 xyz
1 xyy
2 xxx
2 xzx
3 xyy
4 yzy
4 xzz
4 yzz
I would like to output another table (T2):
Var1 Var2
1 xyz,xyy
2 xxx,xzx
3 xyy
4 yzy,xzz,yzz
I tried unstack without success, since it turns the values in Var2 into column names. Any suggestions would be much appreciated.
  3 Comments
Denis Pesacreta 2018-6-20
No, it is not. It is categorical, of the form 'xyz'.
Denis Pesacreta 2018-6-20
Var1 is the output of a findgroups command, and Var2 is categorical. For example, Var1 could be the output of findgroups on an ID variable and Var2 the possessions (for instance, person number 1 has a car and a house, while person number 4 may have a car, a house, and a computer).


Accepted Answer

Guillaume 2018-6-20
If Var2 is a categorical array, then I assume that your xyz,xyy is actually a 1x2 categorical array [xyz, xyy], not a char array. If so:
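% Sample data: Var1 holds the group numbers, Var2 is categorical
T1 = table([1;1;2;2;3;4;4;4], categorical({'xyz';'xyy';'xxx';'xzx';'xyy';'yzy';'xzz';'yzz'}))
% For each group in Var1, collect the Var2 values into a 1xN categorical row vector wrapped in a cell
varfun(@(values) {values'}, T1, 'GroupingVariables', 'Var1')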
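If the goal is literally the comma-separated text shown in T2 (a char vector such as 'xyz,xyy' per group) rather than a categorical vector per group, one possible variant is sketched below. It is not part of the original answer: it assumes the text form is acceptable, and the names G, ids, joinGroup and joined are illustrative only.

```matlab
% Same sample data as in the answer above
T1 = table([1;1;2;2;3;4;4;4], ...
    categorical({'xyz';'xyy';'xxx';'xzx';'xyy';'yzy';'xzz';'yzz'}));

% Group numbers and the unique group values of Var1
[G, ids] = findgroups(T1.Var1);

% For one group: convert the categorical values to text and join them with commas.
% The result is wrapped in a cell so that each group yields exactly one element.
joinGroup = @(v) {strjoin(cellstr(v)', ',')};

% Apply the joining function to each group of Var2
joined = splitapply(joinGroup, T1.Var2, G);

% Assemble the result; Var2 is now a cell array of char vectors
T2 = table(ids, joined, 'VariableNames', {'Var1', 'Var2'})
```

Note that the joined text is no longer categorical, so this form suits display or export better than further categorical processing.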
  3 Comments
Guillaume 2018-6-20
No, that's the built-in disp of a table; you can't change that. Tables are not really designed to hold arrays in columns. They are designed more for scalar values in each column.
Peter Perkins 2018-7-3
Guillaume, you are correct that it's sometimes difficult to display multi-column variables in a table, but tables absolutely are designed to support them (and in fact it's the reason why "variables" in tables are called "variables", not "columns" in the doc). The display works as you would expect in cases with only a few columns in each variable.
>> table(rand(3,2),rand(3,5))
ans =
  3×2 table
           Var1                                  Var2
    __________________    _______________________________________________________
    0.15761    0.48538    0.42176     0.95949    0.84913    0.75774    0.65548
    0.97059    0.80028    0.91574     0.65574    0.93399    0.74313    0.17119
    0.95717    0.14189    0.79221    0.035712    0.67874    0.39223    0.70605
Actually, in your solution, the second variable in the result from varfun is in fact a cell array, each cell containing a categorical vector. There's nothing at all wrong with that, it's completely supported by tables, and in fact if the different groups have different numbers of rows in the original table it's pretty much necessary. But you are right that the display is not as informative as it could be in some cases.
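To make the point about the cell array concrete, here is a small sketch of how the grouped result could be inspected. It assumes the sample data from the accepted answer and relies on varfun's default output names (GroupCount and Fun_Var2); the variables R, lastGroup and groupSizes are illustrative only.

```matlab
% Sample data and grouping call from the accepted answer
T1 = table([1;1;2;2;3;4;4;4], ...
    categorical({'xyz';'xyy';'xxx';'xzx';'xyy';'yzy';'xzz';'yzz'}));
R = varfun(@(values) {values'}, T1, 'GroupingVariables', 'Var1');

% The grouped variable is a cell array with one cell per group
class(R.Fun_Var2)

% Brace indexing pulls out the categorical row vector for a single group
lastGroup = R.Fun_Var2{4}

% The number of values in each cell matches the group counts reported by varfun
groupSizes = cellfun(@numel, R.Fun_Var2);
isequal(groupSizes, R.GroupCount)
```

Because the groups have different sizes (2, 2, 1 and 3 values), the cell wrapper is what allows them to sit in a single table variable.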
