Index and select rows from table

Question

A table contains both numerical and categorical columns (see attachment).

How can I get multiple subtables based on the categories of a column (e.g., col5)?

Expected output: 4 different tables (table1 includes rows with "h_001" in col5 ... table4 includes rows with "r_041" in col5).

I could not use the following option when the columns have categorical data:

https://ch.mathworks.com/matlabcentral/answers/481574-how-to-get-the-index-of-a-value-in-a-table

Comments (9 in total; the 2 most recent are shown)

Stephen23 2023-11-21

Edited: Stephen23 2023-11-21

"I don't see how ismember() would be more efficient. It would require at least 2 lines per categorie"

Whatever way you do it, processing every category individually will be inefficient (in terms of your time writing and/or runtime).

That is why I suggested ISMEMBER, so that you can do all categories at once. Three lines of code, done.

Do not store each category individually. That is not how MATLAB works: you need to learn how to use vectors, matrices, and arrays. Start by placing all of the categories into one array (e.g. a string array), then use one ISMEMBER call. Read the ISMEMBER documentation carefully.
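A minimal sketch of the ISMEMBER approach, assuming a small made-up table because the original attachment is not available here. The names `T`, `col1`, `wanted`, and `sub`, and the category values other than "h_001" and "r_041", are hypothetical. All categories of interest go into one string array, and a single ISMEMBER call returns a row mask plus, for each row, the position of its category in that array:

```matlab
% Hypothetical stand-in for the attached table
col1 = (1:6)';
col5 = categorical(["h_001"; "r_041"; "h_001"; "h_002"; "r_041"; "r_040"]);
T = table(col1, col5);

wanted = ["h_001" "r_041"];            % all categories of interest in one array
[tf, loc] = ismember(T.col5, wanted);  % tf: row matches; loc: index into wanted (0 = no match)

% One subtable per requested category, kept together in a cell array
sub = arrayfun(@(k) T(loc == k, :), 1:numel(wanted), 'UniformOutput', false);
```

Here `T(tf,:)` holds every matching row and `sub{k}` holds the rows for `wanted(k)`, so the subtables live in one cell array rather than in separately numbered variables.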

Another option would be to use one of the JOIN family.
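As a rough illustration of the join idea, assuming the same kind of made-up table (`T`, `col1`, `keys`, and the extra category values are hypothetical, not taken from the attachment): a one-column key table lists the categories to keep, and `innerjoin` returns only the rows of `T` whose `col5` value appears in it.

```matlab
% Hypothetical stand-in for the attached table
col1 = (1:6)';
col5 = categorical(["h_001"; "r_041"; "h_001"; "h_002"; "r_041"; "r_040"]);
T = table(col1, col5);

% Key table with the categories to keep, built on the same category set as T.col5
keys = table(categorical(["h_001"; "r_041"], categories(T.col5)), VariableNames="col5");

% col5 is the only shared variable, so it is used as the join key
sub = innerjoin(T, keys);
```

Note that `innerjoin` sorts its output by the key values, so the rows of `sub` may not be in the same order as in `T`.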

Cris LaPierre 2023-11-21

Do not confuse the number of lines of code with efficiency.

Answer 1

Peter Perkins 2023-11-27

"I need to split the original data based on the categories of col5."

You probably do not want/need to do that. Take a look at the rowfun function. Write a function to do what you want with each subset of your data, then use rowfun to apply that function based on groups defined by col5.

t = table([1;1;1;2;2;3;3;3],rand(8,1),rand(8,1),VariableNames=["G" "X" "Y"])
t = 8×3 table
    G       X          Y   
    _    _______    _______

    1    0.50908    0.09642
    1    0.46628     0.9575
    1    0.99625    0.40089
    2    0.63605    0.70043
    2    0.81209    0.49475
    3    0.38157    0.51853
    3    0.74877    0.66036
    3    0.85878    0.91136
myFun = @(x,y) mean(x) - mean(y);
rowfun(myFun,t,GroupingVariables="G")
ans = 3×3 table
    G    GroupCount      Var3   
    _    __________    _________

    1        3           0.17227
    2        2           0.12648
    3        3         -0.033704

Lots of other functions, like groupsummary, similarly do not require you to split your data up. As others have said, that's usually a bad idea and unnecessary.
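For the case in the question, where the grouping column is categorical, a sketch with `groupsummary` might look like the following. The table `T`, the variable `col1`, and the category values other than "h_001" and "r_041" are made up for illustration, since the attachment is not available here.

```matlab
% Hypothetical stand-in for the attached table
col1 = (1:6)';
col5 = categorical(["h_001"; "r_041"; "h_001"; "h_002"; "r_041"; "r_040"]);
T = table(col1, col5);

% One row per category of col5, with the group size and the mean of col1
G = groupsummary(T, "col5", "mean", "col1");
```

The result `G` has the variables `col5`, `GroupCount`, and `mean_col1`, one row per category, without any subtable being created.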
