aggregate data of a dataset
Accepted Answer
Sindar
2020-2-16
Check out splitapply. You may need to change the format of your data, but it does exactly what you want:
G = findgroups(ds.Seat);
mean_dist = splitapply(@mean,ds.score,G);
Switching to tables is probably a good idea:
ds = readtable("datasetT.csv");
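As a rough, self-contained sketch of the findgroups/splitapply pattern above: the table below is made-up toy data, and the column names Seat and score are assumptions borrowed from later in the thread, not the actual contents of datasetT.csv.

```matlab
% Hypothetical toy data: column names follow the thread, values are invented.
Seat  = [1; 2; 1; 2; 1];
score = [10; 20; 30; 40; 20];
T = table(Seat, score);

% findgroups assigns one group number per distinct Seat value.
G = findgroups(T.Seat);

% splitapply calls mean once per group and stacks the results in a column.
mean_score = splitapply(@mean, T.score, G);
disp(mean_score)
```

Each row of mean_score corresponds to one distinct Seat value, in the sorted order that findgroups uses for its groups.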
17 Comments
Sindar
2020-2-16
I'm not familiar with R, but (based on a little googling of R's aggregate function) it looks like splitapply does basically the same thing, just with a little less in the way of wrapping. Look at the documentation for examples.
Megan
2020-2-16
Okay, I used dataset2table and that worked out. Now I have a table, but splitapply still didn't work.
Do you know why? At least now I know it's not because of the dataset type.
Sindar
2020-2-16
Most likely, you have NaNs in your data. It sounds like you'll need to do some extra work (but this will help in the future). First, try using the import tool: https://www.mathworks.com/help/matlab/ref/importtool-app.html
This should allow you to figure out why readtable isn't working. Once everything looks good, you can generate code using the arrow just under "Import Selection".
Then, look here for how to handle missing data (which is what produced those NaNs). Some of it can be handled during import, too. https://www.mathworks.com/help/matlab/data_analysis/missing-data-in-matlab.html
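To check whether missing values really are the issue, something along these lines may help. It is only a sketch on made-up data (the Seat and score names are assumptions), showing how to count missing entries per variable and how a single NaN turns a group's plain mean into NaN.

```matlab
% Hypothetical toy table with one missing score (values are invented).
Seat  = [1; 1; 2; 2];
score = [10; NaN; 30; 50];
T = table(Seat, score);

% ismissing flags missing entries; summing down each column counts them.
n_missing = sum(ismissing(T), 1);   % one count per table variable

% A single NaN makes the plain mean of its group NaN.
G = findgroups(T.Seat);
group_means = splitapply(@mean, T.score, G);
disp(n_missing)
disp(group_means)
```

If the counts are nonzero, either clean the data first or use a function that skips NaNs, for example @(x) mean(x,'omitnan') in place of @mean.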
Megan
2020-2-16
fillmissing(ds,'constant',0)
This is not working.
Error using fillmissing/checkArrayType (line 522)
Invalid fill constant type.
Error in fillmissing/fillTableVar (line 166)
[intConstVj,extMethodVj] = checkArrayType(Avj,intMethod,intConstVj,extMethodVj,x,true);
Error in fillmissing/fillTable (line 144)
B.(vj) = fillTableVar(indVj,A.(vj),intMethod,intConst,extMethod,x,useJthFillConstant,useJthExtrapConstant);
Error in fillmissing (line 127)
B = fillTable(A,intM,intConstOrWinSize,extM,x,dataVars);
Sindar
2020-2-16
Sorry, I haven't actually used fillmissing much, so I'm not sure what's up. Regardless, I realized removing rows with missing entries is probably better for your purpose:
ds=readtable('datasetT.xlsx');
clean_ds = rmmissing(ds);
G = findgroups(clean_ds.Seat);
mean_dist = splitapply(@mean,clean_ds.score,G);
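As for the fillmissing error earlier in the thread, one plausible cause (an assumption, since the file's contents aren't shown) is that the table contains a non-numeric variable, such as text, for which a numeric constant like 0 is not a valid fill value. A sketch on made-up data, restricting the fill to numeric variables with the 'DataVariables' option:

```matlab
% Hypothetical toy table mixing a text column with a numeric one.
Name  = {'a'; 'b'; 'c'};
score = [1; NaN; 3];
T = table(Name, score);

% Apply the numeric fill constant only to the numeric variables.
T_filled = fillmissing(T, 'constant', 0, 'DataVariables', @isnumeric);
disp(T_filled)
```

Keep in mind that filling with 0 pulls the group means toward zero, which is one reason removing incomplete rows with rmmissing may suit this task better.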
Megan
2020-2-16
That worked out well. Thanks!
One last question: now I have two rows with mean values.
How can I know which row is which seat number?
Sindar
2020-2-16
Look at the second output from findgroups:
[G,G_seat] = findgroups(clean_ds.Seat);
At the end, you can make a summary table:
sum_table = table(G_seat,mean_dist)
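Putting the pieces together on made-up data (the Seat and score names follow the thread, the values are invented), a minimal end-to-end sketch could look like this. The last line uses groupsummary, available in R2018a and later, as a possible one-call alternative; it is not something the thread itself relies on.

```matlab
% Hypothetical toy data; column names Seat and score follow the thread.
Seat  = [2; 1; 2; 1];
score = [4; 1; 6; 3];
clean_ds = table(Seat, score);

% The second output lists the Seat value behind each group number.
[G, G_seat] = findgroups(clean_ds.Seat);
mean_dist = splitapply(@mean, clean_ds.score, G);

% Row k of the summary pairs G_seat(k) with the mean of that group.
sum_table = table(G_seat, mean_dist);

% groupsummary builds a similar table (plus a GroupCount column) in one call.
alt_table = groupsummary(clean_ds, 'Seat', 'mean', 'score');
disp(sum_table)
disp(alt_table)
```

In sum_table, each row pairs a seat number with its mean, which answers the "which row is which seat" question directly.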