How can I group data points together
I need to "bin" data points together into one number. This file has 70,000 lines of data points, in unevenly spaced, often repeating increments. For example, I need to average all the different numbers (2.419, 2.417, 2.405, etc...) with decimals into 2.000.
4 comments
Rik
2020-12-2
I need to "bin" data points together into one number. This file has 70,000 lines of data points, in unevenly spaced, often repeating increments. For example, I need to average all the different numbers (2.419, 2.417, 2.405, etc...) with decimals into 2.000.
Matthew Suddith
2020-12-2
Thank you! I meant to edit it, but I completely deleted it and couldn't figure out how to get it back.
Answers (1)
Cris LaPierre
2020-11-30
It doesn't sound like you want to average them. For the example you've given, why not just round the numbers down to 2? The functions round, ceil, floor and fix might be of interest to you.
vals = [2.419, 2.417, 2.405];
round(vals)
ans = 1×3
2 2 2
ceil(vals)
ans = 1×3
3 3 3
floor(vals)
ans = 1×3
2 2 2
fix(vals)
ans = 1×3
2 2 2
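The four functions only differ once the fractional part or the sign changes. As a small sketch (the sample values and the column names `byRound`, `byCeil`, `byFloor` and `byFix` are made up for illustration), the following puts them side by side:

```matlab
% Compare the four rounding functions on a few illustrative values.
% These values are invented for demonstration, not taken from the data file.
vals = [2.4, 2.5, 2.6, -2.5];

% One row per input value, one column per rounding method
T = table(vals', round(vals)', ceil(vals)', floor(vals)', fix(vals)', ...
    'VariableNames', {'value','byRound','byCeil','byFloor','byFix'})
```

For positive depths, floor and fix give the same result; they only differ for negative inputs, where fix rounds toward zero and floor rounds toward negative infinity.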
35 comments
Matthew Suddith
2020-11-30
The numbers are the depths at which the measurements were taken, and each has associated calculations in 30 columns. I need to average each different depth, and its associated calculations, into a single depth so that I have one value at each depth. So 2.4100 m, 2.400 m, etc. become 2.000 m.
Cris LaPierre
2020-11-30
Edited: Cris LaPierre
2020-11-30
For those of us not familiar with your data, how would we know what values to use?
Matthew Suddith
2020-11-30
What values do you need? I need a function to average each decimal depth into a single 2.000 depth.
Cris LaPierre
2020-11-30
I would think a better understanding of what your data looks like would greatly facilitate proposing a solution. You can attach a sample of your data using the paperclip icon.
Absent that, I would encourage you to look at the documentation for groupsummary. You can average your data by groups you specify.
Matthew Suddith
2020-11-30
Here is a small snapshot of the data file. The depths range to the 700s, then return all the way back to the starting depth
Cris LaPierre
2020-11-30
Edited: Cris LaPierre
2020-11-30
Ok, so for the solution I'm thinking about to work, I would need to create an "actual depth" column that would take your values and bin them to the corresponding actual value. Could you tell us what the actual depths should be? How much do the values in the file vary from the actual depths?
Any chance you can upload an actual text file of your data? I'm not feeling motivated enough to transcribe the values in the png. I believe the upload has to be less than 5 MB, so you can delete some rows if necessary. It would be nice to see at least a couple of the different depths in the file.
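To illustrate what an "actual depth" column could look like once the standard depths are known, here is a minimal sketch using discretize. It assumes, purely for illustration, that the standard depths are whole metres and that each recorded depth belongs to the nearest one; the sample depths, `centers` and `edges` are made-up values, not something confirmed in this thread.

```matlab
% Made-up recorded depths standing in for the first column of the file
depth = [1.988; 2.419; 2.417; 3.04; 6.811];

% Assumed standard depths (whole metres) and bin edges halfway between them
centers = 1:7;
edges   = 0.5:1:7.5;

% Map each recorded depth to the standard depth of the bin it falls in
stdDepth = discretize(depth, edges, centers)
```

Unlike round, this approach also works when the standard depths are unevenly spaced, because the edges can be placed anywhere.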
Matthew Suddith
2020-11-30
Here's the problem, the actual text file is 70,000 lines, so attached is a small sample of depths 1.988 to 6.811. So then, the final product for those depths would need to look like the finalproduct txt file
Matthew Suddith
2020-11-30
The txt file has that many points for each depth, all the way up to 750, then it repeats in descending order back to the first depth. It's a CTD file: it collects data at each depth as it is lowered into the ocean, then again on the way up.
Matthew Suddith
2020-11-30
Open this file instead for the sample, that previously linked file looks strange
Cris LaPierre
2020-11-30
Ok, so now it's just about developing an algorithm for turning recorded depth into standard depths. Since no guidance has been given on how to do that, I defer to my original answer. Here's some sample code that uses the round function.
data = readtable("sample.txt");
data.Properties.VariableNames(1) = "Depth";
% Create groups by rounding the depths to integer values
data.grpDepth = round(data.Depth);
newData = groupsummary(data,"grpDepth","mean")
newData = 6x31 table
grpDepth GroupCount mean_Depth mean_Var2 mean_Var3 mean_Var4 mean_Var5 mean_Var6 mean_Var7 mean_Var8 mean_Var9 mean_Var10 mean_Var11 mean_Var12 mean_Var13 mean_Var14 mean_Var15 mean_Var16 mean_Var17 mean_Var18 mean_Var19 mean_Var20 mean_Var21 mean_Var22 mean_Var23 mean_Var24 mean_Var25 mean_Var26 mean_Var27 mean_Var28 mean_Var29
________ __________ __________ _________ _________ _________ _________ _________ _________ _________ _________ __________ __________ __________ __________ __________ __________ ___________ ___________ __________ __________ __________ __________ __________ __________ __________ __________ __________ __________ __________ ___________
2 38 2.0818 32.252 14.397 0.13886 1.014 5.9752 260.6 101.96 5.8557 255.39 213.7 0.00020461 2.115 32.252 14.397 6.5789e-05 -0.00016055 246.76 39337 37267 -0.10753 1024 23.982 14.397 4.4036 0.22415 0.19865 3.0578 -9.99e-29
3 31 3.0398 32.252 14.397 0.13741 1.0094 5.9737 260.54 101.94 5.8557 255.39 152.62 0.0002013 3.0614 32.252 14.397 -0.00012905 -0.00024194 246.76 39338 38069 -0.35504 1024 23.982 14.397 4.4052 0.2218 0.19815 2.9156 -3.2226e-30
4 98 3.8058 32.252 14.398 0.13692 1.052 5.9826 260.93 102.09 5.8556 255.39 131.18 0.00019675 3.8405 32.252 14.398 7.3469e-05 -0.00035509 246.76 39339 32514 -0.31612 1024 23.982 14.397 4.4057 0.21853 0.20287 2.8509 -6.3202e-29
5 32 5.0267 32.252 14.397 0.13738 1.0955 5.99 261.25 102.22 5.8556 255.39 110.7 0.00019822 5.0663 32.252 14.398 0.0004 -0.0004125 246.76 39339 35651 -0.09156 1024 23.982 14.397 4.4052 0.21957 0.20769 2.7771 0
6 76 6.0728 32.252 14.397 0.13706 1.1068 5.9849 261.03 102.13 5.8556 255.39 91.28 0.00019676 6.1242 32.252 14.397 0.00010393 -0.00041053 246.76 30540 36233 -0.35731 1024 23.982 14.396 4.4056 0.21853 0.20898 2.6938 -3.1547e-29
7 13 6.6319 32.252 14.367 0.13656 1.0951 5.9974 261.57 102.34 5.8556 255.39 89.053 0.00019123 6.6817 32.252 14.397 0.00018333 -0.0004166 246.76 39340 39340 -0.28558 1024 23.982 14.396 4.4061 0.21462 0.20768 2.6834 0
Matthew Suddith
2020-11-30
Thank you so much, that looks like it could work for what I need to do! But I get this error message: "Error using round
First argument must be a numeric, logical, or char array."
Matthew Suddith
2020-11-30
Also, when this creates a table, would I be able to turn that table back into a txt file?
Cris LaPierre
2020-11-30
This likely means at least one of your depth values contains a value that cannot be rounded. Does one of the rows contain a non-numeric value? Does it work with a subset of the actual data?
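One way to check this is to look at the class of the depth column and find the rows that do not convert to numbers. The sketch below uses a made-up cell array `Depth` as a stand-in for a column that was imported as text; the real column may look different.

```matlab
% Hypothetical stand-in for a Depth column imported as text because
% a header line was read as if it were data.
Depth = {'* header text'; '1.988'; '2.419'};

isNum = isnumeric(Depth)          % false here, which is why round errors

% Convert text to numbers; entries that are not numbers become NaN
numDepth = str2double(Depth);

% Row indices that could not be converted
badRows = find(isnan(numDepth))
```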
Matthew Suddith
2020-11-30
Ok, yes, the problem is that the actual data file has a header on top that isn't numbers. So I need to make it read the file beginning at a certain line: it needs to start at line 365.
Cris LaPierre
2020-12-1
Do any of those header lines contain variable names identifying what is in each column?
Cris LaPierre
2020-12-1
There are 10 rows unaccounted for. This header has 354 rows, but you mentioned the numbers start at row 365. Are there blank rows between the header and the first row of numbers? Could you attach a subset of your data file containing the first 1000 rows including the header?
Cris LaPierre
2020-12-1
Edited: Cris LaPierre
2020-12-1
There are numerous ways to do this, but probably the easiest to understand is this.
data = readtable("first1000.txt","NumHeaderLines",354,...
    "MultipleDelimsAsOne",true,"LeadingDelimitersRule","ignore");
data.Properties.VariableNames(1) = "Depth";
% Create groups by rounding the depths to integer values
data.grpDepth = round(data.Depth);
newData = groupsummary(data,"grpDepth","mean")
Matthew Suddith
2020-12-1
That worked with the complete file. I hate to keep asking you questions, but how would I then change the name of each mean_variable# column header to the variable it is supposed to be?
Cris LaPierre
2020-12-1
This has to be done manually. Using the names I see in the header, something like this at the end should do it.
newData.Properties.VariableNames(3:end) = ["depSM","sal00","t090C","CStarAt0",...
"flECO-AFL","sbeox0ML/L","sbox0Mm/Kg","sbeox0PS","oxsatML/L","oxsatMm/Kg",...
"par","turbWETbb0","prDM","sal11","t190C","T2-T190C","secS-priS","timeJ",...
"c0uS/cm","c1uS/cm","C2-C1uS/cm","density00","sigma-theta00","potemp090C",...
"v4","v3","v2","v5","flag"]
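A possible alternative, sketched here under the assumption that the real names are assigned to the table before grouping, is to let groupsummary build the `mean_` names and then strip the prefix with erase. The small table below is made up for illustration and only borrows two of the names listed above.

```matlab
% Small made-up table standing in for the real data
data = table([2.1; 2.4; 3.2], [32.25; 32.26; 32.25], [14.39; 14.40; 14.40], ...
    'VariableNames', {'Depth','sal00','t090C'});

% Group by rounded depth and average every other column
data.grpDepth = round(data.Depth);
newData = groupsummary(data, "grpDepth", "mean");

% Remove the "mean_" prefix that groupsummary adds to each averaged column
newData.Properties.VariableNames = ...
    erase(newData.Properties.VariableNames, "mean_")
```

This avoids keeping a second hand-typed list of names in sync with the column order, at the cost of naming the columns once up front.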
Cris LaPierre
2020-12-1
WRT the questions asked here
I really don't have enough information to answer that question. What file are you comparing it to? How were the data grouped and averaged in that file?
You could try using a method other than round.
I'm not sure how using a for/while loop helps you here.
Matthew Suddith
2020-12-1
The file that I'm comparing it to is the "binned" version of the file I showed you, the one with the 70,000 lines. But I don't know how it was grouped and averaged. I'm happy with the rounding method you showed me, I really appreciate your help. I may continue posting a few more tiny questions, but I don't expect you to answer all night.
Cris LaPierre
2020-12-1
Edited: Cris LaPierre
2020-12-1
Try using fix (assigns 2-2.9 a value of 2) or ceil (assigns 2.01-3 a value of 3) instead of round. It's a simple change to make to the code, and together with round these are the 3 most likely methods used.
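As a small sketch of that change, the grouping function can be kept in a function handle so that only one line needs editing. The depths and the name `binFcn` are made up for illustration; in the code above the equivalent line would be `data.grpDepth = fix(data.Depth);`.

```matlab
% Made-up depths for illustration
depth = [2.0; 2.4; 2.9; 3.0];

% Choose the grouping method here: @round, @ceil, @floor or @fix
binFcn = @fix;

% Apply the chosen method to get the group label for each row
grpDepth = binFcn(depth)
```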
Matthew Suddith
2020-12-1
So replacing round(data.Depth) with one of those gives me an error on the = both ways.
Cris LaPierre
2020-12-1
Round, ceil, fix and floor all worked for me. I think you have a syntax error. I suggest reading the documentation I linked to previously to see how to use them. You should just have to replace "round" in the current code with "ceil", for example.
Yes, it is possible to use writetable to save a table to a text file. Again, read the documentation I linked to previously to see how to do it.
Matthew Suddith
2020-12-1
Is there a way to format the output of writetable? The txt file it outputs is a big jumbled mess.
Cris LaPierre
2020-12-1
It looks like by default it writes a csv file. You could look at the name-value pairs for what options are available.
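For example, the 'Delimiter' name-value pair switches the separator from commas to tabs, which tends to line up better in a text editor. This is only a sketch: the table contents and the file name `binned_example.txt` are made up, and the file is written to the temporary folder.

```matlab
% Small made-up table standing in for the grouped results
T = table([2; 3], [32.252; 32.252], 'VariableNames', {'grpDepth','sal00'});

% Hypothetical output file in the system temporary folder
outFile = fullfile(tempdir, 'binned_example.txt');

% Write tab-separated values instead of the default comma-separated values
writetable(T, outFile, 'Delimiter', '\t');

% Show the file contents in the Command Window
type(outFile)
```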
Matthew Suddith
2020-12-2
That worked. How could I plot my rounded data and the original data in one plot for a direct comparison? Is that doable?
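One way this could be done is with hold on, drawing the original points and the grouped means on the same axes. The sketch below uses a small made-up table with a single measurement column named `t090C`; with the real data, `data` and `newData` from the code earlier in the thread would take its place.

```matlab
% Made-up depth and temperature values for illustration
depth = [1.99; 2.42; 2.41; 3.04; 3.10];
temp  = [14.39; 14.40; 14.41; 14.38; 14.37];
data  = table(depth, temp, 'VariableNames', {'Depth','t090C'});

% Group by rounded depth and average, as in the earlier code
data.grpDepth = round(data.Depth);
newData = groupsummary(data, "grpDepth", "mean");

% Original points as dots, grouped means as connected circles
plot(data.t090C, data.Depth, '.');
hold on
plot(newData.mean_t090C, newData.grpDepth, '-o');
hold off

% Depth increases downward, as is usual for profile plots
set(gca, 'YDir', 'reverse');
xlabel('t090C'); ylabel('Depth (m)');
legend('original', 'binned (round)');
```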