how to select data based on month

I have thousand of data points (Nx2, col 1=date and col 2=data)for a year for different dates for different sites. I want to make an average for the data for each month. My data are in excel and and date format is yyyymmdd (attached image).I easily can sort data suing sortrows and then I can calculate the average as below. But this is not an efficient method.Looking for an efficient method. Thanks in advance. M1=mean(data(1:30,2)); M2=mean(data(31:69, 2)); M3=mean(data(70:112, 2)); ..............

 采纳的回答

Here is what you should do:
I have assumed that you store those two columns in a csv file as follow:
% reading the CSV File
data=csvread('sample.csv');
% extracting data into a MATLAB Table variable
T=table();
T.year=floor(data(:,1)/10000);
T.mm=floor((data(:,1)-T.year*10000)/100);
T.dd=data(:,1)-T.year*10000-T.mm*100;
T.value=data(:,2);
Now To get yearly average:
grpstats(T,'year','mean','DataVars',{'value'})
ans =
year GroupCount mean_value
____ __________ __________
2010 2010 365 0.49589
2011 2011 365 0.51351
2012 2012 366 0.50263
2013 2013 365 0.52623
2014 2014 365 0.51586
2015 2015 360 0.5009
To get monthly average
grpstats(T,{'year','mm'},'mean','DataVars',{'value'})
ans =
year mm GroupCount mean_value
____ __ __________ __________
2010_1 2010 1 31 0.5204
2010_2 2010 2 28 0.57318
2010_3 2010 3 31 0.494
2010_4 2010 4 30 0.41664
2010_5 2010 5 31 0.49894
2010_6 2010 6 30 0.47679
2010_7 2010 7 31 0.46417
// I have chopped rest of the results.
I have attached the sample.csv for you to compare the format of your data set to this one. This data is randomly generated.
for more information on the command refer to grpstat()
P.S. please include some data next time.

7 个评论

Thanks Mohammad. It is working perfectly.
Hi Mohammad, Now I have to subtract this average for each month from the corresponding date of each month. For example, I have to subtract the average of January from all January data, average of February from all February data and so on. I appreciate your help in this regard. Thank you so much.
the easiest is to use join() to repopulate mean to each year,month combination, (use both year and month as the key for joining) then simply type T.value - T.mean_value (change the column names accoridingly. Later you can delete the monthly mean column from T.
Join function is very useful to do that. Thanks Mohammad.
Hi Mohammed, Great answer, works very well! Is it possible to average more than just one column of data? If so how is this possible? Many thanks!
Hi, Stephanie,
Yes it is possible. You can list all the data columns that you want as DataVars. something like this:
grpstats(T,{'year','mm'},'mean','DataVars',{'value','Column2', 'column3'})
where column2 and column3 should be the name of the other data columns that you want.
If you want all columns then simply don't provide any 'DataVars' option; so something like:
grpstats(T,{'year','mm'},'mean')
If you use this, it automatically calculates the mean for all data columns.

请先登录,再进行评论。

更多回答(1 个)

Or with a time vector t in datenum format, you can use downsample_ts. Syntax would be:
monthlymean_x = downsample_ts(x,t,'monthly','mean');

类别

帮助中心File Exchange 中查找有关 Calendar 的更多信息

标签

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by