Average data through the date

Thaís Fernandes 2019-1-26
Answered: Rajanya 2024-11-19, 11:28
Hello.
I have a vector of dates (serial date numbers): jd=[734530.22333; 734550.90236; 734530.22333; 732437.11034; 732447.30517; 731886.80703; 731896.72450; 733920.88758];
Converted with the 'datevec' command, the dates are: jd1=[ 2011 1 27; 2011 2 16; 2011 1 27; 2005 5 5; 2005 5 15; 2003 11 1; 2003 11 11; 2009 5 27];
I also have the temperature matrix, with one row per date: tem =[ 25.177 26.624; 25.177 26.624; 25.199 26.544; 22.464 22.424 ; 22.464 22.424; 21.197 23.232; 21.197 23.232; 17.442 15.325];
I would like to average the temperature data by date (year, month and day). In this example that means averaging rows 1 and 3, which are the ones with a repeated date, giving something like:
med =[ 25.188 26.584; 25.177 26.624; 22.464 22.424 ; 22.464 22.424; 21.197 23.232; 21.197 23.232; 17.442 15.325];
I looped over year, month and day, but my averages only give NaN.
Can someone help me sort this out or point me to a better solution?
Thanks.
% my code
[yy,mm,dd]=datevec(jd);
for i=2002:2017 % loop in year
    iyy2=find(yy==i);
    tem2=tem(:,iyy2); % because it has an array
    mm2=mm(iyy2);
    dd2=dd(iyy2);
    for j=1:12 % loop in month
        imm2=find(mm2==j);
        tem3=tem2(:,imm2);
        dd3=dd2(imm2);
        for k=1:31 % loop in day
            idd2=find(dd3==k);
            tem4=tem3(idd2);
            if i==2002 && j==1 && k==1
                A=[nanmean(tem4)];
            else
                A=[A; nanmean(tem4)];
            end
            med=A;
        end
    end
end

Answers (1)

Rajanya 2024-11-19,11:28
From the example provided, it appears you want to calculate average temperatures for specific dates by grouping the temperatures by their respective dates, resulting in a two-column matrix ‘med’.
As posted, the code stops with an index-out-of-bounds error at the line:
tem2=tem(:,iyy2);
This error occurs because ‘iyy2’ holds row positions in the date vector (values up to 8 here), while ‘tem’ has only 2 columns, so using ‘iyy2’ as a column index runs past the end of the array. It seems you intended to index rows instead:
tem2 = tem(iyy2, :);
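A small check of the difference between the two indexing forms, using the data from the question. The value of iyy2 below is what the loop produces for the year 2003; the variable names by_rows, by_cols and column_indexing_failed are only illustrative.

```matlab
tem = [25.177 26.624; 25.177 26.624; 25.199 26.544; 22.464 22.424; ...
       22.464 22.424; 21.197 23.232; 21.197 23.232; 17.442 15.325];
iyy2 = [6; 7];              % rows holding the two 2003 dates in the example data

by_rows = tem(iyy2, :);     % rows 6 and 7, both temperature columns (2-by-2)

% Column indexing asks for columns 6 and 7, which an 8-by-2 array does not have.
try
    by_cols = tem(:, iyy2);
    column_indexing_failed = false;
catch
    column_indexing_failed = true;
end
```

Here by_rows contains the two 2003 rows, and the column form ends up in the catch branch.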
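The same row indexing is needed in two more places, tem3 = tem2(imm2,:) and tem4 = tem3(idd2,:), and the mean should be taken along dimension 1, nanmean(tem4,1), so that a single matching row is not collapsed to one number. With those corrections you can verify that ‘med’ is not entirely ‘NaN’, because the following returns false:
all(isnan(med(:)))
The loop appends one row for every combination of year, month and day from 2002 to 2017 (16 x 12 x 31 = 5952 rows). Only the rows whose date actually occurs in the date vector hold an average; every other row is ‘NaN’, which is why the output looks as if it contained nothing else.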
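For illustration, here is a compact sketch of the triple loop with the indexing fixed, followed by a filter that keeps only the rows that received data. It is not the original code: it uses logical masks instead of nested find calls, the built-in mean in place of nanmean, and the names years, r, rows, filled and med_compact are assumptions made for this sketch.

```matlab
jd  = [734530.22333; 734550.90236; 734530.22333; 732437.11034; ...
       732447.30517; 731886.80703; 731896.72450; 733920.88758];
tem = [25.177 26.624; 25.177 26.624; 25.199 26.544; 22.464 22.424; ...
       22.464 22.424; 21.197 23.232; 21.197 23.232; 17.442 15.325];

[yy, mm, dd] = datevec(jd);
years = 2002:2017;
med = nan(numel(years)*12*31, size(tem,2));   % one row per (year, month, day) slot
r = 0;
for i = years
    for j = 1:12
        for k = 1:31
            r = r + 1;
            rows = (yy == i) & (mm == j) & (dd == k);   % rows of tem on this date
            if any(rows)
                % Dimension 1 keeps both columns even when only one row matches
                med(r,:) = mean(tem(rows,:), 1);
            end
        end
    end
end

filled = ~all(isnan(med), 2);    % slots that matched at least one observation
med_compact = med(filled, :);    % averages only, in chronological order
```

With the example data, med has 5952 rows and med_compact keeps the 7 rows that correspond to dates present in jd.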
However, even a corrected loop will not produce the output shown in your example, because it also creates a row for every date that has no data. A more direct and computationally efficient method is to group the rows by date first. The 'unique' function with the 'rows' option returns the distinct dates together with an index vector that maps every row of the date matrix to its date. The temperatures can then be averaged per group, for both columns at once. Pre-allocating the result further reduces computation time by avoiding a size change on each loop iteration. A sample code to achieve this would look like:
dates = [yy mm dd];   % one row per observation; avoids shadowing the built-in 'date'
[unique_dates,~,idx] = unique(dates,'rows');   % idx(i) is the group number of row i
Atemp = zeros(size(unique_dates,1),size(tem,2));   % pre-allocate one row per unique date
for g = 1:size(unique_dates,1)
    rows = (idx == g);   % logical mask of the rows that fall on date g
    % Average along dimension 1 so that a single row keeps both of its columns
    Atemp(g,:) = mean(tem(rows,:),1);
end
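The resulting ‘Atemp’ holds the same values as the expected ‘med’ output, with one difference: 'unique' sorts the dates, so the rows come out in ascending date order rather than in order of first appearance. Passing the 'stable' option to 'unique' keeps the original order.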
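A loop-over-columns variant of the same grouping idea, sketched with accumarray. It assumes that grouping by the integer part of the serial date number is equivalent to grouping by year, month and day, and it uses the 'stable' option so the rows follow the order of first appearance. The names days and expected are illustrative.

```matlab
jd  = [734530.22333; 734550.90236; 734530.22333; 732437.11034; ...
       732447.30517; 731886.80703; 731896.72450; 733920.88758];
tem = [25.177 26.624; 25.177 26.624; 25.199 26.544; 22.464 22.424; ...
       22.464 22.424; 21.197 23.232; 21.197 23.232; 17.442 15.325];

% floor drops the time of day; 'stable' keeps the order of first appearance
[days, ~, idx] = unique(floor(jd), 'stable');

med = zeros(numel(days), size(tem,2));   % one row per distinct day
for c = 1:size(tem,2)
    % accumarray collects the values of column c per group and averages them
    med(:,c) = accumarray(idx, tem(:,c), [numel(days) 1], @mean);
end
```

For the example data this gives a 7-by-2 matrix whose first row is the average of rows 1 and 3 of tem, matching the ‘med’ requested in the question.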
For more information about 'unique', you can access its documentation page by entering the following command in the MATLAB Command Window:
doc unique
Hope this helps!
