Combining text files with dates and missing data

5 次查看(过去 30 天)
I have numerous files in this format
YEAR MONTH DAY HOUR MIN VAL
1999 1 1 0 0 123
where the file contains data for a single year, month runs 1-12, day from 1-30 (or 28,29, or 31), hour goes from 0-23 and min almost always 0 and val represents a different type of value depending on the file (i.e. one file will have speed, another the mass, another the temperature of the object, etc.).
I want to combine the text files so that they end up like this
YEAR MONTH DAY HOUR MIN VAL1 VAL2 VAL3
1999 1 1 0 0 123 321 123
up to 7 values.
My problem is when data is missing/not recorded in one of the text files (e.g. Jan 1st 1999 Hours 0-23 has data but the next row is Jan 3rd 1999 Hour 0). When the data is missing, I'm not sure what to do as the arrays become uneven and I can't concatenate them unless I somehow fill in the gaps.
Is there a way to fill in any gaps with a blank or some non-data value like 99999 so that I can combine them? Or is there another way to combine the files? The latter is my goal.
  6 个评论
Walter Roberson
Walter Roberson 2015-8-12
To check: for each year there are 7 files? Or is there only 1 file that has 7 types of values in it? If it is 1 file that has 7 types of values in it, how do you tell which value-type a particular line is?
Cas Cas
Cas Cas 2015-8-12
Each year has a bunch of files, but 7 values in particular I'm interested in, so for the purpose here, basically for each year there are 7 files, each with a date and time and value. I'm trying to put them into one file.

请先登录,再进行评论。

采纳的回答

Walter Roberson
Walter Roberson 2015-8-12
Assuming the data is in DataVal_1, DataVal_2 through DataVal_7, and assuming that either only a single year is referenced or else that the first entry of DataVal_1 refers to the lowest numbered year to be considered:
filler_value = 9999;
first_hour = datenum(DataVal_1(:,1),1,1);
timekeyvec = DataVal_1(:,1:4); %Y,M,D,H
timekeyvec(end,6) = 0; %extend by 0 for min and sec
timekeynum = datenum(timekeyvec); %serial date numbers
houridx_1 = round((timekeynum - first_hour) * 24) + 1;
Do the same thing for the other DataVal_* arrays producing houridx_1 through houridx_7 . Then
lasthouridx = max( [houridx_1; houridx_2; houridx_3; houridx_4; houridx_5; houridx_6; houridx_7] );
mergedvals = filler_value * ones(lasthouridx, 7);
mergedvals(houridx_1,1) = DataVal_1(:,6);
mergedvals(houridx_2,2) = DataVal_2(:,6);
mergedvals(houridx_3,3) = DataVal_3(:,6);
etc.
date_of_entries = datevec(((1:lasthouridx) - 1) / 24 + first_hour);
merge_table = [date_of_entries(:,1:4), zeros(lasthouridx,1), mergedvals];
Now write out merge_table
(Testing further, I find it would be possible to build up the mergedvals a file at a time instead of having to read all 7 files first)
  3 个评论
Cas Cas
Cas Cas 2015-8-12
(((1:lasthouridx) - 1) / 24 + first_hour) has dimensions 2088 x 2160. Running that through datevec puts them in a column vector of (2088*2160) x 6.
Also, it appears that date_of_entries isn't repeating set of values. I checked and each hour entry repeats for a very long time (e.g. it repeats [1965 1 1 0 0 0] for a very long time before changing to [1965 1 2 0 0 0]). It should repeat in a (0:23)' pattern.

请先登录,再进行评论。

更多回答(0 个)

类别

Help CenterFile Exchange 中查找有关 Time Series Objects 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by