Datenum - different date formats

4 次查看(过去 30 天)
Background
I have an excel sheet with different date formats (in excel some are in General format, some in date format totally random)
example: 10/05/2005 , 10/5/2005, 10/05/05, 05/12/2005 and so on.
I read the XL file into MATLAB as follows:
[dta_num , dta_txt , dta_raw] = xlsread(...)
I need to work with the dates in the excel sheet subsequently.
Actual Problem
I tried out the following commands
  • datenum('10/5/2005')
  • datenum('10/05/2005')
  • datenum('10/05/05')
  • datenum(dta_txt(2,2))
All of them return the same answer. (732590) But I realized that 732590 is read as 05-Oct-2005 when I ran
datestr(732590)
However the dates in my data sheet needs to be read as dd/mm i.e 10-May-2005.
If I choose to specify the format,
  1. * datenum('10/5/2005','dd/mm/yyyy') ans = 732442
  2. * datenum('10/05/2005','dd/mm/yyyy') ans = 732442
  3. * datenum('10/05/05','dd/mm/yyyy') ans = 1957
Originally I thought I will convert the dates using datenum and the after my manipulations using functions such as month(datenum(dta_txt(2,2),'dd/mm/yyyy')) I will use datestr and I will have all the dates in the same format. But now I am back to square 1 since datestr(1957) gives me a 2 digit year.
Also I want to make sure that date formats are not mixed up i.e dd/mm/yyyy and mm/dd/yyyy are mixed up. For this I plan to go over the date and see if the previous date belongs to the same month or one month earlier and no month is greater than 12 assuming a dd/mm format. If it is, then it is to be decoded as mm/dd/yyyy and appropriately changed to dd/mm/yyyy.
Is the only solution is to go over the length of the dta_txt contents using cellfun('length',dta_txt(:,2)) and change it to 4 year format (I guess there are very few in 2 year digit YY format)
To summarize my requirements are;
1. date format which takes less resource and easy to manipulate as I might want to extract and make comparisons of the the month and year later. 2. Have a consistent date format preferably dd/mm/yyyy.

采纳的回答

Laura Proctor
Laura Proctor 2012-12-23
编辑:Laura Proctor 2012-12-23
Would something like this help?
% Create a cell array containing dates in various formats
y = { '10/5/05' ; '18/2/2004' ; '3/3/2003' }
% Split up the elements in the cell array
[~,~,~,~,~,~,ss] = regexp(y,'/')
% Determine which dates have a two element year
ivals = cellfun(@(x) numel(x{3})==2,ss)
% Preallocate a vector to contain the dates
dates = zeros(size(y))
% Convert those dates with a four element year using datenum
dates(~ivals) = datenum(y(~ivals),'dd/mm/yyyy')
% Convert the dates with a two element year
dates(ivals) = datenum(y(ivals),'dd/mm/yy')
You could also use datevec to pull out the elements separately. Here I have assumed that any two element years that occur are later than 2000. If you also have dates that occur in the 1900s, then a little more logic is required.
% Create three vectors: year, month, and day
[year,month,day,~,~,~] = datevec(y,'dd/mm/yyyy')
% Logical array containing those year values that are less than 2000
lt2000 = year < 2000
% Add 2000 to any year that are less than 2000
year(lt2000) = 2000+year(lt2000)
  3 个评论
bym
bym 2012-12-23
This works for 2 digit years
datenum('10/05/05','dd/mm/yyyy',2000)
ans =
732442
Laura Proctor
Laura Proctor 2012-12-23
编辑:Laura Proctor 2012-12-23
You're right, proecsm - it does work if you just have one date. However, if you have a cell array of dates, it will not return the same results.
Walter posted a great solution in the duplicate post:
dates = datenum(y,'dd/mm/yyyy',2000)

请先登录,再进行评论。

更多回答(1 个)

per isakson
per isakson 2012-12-23
编辑:per isakson 2012-12-23
There is no way for me (in Sweden) to know how to interpret
'01/02/03'
or
'01/02/2003'
IMO: when in doubt avoid to use the default values. It is better to specify the format an extra time than not to do it when needed.
IMO: the most precious resource is the time of the programmer.
In this case a vectorized code is faster than a loop. Experiment:
>> mysteries_dates_to_sdn
Elapsed time is 1.540982 seconds.
Elapsed time is 0.056104 seconds.
>> mysteries_dates_to_sdn
Elapsed time is 1.508512 seconds.
Elapsed time is 0.055982 seconds.
>>
where
%%mysteries_dates_to_sdn
N = 1e3;
castr = {
'10/05/2005'
'10/5/2005'
'10/05/05'
'05/12/2005'
};
castr = repmat( castr, [N,1] );
tic
sdn_loop = mysteries_dates_to_sdn_loop( castr );
toc
tic
sdn_vec = mysteries_dates_to_sdn_vectorized( castr );
toc
assert( all( sdn_vec == sdn_loop ) ...
, 'mysteries_dates_to_sdn:failure' ...
, 'Failure: sdn_loop differs from sdn_vec' )
and
function sdn = mysteries_dates_to_sdn_loop( castr )
sdn = nan( size( castr ) );
for ii = 1 : numel( castr )
str = castr{ii};
len = length( str );
if len >= 9
sdn(ii) = datenum( str, 'dd/mm/yyyy' );
else
sdn(ii) = datenum( str, 'dd/mm/yy' );
end
end
end
and
function sdn = mysteries_dates_to_sdn_vectorized( castr )
sdn = nan( size( castr ) );
len = cellfun( @length, castr );
islong = ( len >= 9 );
isshort = not( islong );
sdn(islong) = datenum( castr(islong) , 'dd/mm/yyyy' );
sdn(isshort) = datenum( castr(isshort), 'dd/mm/yy' );
end
.
BTW: Why not use datevec rather than datenum to avoid calling the function, month
.
More:
This function (proposed above) is approx. 7% faster than mysteries_dates_to_sdn_vectorized
function sdn = mysteries_dates_to_sdn_pivot( castr )
sdn = datenum( castr, 'dd/mm/yyyy', 2000 );
end

类别

Help CenterFile Exchange 中查找有关 Time Series Objects 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by