Datastore won't recognize datetime in CSV files (Matlab 2019b)

Hello,
I'm trying to evaluate a datastore of CSV data that I saved with Matlab using writetable. One column contains datetimes and an example of the files' contents is this:
29-Jul-1983 00:00:00,BHP AT EQUITY,MOV_AVG_50D,0.8979
31-Aug-1983 00:00:00,BHP AT EQUITY,MOV_AVG_50D,0.9029
30-Sep-1983 00:00:00,BHP AT EQUITY,MOV_AVG_50D,0.9106
31-Oct-1983 00:00:00,BHP AT EQUITY,MOV_AVG_50D,0.9154
30-Nov-1983 00:00:00,BHP AT EQUITY,MOV_AVG_50D,0.9227
30-Dec-1983 00:00:00,BHP AT EQUITY,MOV_AVG_50D,0.9311
I tried the following code and received the error shown below. When I use datastore with 'DatetimeType' set to 'text', it works, but that is obviously inefficient. Can someone explain how to get this to work?
Thank You,
Michael
This code works:
ds = datastore('tall.csv','DatetimeType','text');
tds = tall(ds);
u = unique(tds.FIELD);
U = gather(u);
This code fails:
ds = datastore('tall.csv');
tds = tall(ds);
u = unique(tds.FIELD);
U = gather(u);
The error is:
Evaluating tall expression using the Parallel Pool 'local':
- Pass 1 of 1: 0% complete
Evaluation 0% complete
Error using matlab.io.datastore.TabularTextDatastore/readData (line 77)
Unable to read the DATETIME data using the locale setting for your system: 'en_US'
If the data contains month or day names in a language foreign to this locale, use the 'DateLocale' parameter to specify the correct locale.
Learn more about errors encountered during GATHER.
Error in matlab.io.datastore.TabularDatastore/read (line 120)
[t, info] = readData(ds);
Error in tall/gather (line 50)
[varargout{:}, readFailureSummary] = iGather(varargin{:});

Accepted Answer

Michael 2019-8-29
Dear Mr. Roberson,
I changed the original input CSV files to MM/dd/yyyy and it worked, so I'm going to give up. If anyone at MathWorks is reading, it would be great if we could input a DatetimeFormat when specifying the datastore.
Thanks,
Michael
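
One option that may cover the DatetimeFormat request is the 'TextscanFormats' parameter of tabularTextDatastore, which takes textscan-style conversion specifiers. The sketch below is untested against the original files and rests on two assumptions: that each file has the header DATE,TICKER,FIELD,VALUE described in the thread, and that the %{...}D specifier is honored for the first column. The file name tall_sample.csv is made up for illustration.

```matlab
% Write a small sample file in the layout described in the thread.
fname = fullfile(tempdir, 'tall_sample.csv');
fid = fopen(fname, 'w');
fprintf(fid, 'DATE,TICKER,FIELD,VALUE\n');
fprintf(fid, '29-Jul-1983 00:00:00,BHP AT EQUITY,MOV_AVG_50D,0.8979\n');
fprintf(fid, '31-Aug-1983 00:00:00,BHP AT EQUITY,MOV_AVG_50D,0.9029\n');
fclose(fid);

% Assumption: one textscan-style specifier per column, with %{...}D
% giving the explicit datetime format for the DATE column.
ds = tabularTextDatastore(fname, ...
    'TextscanFormats', {'%{dd-MMM-yyyy HH:mm:ss}D', '%q', '%q', '%f'});

% Read the data; DATE should come back as datetime rather than text.
t = read(ds);
```

If this works on the real files, tall(ds) could then be used as in the original code without setting 'DatetimeType' to 'text'.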

More Answers (5)

Nimit Dhulekar 2019-8-27
Hi Michael,
Are you executing this set of commands from outside the US? If so, the datetime formats available would be different from the ones available in the US. Try the following command:
datetime('29-Jul-1983 00:00:00')
You may well get an error similar to the one you posted. To get around this issue, you can supply "DatetimeLocale" as a Name-Value pair when constructing the datastore.
ds = datastore('tall.csv','DatetimeLocale','en_US');
Hope that helps!
-Nimit

Michael 2019-8-27
Dear Mr. Roberson,
Thank you for responding.
I am using MATLAB version 9.6.0.1150989 (R2019a) Update 4, Windows 10 Pro Version 10.0 (Build 18362), and Java 1.8.0_181-b13.
The column names are DATE, TICKER, FIELD, and VALUE, so for the first line:
29-Jul-1983 00:00:00,BHP AT EQUITY,MOV_AVG_50D,0.8979
DATE is represented in this format:
29-Jul-1983 00:00:00
FIELD is
MOV_AVG_50D
But, I don't think FIELD is the problem, even though it's what I'm operating on, for two reasons:
  1. The error reads "Unable to read the DATETIME"
  2. The error disappears when I add the parameter pair 'DatetimeType','text' to the datastore command.
Thank You,
Michael

Michael 2019-8-28
Dear Nimit,
Thank you, but that does not produce an error:
>> datetime('29-Jul-1983 00:00:00')
ans =
datetime
29-Jul-1983 00:00:00
Best,
Michael

Michael 2019-8-28
Dear Mr. Roberson,
Good idea. So, with that in mind, I tried all the dates and they didn't produce errors! See below.
Any other thoughts?
Thanks,
Michael
Input
ds = datastore('bigtall.csv','DatetimeType','text');
tds = tall(ds);
u = unique(tds.DATE);
U = gather(u);
% Flag any date string that datetime cannot parse (NaT)
b = false(1, length(U));
for i = 1:length(U)
    b(i) = isnat(datetime(U{i}));
end
any(b)
Output
Starting parallel pool (parpool) using the 'local' profile ...
Connected to the parallel pool (number of workers: 4).
Evaluating tall expression using the Parallel Pool 'local':
- Pass 1 of 1: Completed in 25 min 22 sec
Evaluation completed in 25 min 29 sec
ans =
logical
0
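
For reference, the same check can be written without a loop by passing the whole cell array to datetime with an explicit format and locale. This is a minimal sketch on three sample strings taken from the question rather than on the gathered data. The format 'dd-MMM-yyyy HH:mm:ss' is inferred from the file contents shown above.

```matlab
% Sample date strings copied from the question; the real U would come
% from gather(unique(tds.DATE)).
U = {'29-Jul-1983 00:00:00'; '31-Aug-1983 00:00:00'; '30-Sep-1983 00:00:00'};

% Parse all strings at once with an explicit format and locale,
% so that nothing depends on automatic format detection.
d = datetime(U, 'InputFormat', 'dd-MMM-yyyy HH:mm:ss', 'Locale', 'en_US');

% True if any string failed to parse.
anyBad = any(isnat(d));
```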
  1 Comment
Walter Roberson 2019-8-28
And if you use tds.FIELD, does it go back to failing? Is it possible that it has decided that tds.FIELD is a datetime?


Michael 2019-8-28
Dear Mr. Roberson,
This code works and uses tds.FIELD. Is that what you mean? I'm not sure I understand how to answer your question.
ds = datastore('tall.csv','DatetimeType','text');
tds = tall(ds);
u = unique(tds.FIELD);
U = gather(u);
Thanks,
Michael
  6 Comments
Michael 2019-8-30
Thanks. I just converted the dates to MM/dd/yyyy in the text file and gave up.
Steve Gardner 2019-12-5
I too have a similar issue: my CSV files use the date format yyyy/MM/dd. datastore converts this to MM/dd/yyyy, which is fine, but for anything after the 12th day of the month it gives the value as NaN; basically, it gets the month and day fields mixed up.
I too gave up and converted the CSV file date field to MM/dd/yyyy. I just need to remember to do this every time I get a new CSV file, which is a bit of a pain really.
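
An alternative to editing each CSV file might be to read the date column as text ('DatetimeType','text', as in the original question) and convert it afterwards with an explicit 'InputFormat'. The sketch below shows only the conversion step, on made-up sample values in yyyy/MM/dd form. The variable names raw and d are illustrative.

```matlab
% Made-up yyyy/MM/dd strings, including days after the 12th of the month.
raw = {'2019/11/05'; '2019/11/13'; '2019/12/25'};

% An explicit InputFormat removes the month/day ambiguity.
d = datetime(raw, 'InputFormat', 'yyyy/MM/dd');

% Optionally display the values in the MM/dd/yyyy style mentioned above.
d.Format = 'MM/dd/yyyy';
```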


Release

R2019a