Read csv with quotes and datetime stamp (readtable configuration)
6 次查看(过去 30 天)
显示 更早的评论
Hi,
I have a csv file which contains the following format for each line. I understand how fscanf can be used for parsing data if the first digit is just a number: https://www.mathworks.com/matlabcentral/answers/1805220-read-csv-with-quotes-and-numbers. I tried setting this up for a datetime entry in the first field. However this gets tricky as I have to scan individual fields in the date and time if using fscanf.
Is there a more efficient way to parse and scan this using readtable where I can use the datatime variable format so that the date and time can be directly included in the post processing of the data?
Unfortunately I am not able to remove the quotes at this time due to the way in which the datafile is generated. Thanks a lot !
Data format:
14-Sep-2022 10:02:38.140000,"1034,19,AB"
14-Sep-2022 10:02:38.140000,"1034,19,AB"
0 个评论
采纳的回答
Walter Roberson
2022-9-14
lines = {'14-Sep-2022 10:02:38.140000,"1034,19,AB"', '14-Sep-2022 10:02:38.140000,"1034,19,AB"'};
filename = tempname + ".txt";
writelines(lines, filename);
dbtype(filename)
fmt = '%{dd-MMM-uuuu}D %{hh:mm:ss.SSSSSS}T"%f%f%[^"]"';
fid = fopen(filename, 'r');
datacell = textscan(fid, fmt, 'whitespace', ',');
fclose(fid);
Time = datacell{1} + datacell{2};
Time.Format = 'dd-MMM-uuuu HH:mm:ss.SSSSSS';
T = table(Time, datacell{3:end})
2 个评论
Walter Roberson
2022-9-15
There are two tricks here:
- it is difficult to get a %D format to include a space when parsing a datetime specification. It is possible by changing the Whitespace specification, but that tends to mess up other fields. So it is typically easier to read the date portion with %D and the time portion with %T and then add the two together
- When using datetime() the input format specification for 24 hour days is HH and where hh would be used for 12 hour days (with there typically being an 'a' field for AM/PM indicator). But when you are using duration() then the number of hours is purely a count, not a time of day -- for example 78:32 would be valid for 78 hours 32 minutes for duration() purposes. So there is a conflict between datetime() and duration() as to what to use for hours beyond 12 in a day -- you need HH for datetime but hh for duration. You cannot simply break a datetime() format up into date and time portions; you have to change the HH of datetime to hh for duration purposes.
更多回答(1 个)
dpb
2022-9-14
readtable will import the file time field as a datetime variable automagically; you don't need to do anything except read it.
You'll then have to split the second column that will be imported as a cellstr array
I thought could add the " to the 'Whitespace' but that didn't work by itself -- would take a fair amount of effort to modify an import options object to account for the additional variables; think it's simpler to just import as is and split. Or, of course, you could read the file as text and delete the " and rewrite or parse from memory instead.
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Dates and Time 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!