error using textscan to read a csv file with number and text and date in one file
1 次查看(过去 30 天)
显示 更早的评论
Error using textscan Unable to read the DATETIME data with the format 'HH:mm:ss'. If the data is not a time, use %q to get string data.
One example data file I used is attached.
I tried several things, but no success. Please advise a good way of reading the data.
I try to write a code to read such a file. However, importdata can't handle the mixing types, and readtable don't skip the first 4 lines. Another try of using textscan as below gave me errors:
formatSpec = '%{HH:mm:ss}D,%C,%f,%f,%f,%f,%f,%f,%f,%{HH:mm:ss}D,%f,%f,%f,%f';
fid=fopen(cpc_name);
for j1=1:4
tmp1=fgetl(fid);
if j1==2
dstr=tmp1(6:14);
end
end
C=textscan(fid,formatSpec);
fclose(fid);
0 个评论
采纳的回答
Peter Perkins
2016-12-19
Fan, leave the commas out fo the format and use readtable:
>> t = readtable('test1_V6.csv','Format','%{HH:mm:ss}D%f%C%f%f%f%f%f%f%f%{HH:mm:ss}D%f%f%f%f','Header',3)
Warning: Variable names were modified to make them valid MATLAB identifiers. The original names are saved in the
VariableDescriptions property.
t =
Time Comment ModuleID Alt_m_ P_mbar_ T_degC_ x_RH Wind_degrees_ Wind_m_s_ Supply_V_ UTCTime Latitude_deg_ Longitude_deg_ Course_deg_ Speed_m_s_
________ _______ ________ ______ _______ _______ ____ _____________ _________ _________ ________ _____________ ______________ ___________ __________
23:09:10 NaN F667D9 54.2 999.9 16.3 39.2 236 4.1 11.8 23:09:00 70.495 -149.89 78 0.02
23:09:11 NaN F667D9 54.2 999.9 16.3 39 237 3.9 11.8 23:09:05 70.495 -149.89 78 0.01
23:09:15 NaN F667D9 54.2 999.8 16.3 39.1 236 4.1 11.8 23:09:10 70.495 -149.89 78 0.01
23:09:20 NaN F667D9 54.2 999.9 16.3 38.9 236 4.1 11.8 23:09:15 70.495 -149.89 78 0.01
23:09:25 NaN F667D9 54.2 999.8 16.3 38.8 236 3.9 11.8 23:09:20 70.495 -149.89 78 0.02
23:09:30 NaN F667D9 54.2 999.8 16.3 38.7 236 4.1 11.8 23:09:25 70.495 -149.89 78 0.02
23:09:32 NaN F667D9 54.2 999.7 16.3 38.8 236 3.9 11.8 23:09:30 70.495 -149.89 78 0.02
23:09:36 NaN F667D9 54.2 999.8 16.3 39.4 235 4.1 11.8 23:09:35 70.495 -149.89 78 0.01
23:09:41 NaN F667D9 54.2 999.7 16.3 38.8 236 3.9 11.8 23:09:40 70.495 -149.89 78 0.01
23:09:46 NaN F667D9 54.2 999.8 16.3 39 236 4.1 11.8 23:09:45 70.495 -149.89 78 0.01
23:09:51 NaN F667D9 54.2 999.9 16.3 38.8 235 3.9 11.8 23:09:50 70.495 -149.89 78 0
23:09:56 NaN F667D9 54.2 999.8 16.3 39.1 235 4.1 11.8 23:09:55 70.495 -149.89 78 0.01
The times had no date portion, so they were read as "today", but the file header indocates that they're from 6/11/2016. Modify the two time variables accordingly:
>> t.Time = datetime(2016,6,11) + timeofday(t.Time);
>> t.UTCTime = datetime(2016,6,11) + timeofday(t.UTCTime)
t =
Time Comment ModuleID Alt_m_ P_mbar_ T_degC_ x_RH Wind_degrees_ Wind_m_s_ Supply_V_ UTCTime Latitude_deg_ Longitude_deg_ Course_deg_ Speed_m_s_
____________________ _______ ________ ______ _______ _______ ____ _____________ _________ _________ ____________________ _____________ ______________ ___________ __________
11-Jun-2016 23:09:10 NaN F667D9 54.2 999.9 16.3 39.2 236 4.1 11.8 11-Jun-2016 23:09:00 70.495 -149.89 78 0.02
11-Jun-2016 23:09:11 NaN F667D9 54.2 999.9 16.3 39 237 3.9 11.8 11-Jun-2016 23:09:05 70.495 -149.89 78 0.01
11-Jun-2016 23:09:15 NaN F667D9 54.2 999.8 16.3 39.1 236 4.1 11.8 11-Jun-2016 23:09:10 70.495 -149.89 78 0.01
11-Jun-2016 23:09:20 NaN F667D9 54.2 999.9 16.3 38.9 236 4.1 11.8 11-Jun-2016 23:09:15 70.495 -149.89 78 0.01
11-Jun-2016 23:09:25 NaN F667D9 54.2 999.8 16.3 38.8 236 3.9 11.8 11-Jun-2016 23:09:20 70.495 -149.89 78 0.02
11-Jun-2016 23:09:30 NaN F667D9 54.2 999.8 16.3 38.7 236 4.1 11.8 11-Jun-2016 23:09:25 70.495 -149.89 78 0.02
11-Jun-2016 23:09:32 NaN F667D9 54.2 999.7 16.3 38.8 236 3.9 11.8 11-Jun-2016 23:09:30 70.495 -149.89 78 0.02
11-Jun-2016 23:09:36 NaN F667D9 54.2 999.8 16.3 39.4 235 4.1 11.8 11-Jun-2016 23:09:35 70.495 -149.89 78 0.01
11-Jun-2016 23:09:41 NaN F667D9 54.2 999.7 16.3 38.8 236 3.9 11.8 11-Jun-2016 23:09:40 70.495 -149.89 78 0.01
11-Jun-2016 23:09:46 NaN F667D9 54.2 999.8 16.3 39 236 4.1 11.8 11-Jun-2016 23:09:45 70.495 -149.89 78 0.01
11-Jun-2016 23:09:51 NaN F667D9 54.2 999.9 16.3 38.8 235 3.9 11.8 11-Jun-2016 23:09:50 70.495 -149.89 78 0
11-Jun-2016 23:09:56 NaN F667D9 54.2 999.8 16.3 39.1 235 4.1 11.8 11-Jun-2016 23:09:55 70.495 -149.89 78 0.01
3 个评论
Jeremy Hughes
2016-12-20
I'll add to Peter's comment here. The reason the original call to textscan fails is that it expects to find spaces between each field. The default value for 'Delimiter' in TEXTSCAN is the same as 'Whitespace'
The initial code could be improved by removing the commas from the format string passed to textscan (as Peter suggested) and also by adding 'Delimiter',',' to the arguments.
Although I like Peter's solution better, the following should work:
fid = fopen(cpc_name);
for j1 = 1:4
tmp1 = fgetl(fid);
if j1==2
dstr = tmp1(6:14);
end
end
formatSpec = '%{HH:mm:ss}D%C%f%f%f%f%f%f%f%{HH:mm:ss}D%f%f%f%f';
C = textscan(fid,formatSpec,'Delimiter',',');
fclose(fid);
Incidentally, this looks like a CSV exported from Microsoft® Excel®. (It puts extra commas for empty cells when exporting to CSV) If you have the original spreadsheet file, you might have luck reading that directly with READTABLE.
T = readtable(ssFileName,'Range','A4:O16')
更多回答(1 个)
KSSV
2016-12-16
编辑:KSSV
2016-12-16
fid = fopen('test1_V6.csv') ;
S = textscan(fid,'%s','delimiter','\n','HeadeRLines',3);
fclose(fid)
S = S{1}
2 个评论
KSSV
2016-12-16
fid = fopen('test1_V6.csv') ;
S = textscan(fid,'%s','delimiter','\n','HeadeRLines',4);
S = S{1} ;
fclose(fid) ;
iwant = cell(12,14) ;
for i = 1:size(S,1)
iwant(i,:) = strsplit(S{i},',') ;
end
iwant
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Large Files and Big Data 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!