error using textscan to read a csv file with number and text and date in one file

Question

Fan Mei 2016-12-16

0
链接

此问题的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/317108-error-using-textscan-to-read-a-csv-file-with-number-and-text-and-date-in-one-file

评论： Jeremy Hughes 2016-12-20

test1_V6.csv

Error using textscan Unable to read the DATETIME data with the format 'HH:mm:ss'. If the data is not a time, use %q to get string data.

One example data file I used is attached.

I tried several things, but no success. Please advise a good way of reading the data.

I try to write a code to read such a file. However, importdata can't handle the mixing types, and readtable don't skip the first 4 lines. Another try of using textscan as below gave me errors:

formatSpec = '%{HH:mm:ss}D,%C,%f,%f,%f,%f,%f,%f,%f,%{HH:mm:ss}D,%f,%f,%f,%f'; 
fid=fopen(cpc_name);
        for j1=1:4
        tmp1=fgetl(fid);
            if j1==2
                dstr=tmp1(6:14);
            end
        end
        C=textscan(fid,formatSpec);
        fclose(fid);

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

请先登录，再回答此问题。

Answer 1

Peter Perkins 2016-12-19

0
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/317108-error-using-textscan-to-read-a-csv-file-with-number-and-text-and-date-in-one-file#answer_247796

在 MATLAB Online 中打开

Fan, leave the commas out fo the format and use readtable:

>> t = readtable('test1_V6.csv','Format','%{HH:mm:ss}D%f%C%f%f%f%f%f%f%f%{HH:mm:ss}D%f%f%f%f','Header',3)
Warning: Variable names were modified to make them valid MATLAB identifiers. The original names are saved in the
VariableDescriptions property. 
t =
      Time      Comment    ModuleID    Alt_m_    P_mbar_    T_degC_    x_RH    Wind_degrees_    Wind_m_s_    Supply_V_    UTCTime     Latitude_deg_    Longitude_deg_    Course_deg_    Speed_m_s_
    ________    _______    ________    ______    _______    _______    ____    _____________    _________    _________    ________    _____________    ______________    ___________    __________
    23:09:10    NaN        F667D9      54.2      999.9      16.3       39.2    236              4.1          11.8         23:09:00    70.495           -149.89           78             0.02      
    23:09:11    NaN        F667D9      54.2      999.9      16.3         39    237              3.9          11.8         23:09:05    70.495           -149.89           78             0.01      
    23:09:15    NaN        F667D9      54.2      999.8      16.3       39.1    236              4.1          11.8         23:09:10    70.495           -149.89           78             0.01      
    23:09:20    NaN        F667D9      54.2      999.9      16.3       38.9    236              4.1          11.8         23:09:15    70.495           -149.89           78             0.01      
    23:09:25    NaN        F667D9      54.2      999.8      16.3       38.8    236              3.9          11.8         23:09:20    70.495           -149.89           78             0.02      
    23:09:30    NaN        F667D9      54.2      999.8      16.3       38.7    236              4.1          11.8         23:09:25    70.495           -149.89           78             0.02      
    23:09:32    NaN        F667D9      54.2      999.7      16.3       38.8    236              3.9          11.8         23:09:30    70.495           -149.89           78             0.02      
    23:09:36    NaN        F667D9      54.2      999.8      16.3       39.4    235              4.1          11.8         23:09:35    70.495           -149.89           78             0.01      
    23:09:41    NaN        F667D9      54.2      999.7      16.3       38.8    236              3.9          11.8         23:09:40    70.495           -149.89           78             0.01      
    23:09:46    NaN        F667D9      54.2      999.8      16.3         39    236              4.1          11.8         23:09:45    70.495           -149.89           78             0.01      
    23:09:51    NaN        F667D9      54.2      999.9      16.3       38.8    235              3.9          11.8         23:09:50    70.495           -149.89           78                0      
    23:09:56    NaN        F667D9      54.2      999.8      16.3       39.1    235              4.1          11.8         23:09:55    70.495           -149.89           78             0.01

The times had no date portion, so they were read as "today", but the file header indocates that they're from 6/11/2016. Modify the two time variables accordingly:

>> t.Time = datetime(2016,6,11) + timeofday(t.Time);
>> t.UTCTime = datetime(2016,6,11) + timeofday(t.UTCTime)
t =
            Time            Comment    ModuleID    Alt_m_    P_mbar_    T_degC_    x_RH    Wind_degrees_    Wind_m_s_    Supply_V_          UTCTime           Latitude_deg_    Longitude_deg_    Course_deg_    Speed_m_s_
    ____________________    _______    ________    ______    _______    _______    ____    _____________    _________    _________    ____________________    _____________    ______________    ___________    __________
    11-Jun-2016 23:09:10    NaN        F667D9      54.2      999.9      16.3       39.2    236              4.1          11.8         11-Jun-2016 23:09:00    70.495           -149.89           78             0.02      
    11-Jun-2016 23:09:11    NaN        F667D9      54.2      999.9      16.3         39    237              3.9          11.8         11-Jun-2016 23:09:05    70.495           -149.89           78             0.01      
    11-Jun-2016 23:09:15    NaN        F667D9      54.2      999.8      16.3       39.1    236              4.1          11.8         11-Jun-2016 23:09:10    70.495           -149.89           78             0.01      
    11-Jun-2016 23:09:20    NaN        F667D9      54.2      999.9      16.3       38.9    236              4.1          11.8         11-Jun-2016 23:09:15    70.495           -149.89           78             0.01      
    11-Jun-2016 23:09:25    NaN        F667D9      54.2      999.8      16.3       38.8    236              3.9          11.8         11-Jun-2016 23:09:20    70.495           -149.89           78             0.02      
    11-Jun-2016 23:09:30    NaN        F667D9      54.2      999.8      16.3       38.7    236              4.1          11.8         11-Jun-2016 23:09:25    70.495           -149.89           78             0.02      
    11-Jun-2016 23:09:32    NaN        F667D9      54.2      999.7      16.3       38.8    236              3.9          11.8         11-Jun-2016 23:09:30    70.495           -149.89           78             0.02      
    11-Jun-2016 23:09:36    NaN        F667D9      54.2      999.8      16.3       39.4    235              4.1          11.8         11-Jun-2016 23:09:35    70.495           -149.89           78             0.01      
    11-Jun-2016 23:09:41    NaN        F667D9      54.2      999.7      16.3       38.8    236              3.9          11.8         11-Jun-2016 23:09:40    70.495           -149.89           78             0.01      
    11-Jun-2016 23:09:46    NaN        F667D9      54.2      999.8      16.3         39    236              4.1          11.8         11-Jun-2016 23:09:45    70.495           -149.89           78             0.01      
    11-Jun-2016 23:09:51    NaN        F667D9      54.2      999.9      16.3       38.8    235              3.9          11.8         11-Jun-2016 23:09:50    70.495           -149.89           78                0      
    11-Jun-2016 23:09:56    NaN        F667D9      54.2      999.8      16.3       39.1    235              4.1          11.8         11-Jun-2016 23:09:55    70.495           -149.89           78             0.01

3 个评论
显示 1更早的评论隐藏 1更早的评论

Fan Mei 2016-12-19

Thank you so much,Peter Perkins.

Jeremy Hughes 2016-12-20

在 MATLAB Online 中打开

I'll add to Peter's comment here. The reason the original call to textscan fails is that it expects to find spaces between each field. The default value for 'Delimiter' in TEXTSCAN is the same as 'Whitespace'

The initial code could be improved by removing the commas from the format string passed to textscan (as Peter suggested) and also by adding 'Delimiter',',' to the arguments.

Although I like Peter's solution better, the following should work:

fid = fopen(cpc_name);
for j1 = 1:4
    tmp1 = fgetl(fid);
    if j1==2
        dstr = tmp1(6:14);
    end
end
formatSpec = '%{HH:mm:ss}D%C%f%f%f%f%f%f%f%{HH:mm:ss}D%f%f%f%f'; 
C = textscan(fid,formatSpec,'Delimiter',',');
fclose(fid);

Incidentally, this looks like a CSV exported from Microsoft® Excel®. (It puts extra commas for empty cells when exporting to CSV) If you have the original spreadsheet file, you might have luck reading that directly with READTABLE.

T = readtable(ssFileName,'Range','A4:O16')

请先登录，再进行评论。

Answer 2

KSSV 2016-12-16

1
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/317108-error-using-textscan-to-read-a-csv-file-with-number-and-text-and-date-in-one-file#answer_247418

编辑：KSSV 2016-12-16

在 MATLAB Online 中打开

fid = fopen('test1_V6.csv') ;
S = textscan(fid,'%s','delimiter','\n','HeadeRLines',3);
fclose(fid)
S = S{1}

2 个评论
显示无隐藏无

Fan Mei 2016-12-16

Thank you for your help. I hope to separate the columns after loading the file. For example, I can have a column of date, and several other columns with numerical data. Any suggestions?

KSSV 2016-12-16

在 MATLAB Online 中打开

fid = fopen('test1_V6.csv') ;
S = textscan(fid,'%s','delimiter','\n','HeadeRLines',4);
S = S{1} ;
fclose(fid) ;
iwant = cell(12,14) ;
for i = 1:size(S,1)
    iwant(i,:) = strsplit(S{i},',') ;
end
iwant

请先登录，再进行评论。

error using textscan to read a csv file with number and text and date in one file

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

采纳的回答

3 个评论
显示 1更早的评论隐藏 1更早的评论

更多回答（1 个）

2 个评论
显示无隐藏无

另请参阅

类别

标签

Community Treasure Hunt

error using textscan to read a csv file with number and text and date in one file

0 个评论 显示 -2更早的评论隐藏 -2更早的评论

采纳的回答

3 个评论 显示 1更早的评论隐藏 1更早的评论

更多回答（1 个）

2 个评论 显示 无隐藏 无

另请参阅

类别

标签

Community Treasure Hunt

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

3 个评论
显示 1更早的评论隐藏 1更早的评论

2 个评论
显示无隐藏无