readtable of csv file with opts.DataLines =[n1 n2] and n1>2 doesn't work as expected

Hello,
I'm trying to read a csv file in blocks. According to the documentation, this should work:
dir_load='some_dir';
file='some_file';
filename=fullfile(dir_load,file);
opts = detectImportOptions(filename);
opts.DataLines = [1 10];
T1=readtable(filename,opts);
opts.DataLines = [11 20];
T2=readtable(filename,opts);
opts.DataLines = [1 20];
T=readtable(filename,opts);
So T should be "[T1;T2]", but what I get is that T1 actually has lines 1 to 10 and T2 contains lines 6 to 15. What am I doing wrong? You can find the file here.
T =
20x4 table
Var1 Var2 Var3 Var4
_____ _____ ____ _____
21.57 22.65 514 0.104
21.57 22.65 502 0.106
21.57 22.65 498 0.114
21.57 22.65 491 0.121
21.57 22.65 486 0.118
21.57 22.65 487 0.121
21.57 22.65 483 0.127
21.57 22.65 486 0.125
21.57 22.65 487 0.125
21.63 22.65 485 0.131
21.63 22.65 491 0.13
21.63 22.65 489 0.127
21.63 22.65 493 0.134
21.63 22.65 497 0.135
21.63 22.65 496 0.131
21.63 22.63 502 0.135
21.63 22.63 503 0.139
21.63 22.63 503 0.134
21.63 22.63 508 0.136
21.63 22.63 505 0.142
T1 =
10x4 table
Var1 Var2 Var3 Var4
_____ _____ ____ _____
21.57 22.65 514 0.104
21.57 22.65 502 0.106
21.57 22.65 498 0.114
21.57 22.65 491 0.121
21.57 22.65 486 0.118
21.57 22.65 487 0.121
21.57 22.65 483 0.127
21.57 22.65 486 0.125
21.57 22.65 487 0.125
21.63 22.65 485 0.131
T2 =
10x4 table
Var1 Var2 Var3 Var4
_____ _____ ____ _____
21.57 22.65 487 0.121
21.57 22.65 483 0.127
21.57 22.65 486 0.125
21.57 22.65 487 0.125
21.63 22.65 485 0.131
21.63 22.65 491 0.13
21.63 22.65 489 0.127
21.63 22.65 493 0.134
21.63 22.65 497 0.135
21.63 22.65 496 0.131
edited by Guillaume to attach the file to the question. Please don't use external file sharing sites

Accepted Answer

Guillaume 2019-6-25
If you look at the actual content of the file, you see that it has a blank line between each line of data. Although blank lines are ignored by default during reading, they still count for the purpose of line counting, so it's normal that line 11 is only the 6th line of data (because of the 5 blank lines ignored).
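To make the line counting concrete, here is a minimal sketch that does not call readtable at all. It assumes a hypothetical layout (placeholder strings "row1" to "row20", each followed by a blank line, which mimics the structure described above but is not the attached file) and maps data rows to physical line numbers.

```matlab
% Hypothetical layout: 20 data lines, each followed by a blank line.
% "row1".."row20" are placeholders, not the values from the attached file.
nRows = 20;
fileLines = strings(2*nRows, 1);          % all lines start out blank ("")
fileLines(1:2:end) = "row" + (1:nRows)';  % odd physical lines hold the data

% Physical line number of every non-blank (data) line.
isData = strlength(strtrim(fileLines)) > 0;
physicalLineOfRow = find(isData);         % here: 1, 3, 5, ..., 39

% Data rows covered by physical lines 11 to 20 (what DataLines counts).
rowsInBlock = find(physicalLineOfRow >= 11 & physicalLineOfRow <= 20);
disp(rowsInBlock')                        % data rows 6 to 10

% Physical line range that would be needed to cover data rows 11 to 20.
neededLines = physicalLineOfRow([11 20])';
disp(neededLines)                         % physical lines 21 and 39
```

Under this assumed layout, physical lines 11 to 20 hold only five data rows (rows 6 to 10), which is why line 11 corresponds to the 6th row of data.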
Now, there is indeed a bug with the end point of DataLines. For me (R2019a), I get
>> opts.DataLines = [1 10];
>> readtable('ex.txt', opts)
ans =
5×4 table
Var1 Var2 Var3 Var4
_____ _____ ____ _____
21.57 22.65 514 0.104
21.57 22.65 502 0.106
21.57 22.65 498 0.114
21.57 22.65 491 0.121
21.57 22.65 486 0.118
>> opts.DataLines = [11 20];
>> readtable('ex.txt', opts)
ans =
10×4 table
Var1 Var2 Var3 Var4
_____ _____ ____ _____
21.57 22.65 487 0.121
21.57 22.65 483 0.127
21.57 22.65 486 0.125
21.57 22.65 487 0.125
21.63 22.65 485 0.131
21.63 22.65 491 0.13
21.63 22.65 489 0.127
21.63 22.65 493 0.134
21.63 22.65 497 0.135
21.63 22.65 496 0.131
The result with DataLines = [1 10] is what I expected. The result with DataLines = [11 20] has too many rows: lines 11 to 20 contain only 5 lines of data, so only 5 rows should be returned.
I will investigate a bit more, then report it to MathWorks.
  2 Comments
kira 2019-6-25
So odd, I think I created the file without blank lines. Without blank lines it works fine and I don't see too many rows, but I'm on R2018b...
Guillaume 2019-6-25
Yes, the problem only shows up if there are blank lines (or any lines skipped under the EmptyLineRule of the import options).
I've reported the bug.
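In the meantime, one possible way to sidestep DataLines entirely, when the file is small enough to read in one go, is to read the whole table once and split it by table row. The sketch below is only an illustration under stated assumptions: the temporary file, its contents and the names tmpFile and blockSize are hypothetical, and it relies on blank lines being skipped by default, as noted in the answer.

```matlab
% Hypothetical test file: 20 numeric data lines, each followed by a blank line.
tmpFile = [tempname '.csv'];
fid = fopen(tmpFile, 'w');
for k = 1:20
    fprintf(fid, '%d,%d\n\n', k, 10*k);
end
fclose(fid);

% Read everything in one call; blank lines are skipped by default.
opts = detectImportOptions(tmpFile);
T = readtable(tmpFile, opts);

% Split into blocks by table row, independent of physical line numbers.
blockSize = 10;
T1 = T(1:blockSize, :);
T2 = T(blockSize+1:2*blockSize, :);

% Remove the temporary file again.
delete(tmpFile);
```

Because the blocks are indexed by table row rather than by physical line, blank lines in the file no longer shift the block boundaries.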
