Trouble with textscan and large .dat files
I am trying to import specific values from a very large .dat file (I'll use dummy.dat as an example).
These values sit in a single, extremely long column (700000 rows). I want to pick out specific values within this column and move on, without importing the whole column.
When I use
A = importdata('dummy.dat')
I get a nice [700000 x 1] array in my workspace, so that works, but again, I don't want to take the time to import the whole thing.
When I use
fid=fopen('dummy.dat');
A = textscan(fid,'%f','delimiter','')
I get a 1 x 1 cell in which the cell is a [700000 x 1] double, so that works, but I am still importing the whole thing.
Say I want to pick out the number that is in the 5th row, and only that number. I am trying:
fid=fopen('dummy.dat');
A = textscan(fid,'%f',1,'delimiter','','headerlines',4)
For some reason, when I do this, the single column nature of the .dat file is changed into 4 columns so instead of reading
1
2
3
4
5
6...
I get
1 2 3 4
5 6 ...
This throws off my row counts, my 'headerlines' value, and which values I end up reading.
Does anyone know what's going on here?
Thanks.
1 Comment
Walter Roberson
2013-5-8
What is your intention in setting the delimiter to '' ? Why not just leave the delimiter unspecified ?
Accepted Answer
Walter Roberson
2013-5-8
If you are importing the same file multiple times, I suggest reading it once and writing a version of it in binary. Then, each time you want to read, knowing which position you want to start at, you can fseek() to (position - 1) * (size in bytes of a single entry) and fread() from there.
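A minimal sketch of this idea, assuming the file holds one double per row (the file name dummy.bin and the row index are illustrative):

```matlab
% One-time conversion: read the text file and save the values in binary.
fid = fopen('dummy.dat');
A = textscan(fid, '%f');
fclose(fid);

fid = fopen('dummy.bin', 'w');
fwrite(fid, A{1}, 'double');
fclose(fid);

% Later: grab only the 5th value without reading the rest.
row = 5;
fid = fopen('dummy.bin', 'r');
fseek(fid, (row - 1) * 8, 'bof');   % 8 bytes per double
value = fread(fid, 1, 'double');
fclose(fid);
```

The conversion cost is paid once; every subsequent lookup is a constant-time seek plus a single 8-byte read.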
More Answers (1)
Gabriel
2013-6-11
If you don't care about speed at all, the easiest way is to use fgetl to read each line, then textscan on each line to grab what you want. Slow but easy.
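That approach might look like this (a sketch; the row index is illustrative):

```matlab
% Skip to the desired row line by line, then parse only that line.
row = 5;
fid = fopen('dummy.dat');
for k = 1:row-1
    fgetl(fid);                     % discard unwanted lines
end
C = textscan(fgetl(fid), '%f');     % parse just the line we need
value = C{1};
fclose(fid);
```

This still scans the file from the top each time, so it is O(row) per lookup, but it never holds more than one line in memory.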
0 Comments