Trouble with textscan and large .dat files

I am trying to import specific values from a very large .dat file (call it dummy.dat).
The values are in a single, extremely long column (700000 rows). I want to pick out specific values from this column and then move on, without importing the whole column.
When I use
A = importdata('dummy.dat')
I get a nice [700000 x 1] array in my workspace, so that works, but again, I don't want to take the time to import the whole thing.
When I use
fid=fopen('dummy.dat');
A = textscan(fid,'%f','delimiter','')
I get a 1 x 1 cell containing a [700000 x 1] double, so that works, but I am still importing the whole thing.
Say I want to pick out the number that is in the 5th row, and only that number. I am trying:
fid=fopen('dummy.dat');
A = textscan(fid,'%f',1,'delimiter','','headerlines',4)
For some reason, when I do this, the single-column layout of the .dat file is turned into 4 columns, so instead of reading
1
2
3
4
5
6...
I get
1 2 3 4
5 6 ...
which throws off my rows, my header lines, and the values I am reading.
Does anyone know what's going on here?
Thanks.
1 comment
Walter Roberson 2013-5-8
What is your intention in setting the delimiter to ''? Why not just leave the delimiter unspecified?
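
A minimal sketch of what this comment suggests: the same call as in the question, but with the delimiter left at its default. The file name and its contents (the integers 1 to 10, one per line) are stand-ins invented for illustration. The thread does not confirm whether this alone resolves the column reshaping described in the question.

```matlab
% Build a small stand-in for dummy.dat: one number per line (assumed layout).
fname = fullfile(tempdir, 'dummy_example.dat');   % hypothetical file name
fid = fopen(fname, 'w');
fprintf(fid, '%d\n', 1:10);
fclose(fid);

% Skip the first 4 lines, then read exactly one number.
% No 'delimiter' option is passed, so textscan uses its default (whitespace).
fid = fopen(fname, 'r');
C = textscan(fid, '%f', 1, 'HeaderLines', 4);
fclose(fid);

value = C{1};   % the number stored on line 5 of the file
```

Note that textscan still has to scan past the skipped lines to find where line 5 starts; it only avoids converting and storing them.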


Accepted Answer

Walter Roberson 2013-5-8
If you are importing the same file multiple times, I suggest reading it once and writing a version of it in binary. Then, each time you want to read, knowing which position you want to start at, you can fseek() to the (position - 1) * (the size in bytes of a single entry) and fread() from there.
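
A rough sketch of this approach, assuming the values are stored as 8-byte doubles. The file names and the sample data (1 to 10, one per line) are made up for illustration and are not from the original post.

```matlab
% Hypothetical file names; the text file stands in for dummy.dat.
txtName = fullfile(tempdir, 'dummy_convert.dat');
binName = fullfile(tempdir, 'dummy_convert.bin');

% Stand-in data: one number per line.
fid = fopen(txtName, 'w');
fprintf(fid, '%d\n', 1:10);
fclose(fid);

% One-time conversion: read the whole text file, write it back as raw doubles.
fid = fopen(txtName, 'r');
C = textscan(fid, '%f');
fclose(fid);
fid = fopen(binName, 'w');
fwrite(fid, C{1}, 'double');
fclose(fid);

% Every later read: jump straight to the wanted row and read one value.
row = 5;
bytesPerEntry = 8;                                  % size of one double
fid = fopen(binName, 'r');
fseek(fid, (row - 1) * bytesPerEntry, 'bof');       % offset from start of file
value = fread(fid, 1, 'double');
fclose(fid);
```

The conversion still costs one full read of the text file, so this pays off only when the same file is read more than once.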
2 comments
kschau 2013-5-8
I would, but unfortunately I need to extract a few data points from one .dat file and then move on to another, many, many times.
kschau 2013-6-11
The trick was to compile ALL the files into one long binary string and then remember the byte offsets, so I can jump quickly between what were separate .dat files. Thanks for the advice!
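
The poster did not share code, so the following is only a guess at how such a combined file could be built. It assumes every value is written as an 8-byte double and that the starting byte of each original file is recorded in a vector. The data, the file name, and the variable names are all hypothetical; in practice each column would come from reading one .dat file.

```matlab
% Stand-in data for two separate .dat files (each a single column).
data = {(1:10)', (101:105)'};
binName = fullfile(tempdir, 'combined_example.bin');   % hypothetical name

% Append every column to one binary file, recording where each one starts.
startByte = zeros(1, numel(data));
fid = fopen(binName, 'w');
for k = 1:numel(data)
    startByte(k) = ftell(fid);          % byte offset of file k in the blob
    fwrite(fid, data{k}, 'double');
end
fclose(fid);

% Later: fetch row 3 of what used to be file 2, without reading anything else.
fileIdx = 2;
row = 3;
fid = fopen(binName, 'r');
fseek(fid, startByte(fileIdx) + (row - 1) * 8, 'bof');
value = fread(fid, 1, 'double');
fclose(fid);
```

The startByte vector would need to be saved alongside the binary file (for example in a .mat file) so that later sessions can reuse it.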


More Answers (1)

Gabriel 2013-6-11
If you don't care about speed at all, the easiest way is to use fgetl to read each line, then textscan on each line to grab what you want. Slow but easy.
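
A small sketch of this line-by-line approach, assuming one number per line. The file name and the sample contents are invented for illustration.

```matlab
% Build a small stand-in file: one number per line.
fname = fullfile(tempdir, 'dummy_lines.dat');   % hypothetical file name
fid = fopen(fname, 'w');
fprintf(fid, '%d\n', 1:10);
fclose(fid);

targetRow = 5;
txt = '';
fid = fopen(fname, 'r');
for k = 1:targetRow
    txt = fgetl(fid);        % each call returns the next line as text
    if ~ischar(txt)          % fgetl returns -1 at end of file
        break
    end
end
fclose(fid);

% Parse only the line we stopped on.
if ischar(txt)
    C = textscan(txt, '%f');
    value = C{1};
else
    value = [];              % the file had fewer than targetRow lines
end
```

Only the lines up to targetRow are read, so the cost grows with how deep into the file the wanted value sits.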
