Reading a huge text file using Textscan()
9 次查看(过去 30 天)
显示 更早的评论
I have a .txt file with 3235K number of lines, and 25 space delimited columns. I use the following code to read the file:
fread = fopen('data.txt', 'r');
formatSpec ='%f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f';
C = textscan(fread, formatSpec);
A = [C{:}];
fclose(fread)
My problem is, it only reads up to 249K lines. A is 249000x25 double.
This is the output when I typed 'memory':
Maximum possible array: 55317 MB (5.800e+10 bytes) *
Memory available for all arrays: 55317 MB (5.800e+10 bytes) *
Memory used by MATLAB: 859 MB (9.002e+08 bytes)
Physical Memory (RAM): 8101 MB (8.494e+09 bytes)
Is there any way I can read the whole file?
0 个评论
采纳的回答
Star Strider
2015-8-15
The file itself may be the problem. There could be a text line in it that is stopping textscan.
Try this to see:
C = textscan(fread, formatSpec);
A = [C{:}];
fseek(fread,0,0);
C = textscan(fread, formatSpec, 'HeaderLines',1);
A = [A{:}; C{:}];
You can also use the fgets or fgetl to see what the next lines are, and then change the subsequent textscan calls to make it work with your file.
NOTE: You are inadvertently ‘shadowing’ (or ‘overshadowing’) the fread function. This could be a problem if you need it in your code later. Rename your file ID ‘fidin’ or some such instead to avoid the problem.
2 个评论
Star Strider
2015-8-15
Thank you.
The fseek call repositions the file pointer, essentially starting it from the point where textscan stopped reading it. The second ‘A’ assignment simply concatenates ‘A’ with its previous data and the new data.
I just wrote one iteration for illustration purposes. You can put the second and subsequent iterations into a loop if you need to.
更多回答(0 个)
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Text Files 的更多信息
产品
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!