Is it possible to split a large text file in half and then use textscan on both parts?
Hi,
This is my first time in this forum.
I am working with a large text file that contains about 10^5 * 600 elements of 16 digits each. I use the textscan command to read the string data. I already know the number of columns, so I can generate the format spec beforehand. The main part of my code is shown below:
array=textscan(fileID,Spec,NumRow,'Delimiter',delim,'MultipleDelimsAsOne',true,'HeaderLines',1,'ReturnOnError',false);
When I set NumRow (the number of rows) to 50,000 or below, it works fine and takes only about 1 minute to run. However, my system seems to crash when I increase NumRow to 100,000. I suspect that my virtual memory has reached its limit.
Therefore, I wonder whether there is a way to split the data into two parts, say rows 1 to 50,000 and rows 50,001 to 100,000.
Thanks! Ati
Answers (2)
per isakson
2013-5-13
Edited: per isakson
2013-5-13
Something like this:
nRow = 50000;
fid = fopen( ... );
buf1 = textscan( fid, ..., nRow, .... );   % reads the first nRow rows
....
buf2 = textscan( fid, ..., nRow, .... );   % continues with the next nRow rows
fclose( fid );
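The outline above can be extended to a loop that reads block after block until the file ends. The following is only a minimal sketch under stated assumptions: the file name, the space delimiter, the all-numeric `%f` spec, and the tiny demo file are placeholders that are not from the original post, and the running column sum merely stands in for whatever processing is needed. As far as I know, 'HeaderLines' is applied on every textscan call, so the sketch skips the header once with fgetl instead.

```matlab
% Demo input (hypothetical): one header line followed by 7 rows of 3 numbers.
fileName = [tempname '.txt'];
nCol = 3;
data = reshape(1:21, nCol, []).';          % 7-by-3 matrix with rows 1 2 3, 4 5 6, ...
fid = fopen(fileName, 'w');
fprintf(fid, 'c1 c2 c3\n');
fprintf(fid, '%d %d %d\n', data.');        % transposed so each text line is one row
fclose(fid);

% Read the file in blocks of nRow rows and reduce each block immediately.
nRow = 3;                                  % rows per block (50000 in the question)
spec = repmat('%f', 1, nCol);              % assumed: every column is numeric
fid = fopen(fileName, 'r');
fgetl(fid);                                % skip the single header line once
colSum = zeros(1, nCol);                   % placeholder for the real processing
while ~feof(fid)
    buf = textscan(fid, spec, nRow, 'Delimiter', ' ', ...
                   'MultipleDelimsAsOne', true, 'CollectOutput', true);
    block = buf{1};                        % rows-read-by-nCol double matrix
    if isempty(block)
        break                              % nothing more could be parsed
    end
    colSum = colSum + sum(block, 1);       % process, then let the next block overwrite buf
end
fclose(fid);
```

Only one block is held in memory at a time, so the peak memory use is set by nRow rather than by the size of the file.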
3 comments
per isakson
2013-5-14
Edited: per isakson
2013-5-14
You have to process the data in buf1 and
clear buf1
before reading the rest of the file. Or reuse the same variable, so that each block overwrites the previous one:
buf = textscan( fid, ..., nRow, .... );
....
buf = textscan( fid, ..., nRow, .... );
I guess I would have written the data to one or more binary files and then used memmapfile to work with the data.
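A rough sketch of the binary-file idea, under assumptions that are not from the thread: the data is purely numeric, it is stored as doubles, and the file names and the small demo file are placeholders. Each block is transposed before fwrite so that the binary file holds the values row by row, which means the mapped array is nCol-by-nTotal, with one column per original row.

```matlab
% Demo input (hypothetical): one header line followed by 7 rows of 3 numbers.
txtName = [tempname '.txt'];
binName = [tempname '.bin'];
nCol = 3;
data = reshape(1:21, nCol, []).';          % 7-by-3 matrix
fid = fopen(txtName, 'w');
fprintf(fid, 'c1 c2 c3\n');
fprintf(fid, '%d %d %d\n', data.');
fclose(fid);

% Pass 1: convert the text file to binary, one block at a time.
nRow = 3;                                  % rows per block
spec = repmat('%f', 1, nCol);
fin  = fopen(txtName, 'r');
fout = fopen(binName, 'w');
fgetl(fin);                                % skip the header line
nTotal = 0;                                % rows written so far
while ~feof(fin)
    buf = textscan(fin, spec, nRow, 'CollectOutput', true);
    block = buf{1};
    if isempty(block)
        break
    end
    fwrite(fout, block.', 'double');       % transposed: values are stored row by row
    nTotal = nTotal + size(block, 1);
end
fclose(fin);
fclose(fout);

% Pass 2: map the binary file; data is paged in from disk only when accessed.
m = memmapfile(binName, 'Format', {'double', [nCol nTotal], 'x'});
row5 = m.Data.x(:, 5).';                   % original row 5 is column 5 of the map
```

The conversion has to be done only once; later sessions can map the binary file directly without parsing any text.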
Walter Roberson
2013-5-14
per is correct.
To be explicit, textscan() does not read in the entire file when you specify the repeat count.
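A small check of this behavior, assuming a throwaway demo file (the file name and contents are made up for illustration): ftell shows that the first call stops partway through the file, and the second call picks up at the next unread row.

```matlab
% Demo input (hypothetical): 6 rows of 2 numbers, no header.
fileName = [tempname '.txt'];
data = reshape(1:12, 2, []).';             % 6-by-2 matrix with rows 1 2, 3 4, ...
fid = fopen(fileName, 'w');
fprintf(fid, '%d %d\n', data.');
fclose(fid);

fid = fopen(fileName, 'r');
a = textscan(fid, '%f%f', 2, 'CollectOutput', true);   % rows 1 and 2 only
posAfterFirst = ftell(fid);                % file position after the first call
b = textscan(fid, '%f%f', 2, 'CollectOutput', true);   % resumes at row 3
posAfterSecond = ftell(fid);
fclose(fid);
```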
Yao Li
2013-5-14
You can use for loops to auto-generate the formatSpec for textscan(). For example, you can read two columns at a time by defining formatSpec as:
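nCol  = 600;                              % total number of columns in the file
nPair = nCol/2;                           % 300 pairs of columns
formatSpec_array = cell(1, nPair);        % preallocate one format string per pair
for j = 1:nPair
    temp = repmat({'%*f'}, 1, nCol);      % skip every column by default
    temp(2*j-1:2*j) = {'%f'};             % read only columns 2*j-1 and 2*j
    formatSpec_array{j} = [temp{:}];      % join the fields into one format string
end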
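One way such format strings might be used is to rewind the file before each pass and read one pair of columns over all rows. This is only a sketch under assumptions not stated in the thread: a whitespace-delimited numeric file with one header line, a made-up demo file with 6 columns, and a column sum as a stand-in for the real processing.

```matlab
% Demo input (hypothetical): one header line followed by 5 rows of 6 numbers.
fileName = [tempname '.txt'];
nCol  = 6;
nPair = nCol/2;
data = reshape(1:30, nCol, []).';          % 5-by-6 matrix
fid = fopen(fileName, 'w');
fprintf(fid, 'header\n');
fprintf(fid, [repmat('%d ', 1, nCol-1) '%d\n'], data.');
fclose(fid);

fid = fopen(fileName, 'r');
pairSum = zeros(nPair, 2);                 % placeholder result: sums of each column pair
for p = 1:nPair
    fields = repmat({'%*f'}, 1, nCol);     % skip every column by default
    fields(2*p-1:2*p) = {'%f'};            % keep columns 2*p-1 and 2*p
    spec = [fields{:}];
    frewind(fid);                          % start again from the top of the file
    buf = textscan(fid, spec, 'HeaderLines', 1, 'CollectOutput', true);
    pairSum(p, :) = sum(buf{1}, 1);        % buf{1} is nRows-by-2
end
fclose(fid);
```

This keeps only two columns in memory at a time, but the whole file is parsed once per pair, so with 300 pairs it trades a lot of run time for the lower memory use.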