
Load the numeric data of a cyclic text file into a matrix

Dear All,
I guess I have to rephrase my question since it has not received much attention.
I have a text file in the following format:
ITEM: TIMESTEP
0
ITEM: NUMBER OF ATOMS
200
ITEM: BOX BOUNDS pp pp pp
0 23.5
0 23.5
0 23.5
ITEM: ATOMS id type x y z
1 1 4.629738099 19.15100895 8.591289203
2 1 5.379313371 19.12269554 8.727806695
3 2 7.531762324 13.25286645 4.981542453
4 2 7.427444873 13.99400029 5.110889318
ITEM: TIMESTEP
5
ITEM: NUMBER OF ATOMS
200
ITEM: BOX BOUNDS pp pp pp
0 23.5
0 23.5
0 23.5
ITEM: ATOMS id type x y z
1 1 4.602855537 28 8.610593144
2 1 5.399314789 19.12299845 8.70663802
3 2 7.539913654 13.25759311 4.99833023
4 2 7.479249704 13.99259535 5.137606665
The file contains 6000000 of these cycles. I need to extract the numeric data in the last three columns of each cycle into a single matrix covering all of the cycles.
In other words my desired output matrix should be in the following format:
4.629738099 19.15100895 8.591289203
5.379313371 19.12269554 8.727806695
7.531762324 13.25286645 4.981542453
7.427444873 13.99400029 5.110889318
4.602855537 28.00000000 8.610593144
5.399314789 19.12299845 8.70663802
7.539913654 13.25759311 4.99833023
7.479249704 13.99259535 5.137606665
As you can see, the first 9 lines of each cycle are skipped and the data rows of successive cycles are stacked in order to form the target matrix. I do not need to print this matrix; I just need it for further calculations. I hope you can help me. Thanks

Answers (1)

dpb 2015-9-17
No matter what you do it likely is going to take a while if the file is that large. But, reading it is pretty straightforward...
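fmt=[repmat('%*d',1,2) repmat('%f',1,3)]; % skip id and type, keep x y z
N=4; % for the file as shown; I guess it would be 200 for the real file?
fid=fopen('yourfile');
c={}; % one cell per time step
i=0;
while ~feof(fid)
i=i+1;
% c(i,1) rather than c{i,1}: store the Nx3 array itself so cell2mat can stack the blocks
% if blocks after the first come back shifted by one line, the previous read may have
% stopped before its trailing newline; try 'headerlines',10 for those calls
c(i,1)=textscan(fid,fmt,N,'headerlines',9,'collectoutput',1);
end
fid=fclose(fid);
c=cell2mat(c);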
You may speed it up some by preallocating a large "ordinary" array of size (#atoms*#time steps)-by-3, if those counts are known, and advancing the row offset by the number of atoms (200) on each pass. Here I would then wrap the textscan call inside cell2mat to convert directly.
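N=200; % could open file and read this, too first...
M=6000000; % # time steps in file...
fmt=[repmat('%*d',1,2) repmat('%f',1,3)]; % same format as above
d=zeros(N*M,3); % preallocate the full array
fid=fopen('yourfile');
i1=1; i2=N; % initial indices to array rows
for i=1:M
% as above, check whether reads after the first need 'headerlines',10
d(i1:i2,:)=cell2mat(textscan(fid,fmt,N, ...
'headerlines',9,'collectoutput',1));
i1=i2+1; i2=i2+N; % increment
end
fid=fclose(fid);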
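The atom count hard-coded above could instead be read from the first header block. The following is only a sketch under assumptions: it builds a small temporary file as a stand-in for 'yourfile' (the name `fname` and the loop variable `tline` are illustrative), and it assumes the count is the same in every time step.

```matlab
% Toy file holding one header block (hypothetical stand-in for 'yourfile').
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'ITEM: TIMESTEP\n0\nITEM: NUMBER OF ATOMS\n200\n');
fprintf(fid, 'ITEM: BOX BOUNDS pp pp pp\n0 23.5\n0 23.5\n0 23.5\n');
fclose(fid);

% Scan forward to the atom-count header, then parse the line after it.
fid = fopen(fname, 'r');
N = [];
tline = fgetl(fid);
while ischar(tline) && isempty(N)
    if strcmp(strtrim(tline), 'ITEM: NUMBER OF ATOMS')
        N = sscanf(fgetl(fid), '%d');   % the count sits on the next line
    end
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);                          % remove the toy file
```

With N known, the number of time steps could in principle be estimated from the file size, but that depends on every block having the same byte length, which the variable-width numbers in this format do not guarantee.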
Of course,
>> 6000000 * 200 * 8/1024/1024/1024
ans =
8.9407
>>
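That is about 9 GB for one double per atom per time step, so roughly 27 GB for all three columns, which may well be more than you can hold in memory at once...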
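If the further calculations can be done one time step at a time, the full matrix never has to exist in memory. The sketch below assumes exactly that, using a time-averaged position per atom as a hypothetical example of such a calculation. It writes a two-step toy file from the first two atoms of each sample cycle in the question, then locates each data block by its 'ITEM: ATOMS' header line rather than by counting header lines. The names `fname`, `sumXYZ`, and `meanXYZ` are illustrative.

```matlab
% Toy two-step file: first two atoms of each sample cycle in the question
% (the real file would have 200 atoms per step).
xyz = {[4.629738099 19.15100895 8.591289203; 5.379313371 19.12269554 8.727806695], ...
       [4.602855537 28 8.610593144; 5.399314789 19.12299845 8.70663802]};
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
for k = 1:2
    fprintf(fid, 'ITEM: TIMESTEP\n%d\nITEM: NUMBER OF ATOMS\n2\n', 5*(k-1));
    fprintf(fid, 'ITEM: BOX BOUNDS pp pp pp\n0 23.5\n0 23.5\n0 23.5\n');
    fprintf(fid, 'ITEM: ATOMS id type x y z\n');
    fprintf(fid, '%d 1 %.9f %.9f %.9f\n', [1:2; xyz{k}.']);
end
fclose(fid);

% Stream the file, holding only one Nx3 block in memory at a time.
fid = fopen(fname, 'r');
nSteps = 0;
sumXYZ = 0;                               % running sum over time steps
tline = fgetl(fid);
while ischar(tline)                       % fgetl returns -1 at end of file
    if strncmp(tline, 'ITEM: NUMBER OF ATOMS', 21)
        N = sscanf(fgetl(fid), '%d');     % atom count is on the next line
    elseif strncmp(tline, 'ITEM: ATOMS', 11)
        blk = textscan(fid, '%*d %*d %f %f %f', N, 'CollectOutput', true);
        sumXYZ = sumXYZ + blk{1};         % blk{1} is the Nx3 coordinate block
        nSteps = nSteps + 1;
    end
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);                            % remove the toy file
meanXYZ = sumXYZ / nSteps;                % Nx3 time-averaged positions
```

Matching on the header text keeps the loop independent of where textscan leaves the file position after each block, at the cost of reading the nine header lines one at a time. If all coordinates really are needed at once, storing them as single instead of double would halve the memory footprint.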
