extract values in text document
Hi all,
I'd like to scan a text document and collect all values that follow a certain string. The document has a header section containing numeric values and character strings, followed by numeric values in matrix form. The header repeats at irregular intervals (not periodically after exactly N rows).
To be clearer, here is an example of the text file I want to process:
ITEM: TIMESTEP
1
ITEM: NUMBER OF ATOMS
1000
ITEM: BOX BOUNDS
-1 1
-1 1
-0.1 2
[1......
2......
.......
999....
1000...]
ITEM: TIMESTEP
2
ITEM: NUMBER OF ATOMS
1005
ITEM: BOX BOUNDS
-1 1
-1 1
-0.1 2
[1......
2......
.......
1004...
1005...]
... and so on...
I'd like to extract the number of atoms at each timestep. That is, I want to create an array that stores all the values following the string
"ITEM: NUMBER OF ATOMS"
in the text document (in the example it's the values 1000 and 1005).
How can I do that?
Thanks very much for your help! regards
Sebastian
Answers (1)
Ken Atwell
2012-4-12
For a custom file type like this, I would use a regular expression (the MATLAB function regexp) to scan the file. regexp can be a little daunting to the uninitiated, so here is a little code to get you started.
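%% Read the data file
f = fopen('atomdata.txt');      % example file name; adjust to your data
t = fread(f, 'char=>char');     % read the whole file as characters
t = t';                         % column vector -> row vector for regexp
fclose(f);
%% Scan for atom counts
% \W+ skips the line break after the marker; ([0-9]+) captures the count
numAtoms = regexp(t, 'ITEM: NUMBER OF ATOMS\W+([0-9]+)', 'tokens')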
This will give you a cell array of tokens, one per match, each holding the matched digits as text. You will probably need to convert these to double using str2double or similar.
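As a minimal sketch of that conversion step: the snippet below runs the same pattern on a short sample string (an abridged stand-in for the file contents, not the real data) and flattens the result into a numeric array. The variable name tok is only illustrative.

```matlab
% Sample text standing in for the file contents (abridged, assumed layout)
t = sprintf(['ITEM: TIMESTEP\n1\nITEM: NUMBER OF ATOMS\n1000\n' ...
             'ITEM: TIMESTEP\n2\nITEM: NUMBER OF ATOMS\n1005\n']);

% With 'tokens', each match yields a 1x1 cell holding the captured digits
tok = regexp(t, 'ITEM: NUMBER OF ATOMS\W+([0-9]+)', 'tokens');

% Flatten the nested cells into one cell array of text, then convert
numAtoms = str2double([tok{:}]);
```

For the two timesteps in the sample, numAtoms ends up as a 1-by-2 row vector of doubles.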
2 Comments
Ken Atwell
2012-4-12
You can use fgetl in a loop to read the file line by line, looking for "NUMBER OF ATOMS", knowing that the following line is the piece of data you are looking for.
I still contend that regexp will get you what you're looking for, probably in one line of code and certainly without a loop.
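For comparison, the fgetl approach could look roughly like the sketch below. It first writes a small temporary sample file so that it runs on its own; the sample contents and the names fname and tline are assumptions for illustration, and in practice you would open your own data file instead.

```matlab
% Write a small sample file so the sketch is self-contained (hypothetical data)
fname = [tempname '.txt'];
f = fopen(fname, 'w');
fprintf(f, 'ITEM: TIMESTEP\n1\nITEM: NUMBER OF ATOMS\n1000\n');
fprintf(f, 'ITEM: TIMESTEP\n2\nITEM: NUMBER OF ATOMS\n1005\n');
fclose(f);

% Read line by line; the line after each marker holds the atom count
numAtoms = [];
f = fopen(fname, 'r');
tline = fgetl(f);
while ischar(tline)                      % fgetl returns -1 at end of file
    if strcmp(strtrim(tline), 'ITEM: NUMBER OF ATOMS')
        numAtoms(end+1) = str2double(fgetl(f)); %#ok<AGROW>
    end
    tline = fgetl(f);
end
fclose(f);
delete(fname);                           % remove the temporary sample file
```

This avoids holding the whole file in memory, at the cost of an explicit loop.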