How to extract numeric data between string lines?

Hi MATLAB Community
I'm trying to solve this problem, which for sure is not new, but I haven't been able to find a proper solution.
I have a file with several header lines, followed by a lot of information in the following format:
Binning n: 1, "De19 ", Event #: 150, Primary(s) weight 1.0000E+00
Number of hit cells: 0
Binning n: 1, "De19 ", Event #: 151, Primary(s) weight 1.0000E+00
Number of hit cells: 1
1 7.185244612628594E-05
Binning n: 1, "De19 ", Event #: 152, Primary(s) weight 1.0000E+00
Number of hit cells: 0
Binning n: 1, "De19 ", Event #: 153, Primary(s) weight 1.0000E+00
Number of hit cells: 0
As shown, the "Number of hit cells" line is sometimes followed by a line of numbers. I would like to extract those numbers into a matrix or array. Is there a way to do this?
I attached an example file. The real file usually contains a lot more data, which I removed to keep the file size down.
Thank you very much in advance

Accepted Answer

Stephen23 on 2021-1-27
Edited: Stephen23 on 2021-1-27
str = fileread('02-2021-Clearance-Box005_fort72.txt');
rgx = '(?<=Number of hit cells:\s+\d+\s+)(\d+[^\n]*)';
tmp = regexp(str,rgx,'match')
tmp = 1x2 cell array
{'1 7.185244612628594E-05'} {'1 2.547905314713717E-04'}
vec = cellfun(@(s)sscanf(s,'%f',[1,Inf]),tmp,'uni',0) % convert to numeric
vec = 1x2 cell array
{1×2 double} {1×2 double}
mat = vertcat(vec{:}) % optional merge into one numeric matrix
mat = 2×2
1 7.1852e-05
1 0.00025479
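To try the pattern without the attached file, here is a minimal sketch that runs the same steps on an in-memory string built from the sample lines in the question. The variable name `sample` is illustrative only, and the sketch assumes, as the regular expression does, that only the first numeric line after each "Number of hit cells" line is wanted.

```matlab
% Sample text copied from the question, joined with newline characters.
% (The variable name "sample" is illustrative, not from the original answer.)
sample = strjoin({ ...
    'Binning n: 1, "De19 ", Event #: 150, Primary(s) weight 1.0000E+00', ...
    'Number of hit cells: 0', ...
    'Binning n: 1, "De19 ", Event #: 151, Primary(s) weight 1.0000E+00', ...
    'Number of hit cells: 1', ...
    '1 7.185244612628594E-05', ...
    'Binning n: 1, "De19 ", Event #: 152, Primary(s) weight 1.0000E+00', ...
    'Number of hit cells: 0'}, newline);

% The lookbehind requires "Number of hit cells: N" plus whitespace directly
% before the match; the match itself is the rest of a line starting with a digit.
rgx = '(?<=Number of hit cells:\s+\d+\s+)(\d+[^\n]*)';
tmp = regexp(sample, rgx, 'match');

% Convert each matched line to a numeric row vector, then stack the rows.
vec = cellfun(@(s) sscanf(s, '%f', [1, Inf]), tmp, 'UniformOutput', false);
mat = vertcat(vec{:});
```

Events with zero hit cells are followed by a "Binning" line, which does not start with a digit, so they produce no match. Here only event 151 contributes a row.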
4 Comments
Federico Geser on 2021-1-27
Hi Stephen!
I think it works, but the test file has 12 MB of data to filter, so it might take a while. I don't know whether this will still work when I get the real results, which may be about 100 MB.
Nevertheless, very helpful solution! Thank you!
Stephen23 on 2021-1-27
Edited: Stephen23 on 2021-1-27
If there are always exactly two numbers on each of those lines, then this is probably more efficient:
str = fileread('02-2021-Clearance-Box005_fort72.txt');
rgx = '(?<=Number of hit cells:\s+\d+\s+)(\d+[^\n]*)'; % unchanged
tmp = regexp(str,rgx,'match'); % unchanged
mat = sscanf(sprintf(' %s',tmp{:}),'%f',[2,Inf]).'
mat = 2×2
1 7.1852e-05
1 0.00025479
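The one-line conversion packs three steps together. As a rough illustration, the sketch below splits them apart, using the two matched strings shown in the answer's output as input. The intermediate names `joined` and `cols` are introduced here for clarity and are not part of the original comment.

```matlab
% The two matched lines from the answer's output, as a cell array.
tmp = {'1 7.185244612628594E-05', '1 2.547905314713717E-04'};

% Step 1: join all matches into one character vector separated by spaces,
% so sscanf is called once instead of once per cell.
joined = sprintf(' %s', tmp{:});

% Step 2: read all numbers in one pass. sscanf fills the output column by
% column, so a size of [2,Inf] puts each pair of numbers in its own column.
cols = sscanf(joined, '%f', [2, Inf]);

% Step 3: transpose so that each pair becomes one row.
mat = cols.';
```

This only gives correctly aligned rows if every matched line holds exactly two numbers. A line with a different count would shift all later values.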

Release: R2019b