Extracting data using regular expression

Question

Shuvashish Roy 2021-5-20

Commented: Shuvashish Roy 2021-5-21

AR_20base_201214_adh.txt

Hi,

I have the attached text file. I want to extract all the columns starting from line 1472 (as numbered in Notepad), named "Physics", "Time", "dt", "Progress", "Nonlinear Iteration", "Linear Iterations" .... "Nodes After Adaption". I don't know how to specify the header names so that only the numeric values under those headers are extracted into a dataframe or matrix format. Thanks a lot for your help.

Input file format:

Unnecessary lines with text

Unnecessary lines with text

................................

many unnecessary lines............

adh_run_func :: tfinal = 12513600.000000

Physics Time dt Progress Nonlinear Iteration Linear Iteration Max Resid Norm ... Nodes After Adaption

HYD_1 11908800 5 0 1 ........ ...65926

HYD_1 11908800 5 0 2 ...... ...65926

............................................................................................. ................................

100% COMPLETE

Output file format:

Physics Time dt Progress Nonlinear Iteration Linear Iteration Max Resid Norm ... Nodes After Adaption

HYD_1 11908800 5 0 1 ........ ...65926

HYD_1 11908800 5 0 2 ...... ...65926

............................................................................................. ................................

Answer 1

per isakson 2021-5-21

Edited: per isakson 2021-5-21

I take "all the columns [...] named "Physics", "Time", "dt", "Progress", "Nonlinear Iteration", "Linear Iterations" .... "Nodes After Adaption"" to mean every column, none excluded.

There is a choice: readtable() or textscan()? I don't think readtable() can handle this file without relying on the critical line numbers, which I hesitate to do. It is, however, possible to determine the required line numbers in a separate step and then use readtable(). textscan() can parse a 1D character array, which readtable() cannot. Only TMW knows why.
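
For comparison, the readtable() route mentioned above (find the line numbers first, then import) might look roughly like the sketch below. It is not part of the original answer. It writes a made-up, tab-delimited miniature of the file to a temporary location so that it can be run without the attachment; the sample values, the three-column layout and the variable names are assumptions for illustration only. Columns that contain non-numeric characters (such as a %-sign in Progress) may be imported as text and would need converting afterwards.

```matlab
% Made-up miniature of the file layout in the question (hypothetical values)
sample = sprintf( [ 'some unnecessary text\n' ...
                    'adh_run_func :: tfinal = 12513600.000000\n' ...
                    'Physics\tTime\tdt\n' ...
                    'HYD_1\t11908800\t5\n' ...
                    'HYD_1\t11908805\t5\n' ...
                    '100%% COMPLETE\n' ] );
fname = [tempname, '.txt'];
fid = fopen( fname, 'w' );  fwrite( fid, sample );  fclose( fid );

% Step 1: find the line numbers of the header and of the closing line
lines       = regexp( fileread(fname), '\r?\n', 'split' );
header_line = find( startsWith(lines, 'Physics'), 1 );
end_line    = find( ~cellfun('isempty', regexp(lines, '^\d+[% ]+COMPLETE', 'once')), 1 );

% Step 2: tell readtable() where the variable names and the data are
opts = detectImportOptions( fname, 'FileType', 'text', 'Delimiter', '\t' ...
                          , 'NumHeaderLines', header_line-1 );
opts.VariableNamesLine = header_line;
opts.DataLines         = [header_line+1, end_line-1];
T = readtable( fname, opts );

delete( fname );    % remove the temporary sample file
```

For the real file, fname would simply be 'AR_20base_201214_adh.txt' and the lines that create and delete the sample would be dropped. The line numbers are still computed from the content rather than hard-coded, which addresses the robustness concern, at the cost of reading the file twice.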

I choose textscan().

%%  Read file
chr = fileread('AR_20base_201214_adh.txt');
%%  Remove meta data
%   Using 'adh_run_func :: tfinal' feels more robust than using the line number 
pos = regexp( chr, '^adh_run_func :: tfinal', 'once', 'lineanchors' );
chr(1:pos-1) = [];  % remove until the first line that begins with 'adh_run_func :: tfinal' 
%%  Remove the summary lines at the end
pos = regexp( chr, '^\d+[\% ]+COMPLETE', 'once', 'lineanchors' );
chr(pos:end) = [];
%%  Get the column headers
txt = regexp( chr, '^Physics.+?$', 'match', 'once', 'lineanchors' );
column_headers = strsplit( txt, '\t' );
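%%  Parse the data rows: one text column (Physics) followed by numeric columns
cac = textscan( chr, ['%s',repmat('%f',1,numel(column_headers)-1)]  ...
            ,   'Headerlines'   , 2     ...     two header lines remain after the meta-data is removed
            ,   'Delimiter'     , '\t'  ...
            ,   'Whitespace'    , ' %'  ...     ignore the %-sign in Progress
            ,   'CollectOutput' , true  );
Physics = cac{1};    % text column, returned as a cell array
matrix  = cac{2};    % numeric columns, collected into one double matrix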
whos Physics matrix column_headers
  Name                    Size              Bytes  Class     Attributes

  Physics             13487x1             1537454  cell                
  column_headers          1x17               2026  cell                
  matrix              13487x16            1726336  double              
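
Since the question asks for a dataframe-like result, the outputs above could also be combined into a single table. The sketch below is an addition, not part of the original answer. It repeats the answer's steps on a made-up miniature of the file content (hypothetical values, only five columns) so that it runs without the attachment, then builds the table. The names var_names and T are illustrative, and matlab.lang.makeValidName is used on the assumption that headers containing spaces should become valid MATLAB variable names.

```matlab
% Made-up miniature of the file content (tab-delimited, hypothetical values)
chr = sprintf( [ 'some unnecessary text\n' ...
                 'adh_run_func :: tfinal = 12513600.000000\n' ...
                 'Physics\tTime\tdt\tProgress\tNonlinear Iteration\n' ...
                 'HYD_1\t11908800\t5\t0\t1\n' ...
                 'HYD_1\t11908800\t5\t0\t2\n' ...
                 '100%% COMPLETE\n' ] );

% Same steps as in the answer: trim meta-data and summary, get headers, parse
pos = regexp( chr, '^adh_run_func :: tfinal', 'once', 'lineanchors' );
chr(1:pos-1) = [];
pos = regexp( chr, '^\d+[\% ]+COMPLETE', 'once', 'lineanchors' );
chr(pos:end) = [];
txt = regexp( chr, '^Physics.+?$', 'match', 'once', 'lineanchors' );
column_headers = strsplit( txt, '\t' );
cac = textscan( chr, ['%s',repmat('%f',1,numel(column_headers)-1)]  ...
            ,   'Headerlines'   , 2     ...
            ,   'Delimiter'     , '\t'  ...
            ,   'Whitespace'    , ' %'  ...
            ,   'CollectOutput' , true  );

% Combine the text column and the numeric matrix into one table
var_names = matlab.lang.makeValidName( column_headers );    % removes the spaces from the headers
T = [ cell2table(  cac{1}, 'VariableNames', var_names(1)     ) ...
    , array2table( cac{2}, 'VariableNames', var_names(2:end) ) ];
```

With the real file, only the last two statements are needed after the answer's code, with cac{1} and cac{2} being the Physics and matrix variables. If two headers were to collapse to the same valid name, matlab.lang.makeUniqueStrings could be applied to var_names first.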

Shuvashish Roy 2021-5-21

Per Isakson,

I got your answer. It worked! You are awesome. Thanks a lot to both you and Stephen for your valuable time.
