How can i regexp this sample text?
7 次查看(过去 30 天)
显示 更早的评论
im looking to grab all the data after "CHANNEL" into a cell array such that C{1,1} = Shell Angle and so on
and looking to grab all the data after "UNITS" into a separate cell array such that D{1:1} = deg and so on
Im having trouble with the non alphanumeric stuff such as @, (, ), etc
Any help greatly appreciated, Thanks!
0 个评论
回答(2 个)
Stephen23
2019-3-28
编辑:Stephen23
2019-3-28
Here is a regexp-based solution which does NOT use hard-coded variable names (it automatically detects the variable names), and does NOT magically create variables in the MATLAB workspace... read this to know why dynamically naming variables is a bad way to write code:
The code is quite straighforward: the first regexp identifies the variable name and square bracket pairs, the second regexp identifies the list values themselves:
str = fileread('sample2.txt');
tkn = regexp(str,'(\w+)\s+=\s+\[([^\]]+)\]','tokens');
tkn = vertcat(tkn{:});
hdr = tkn(:,1);
tkn = regexp(tkn(:,2),'''([^'']+)''','tokens');
tkn = cellfun(@(c)[c{:}],tkn,'uni',0);
And checking:
>> hdr{2}
ans =
UNIT
>> tkn{2}
ans =
Columns 1 through 8
'deg' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)'
Columns 9 through 16
'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)'
Columns 17 through 20
'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)'
You can access the data directly from the cell arrays, or use them to define a convenient structure:
>> S = cell2struct(tkn,hdr,1);
>> S.UNIT
ans =
Columns 1 through 8
'deg' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)'
Columns 9 through 16
'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)'
Columns 17 through 20
'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)'
>> S.CHANNEL
ans =
Columns 1 through 8
'Shell Angle' [1x30 char] [1x30 char] [1x30 char] [1x30 char] [1x30 char] [1x30 char] [1x30 char]
Columns 9 through 16
[1x30 char] [1x30 char] [1x31 char] [1x31 char] [1x31 char] [1x31 char] [1x31 char] [1x31 char]
Columns 17 through 20
[1x31 char] [1x31 char] [1x31 char] [1x31 char]
2 个评论
Guillaume
2019-3-28
编辑:Guillaume
2019-3-28
>> map = containers.Map(hdr, tkn);
>> map('UNIT')
ans =
1×20 cell array
Columns 1 through 3
{'deg'} {'lbf/(ft.s)'} {'lbf/(ft.s)'}
Columns 4 through 6
{'lbf/(ft.s)'} {'lbf/(ft.s)'} {'lbf/(ft.s)'}
Columns 7 through 9
{'lbf/(ft.s)'} {'lbf/(ft.s)'} {'lbf/(ft.s)'}
Columns 10 through 12
{'lbf/(ft.s)'} {'lbf/(ft.s)'} {'lbf/(ft.s)'}
Columns 13 through 15
{'lbf/(ft.s)'} {'lbf/(ft.s)'} {'lbf/(ft.s)'}
Columns 16 through 18
{'lbf/(ft.s)'} {'lbf/(ft.s)'} {'lbf/(ft.s)'}
Columns 19 through 20
{'lbf/(ft.s)'} {'lbf/(ft.s)'}
Whatever you do, do not use values coming out of a text file to name actual variables. Creating variables dynamically is a sure way to have bugs, you may well overwrite an existing variable if for some reason the input file is slightly different.
Adam Danz
2019-3-21
编辑:Adam Danz
2019-3-28
In this solution, the file is first read into matlab as a char array. Then I segregate the "CHANNEL" variable and convert it into Matlab syntax so it can be converted to a cell array. The same is then done with the "UNIT" variable.
file = 'C:\Users\name\Documents\MATLAB\sample2.txt';
c = fileread(file);
% segregate "Channel" array
chanTxt = regexp(c,'CHANNEL.+?\]', 'match');
% Replace ampersands with elipses
chanTxt = strrep(chanTxt, '&', '...');
% Replace square bracket with curly brackets
chanTxt = strrep(chanTxt, '[', '{');
chanTxt = strrep(chanTxt, ']', '}');
% Execute string to convert it to cell array of strings
eval(chanTxt{:}); % Now you have a variable named "CHANNEL"
% CHANNEL' % (first 5 lines)
% ans =
% 20×1 cell array
% {'Shell Angle' }
% {'Surface Heat Flux @ GridLine=1' }
% {'Surface Heat Flux @ GridLine=2' }
% {'Surface Heat Flux @ GridLine=3' }
% {'Surface Heat Flux @ GridLine=4' }
% {'Surface Heat Flux @ GridLine=5' }
% Now go through the same process for "UNIT"
unitTxt = regexp(c,'UNIT.+?\]', 'match');
unitTxt = strrep(unitTxt, '&', '...');
unitTxt = strrep(unitTxt, '[', '{');
unitTxt = strrep(unitTxt, ']', '}');
eval(unitTxt{:}); % Now you have a variable named "UNIT"
A great work station to develop regular expressions: https://regex101.com/
0 个评论
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Multirate Signal Processing 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!