How to replace parts of the text using regexprep

Hi all,
I have a very large text file, which I imported as a char vector and which has the following pattern:
text=NEW SCOMPONENT /JAFHB0099
DESC 'FLANGE F7805 SLIP-ON 10K FF 900A'
GTYP FLAN
PARA 900 1095 56 FBIA BWD 13
END
NEW SCOMPONENT /JAFHB00aa
DESC 'FLANGE F7805 SLIP-ON 10K FF 1100A'
GTYP FLAN
PARA 1100 1225 18 FBIA BWD 14
END
I want to replace the parts after DESC and PARA with some of my own values, e.g.
nDESC = {'Description 1'; 'Description 2'} ;
nPARA = {'1500 15300 20 FBDIA BWD 14' ; '1600 1623 20 FBDIA SWM 13'} ;
For the above, I have developed the following code, with the help of the MATLAB community, who pointed me to the regexp function:
% Match the values that follow the word PARA on the PARA line and replace them with nPARA
newtext = regexprep(text, 'PARA\s+(\d+\.?\d*\s+\d+\.?\d*\s+\d+\.?\d*\s+\w*\s+\w*\s+\d+\.?\d*)', nPARA) ;
I follow a similar logic for the case of the DESC.
However 2 problems occur.
1. The parentheses around the \s+ and \w* parts do not seem to limit the match to the tokens that follow the word PARA: instead of 1100 1225 18 FBIA BWD 14, I get PARA 1100 1225 18 FBIA BWD 14. I can work around this, so it is not a big deal, but I must be missing something.
2. The result does not replace each matched line with the corresponding string in the cell array. Instead, every matched line is replaced with the last cell of the cell array.
1 Comment
Stephen23 on 2018-11-5
Edited: Stephen23 on 2018-11-5
1) regexprep does not replace the tokens, it replaces the matched substring. So what you see is the expected behavior. Tokens are entirely optional, and can be used in dynamic operations. But the entire matched substring is replaced. You could resolve this using a look-around operation.
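The difference between the matched substring and a token can be seen in a small sketch. The input line and the replacement text 'NEW' are made up for illustration and are not part of the original file; the look-behind uses a single \s so that its length is fixed.

```matlab
% Illustrative one-line input (an assumption, not the poster's file).
txt = 'PARA 900 1095 56 FBIA BWD 13';

% 1) The whole match is replaced, so the keyword PARA is lost as well.
a = regexprep(txt, 'PARA\s+(.*)', 'NEW');

% 2) Capture the keyword as a token and put it back with $1.
b = regexprep(txt, '(PARA\s+).*', '$1NEW');

% 3) Look-behind: "PARA" plus one whitespace character must precede the
%    match, but it is not part of the match, so it survives the replacement.
c = regexprep(txt, '(?<=PARA\s).*', 'NEW');

disp(a)
disp(b)
disp(c)
```

Variants 2 and 3 should both keep the keyword. The look-behind form avoids having to reference a token in the replacement text.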


Accepted Answer

Stephen23 on 2018-11-5
Edited: Stephen23 on 2018-11-5
This uses a slightly different approach using regexp and strncmp, which is based on the assumption that each command is on its own line. You did not supply an example file so I created one (attached).
>> nDESC = {'Description 1'; 'Description 2'};
>> nPARA = {'1500 15300 20 FBDIA BWD 14' ; '1600 1623 20 FBDIA SWM 13'};
>> S = fileread('temp1.txt')
S = NEW SCOMPONENT /JAFHB0099
DESC 'FLANGE F7805 SLIP-ON 10K FF 900A'
GTYP FLAN
PARA 900 1095 56 FBIA BWD 13
END
NEW SCOMPONENT /JAFHB00aa
DESC 'FLANGE F7805 SLIP-ON 10K FF 1100A'
GTYP FLAN
PARA 1100 1225 18 FBIA BWD 14
END
>> C = regexp(S,'^\s*([A-Z]+\s*)(.*)$','tokens','dotexceptnewline','lineanchors');
>> C = vertcat(C{:}).';
>> C(2,strncmp(C(1,:),'DESC',4)) = nDESC;
>> C(2,strncmp(C(1,:),'PARA',4)) = nPARA;
>> Z = sprintf('\n%s%s',C{:});
>> Z = Z(2:end)
Z = NEW SCOMPONENT /JAFHB0099
DESC Description 1
GTYP FLAN
PARA 1500 15300 20 FBDIA BWD 14
END
NEW SCOMPONENT /JAFHB00aa
DESC Description 2
GTYP FLAN
PARA 1600 1623 20 FBDIA SWM 13
END
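For comparison, a possible alternative that pairs the n-th match with the n-th cell is to split the text at the matches and interleave the new values. This is only a sketch: the short inline text and the values in nPARA are made up for illustration, and it assumes that the number of PARA lines equals the number of replacement strings.

```matlab
% Illustrative input built inline (an assumption, not the attached file).
S = sprintf('NEW A\nPARA 1 2\nEND\nNEW B\nPARA 3 4\nEND');
nPARA = {'10 20'; '30 40'};

% Split S at the text that follows "PARA " up to the end of each line.
% The look-behind keeps the keyword itself inside the split pieces.
parts = regexp(S, '(?<=PARA\s).*', 'split', 'dotexceptnewline');

% One more piece than replacements is expected, otherwise the counts differ.
assert(numel(parts) == numel(nPARA) + 1, 'Number of PARA lines and values differ.');

% Interleave: piece 1, value 1, piece 2, value 2, ..., last piece.
pieces = [parts; [nPARA(:).', {''}]];
Z = [pieces{:}];
disp(Z)
```

The same pattern with DESC in place of PARA would handle the description lines. If the file uses Windows line endings, the carriage return at the end of each line would also be part of the match and would need to be handled separately.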
