Problem using regexp to extract certain lines

I am trying to extract lines that begin with /keylog/midi from a file that looks like:
/keylog/midi 144 60 72 1.001300
/keylog/oscp 144 60 0.006736 1.030209
/keylog/oscp 144 60 0.000000 2.852801
/keylog/oscp 144 60 0.000000 2.869148
/keylog/midi 144 60 0 2.870843
And I need to separate the two lines from each other. The code I have right now is:
Fid=fopen('keyData.txt');
myLines=fgetl(Fid);
while ~feof(Fid)
myLines=fgetl(Fid);
on=regexp(myLines,'(/keylog/midi)(\s\d+\s\d+\s[^0]\s\d+(.)\d+)', 'match'); % on must equal the line whose fourth field is greater than 0
off=regexp(myLines, '(/keylog/midi)\s\d+\s\d+\s[0]+\s\d+(.)\d+', 'match'); % off must equal the line whose fourth field is 0
end
When I run the code, I get:
off='/keylog/midi 144 60 0 2.870843'
on={}
What is wrong with my regexp for on?

Accepted Answer

Simon 2013-11-22
Hi!
I would suggest another approach. It is (in my opinion) easier to debug, because you can track your steps easily.
% open and read file
fid = fopen(FileName);
FC = textscan(fid, '%s', 'delimiter', '\n', 'whitespace', '');
fclose(fid);
FC = FC{1};
% remove blanks on start and end of line
FC = strtrim(FC);
% find all lines with '/keylog/midi'
FCmidi = FC(strncmp('/keylog/midi', FC, 12));
% read each remaining line, skipping the string '/keylog/midi'
C = cellfun(@(x) sscanf(x, '%*s %d %d %d %f'), FCmidi, 'UniformOutput', false);
% format C: each column is one log entry
C = [C{:}];
% on/off flag is in row 3
onoff = C(3, :);
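
As a possible follow-up, the columns of C can be split into "on" and "off" entries with a logical mask on row 3. This is only a minimal sketch: it assumes the flag is zero for "off" and positive for "on", as in the sample data. The matrix C is typed in by hand so the snippet runs on its own, and the names isOn, onEntries, and offEntries are illustrative, not part of the answer above.

```matlab
% C as produced above for the two sample /keylog/midi lines:
% each column is one log entry, rows are the four numeric fields
C = [144 144; 60 60; 72 0; 1.001300 2.870843];

% logical mask: true where the third field (on/off flag) is nonzero
isOn = C(3, :) > 0;

% split the entries; each result keeps one column per log entry
onEntries  = C(:, isOn);
offEntries = C(:, ~isOn);
```

Because the mask is computed once, the same indexing also works when the file contains many /keylog/midi lines.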

More Answers (2)

Walter Roberson 2013-11-22
[^0]\s should be [^0]\d*\s in order to eat the digits after the first non-zero one (e.g., [^0] will match the 7, and then the \d* will match the 2).
In the off expression, [0]+ will match one or more 0's. Will there ever be multiple 0's there, such as 00 ? If not then it would make more sense to get rid of the + and change the [0] to just 0
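
The two suggestions can be checked on single lines before running them on the whole file. The sketch below is an illustration, not the exact expressions from the question: it uses [1-9] in place of [^0] (an assumption of this sketch, since [^0] also matches non-digit characters) and \. for the literal period. The variable names are made up for the example.

```matlab
% sample /keylog/midi lines from the question
onLine  = '/keylog/midi 144 60 72 1.001300';
offLine = '/keylog/midi 144 60 0 2.870843';

% [1-9]\d* : a nonzero leading digit followed by any further digits
% \.       : a literal period (an unescaped . matches any character)
onPat  = '/keylog/midi\s\d+\s\d+\s[1-9]\d*\s\d+\.\d+';
offPat = '/keylog/midi\s\d+\s\d+\s0\s\d+\.\d+';

% 'once' returns a char vector, which is empty when there is no match
onHit  = regexp(onLine,  onPat,  'match', 'once');
offHit = regexp(offLine, offPat, 'match', 'once');
onMiss = regexp(offLine, onPat,  'match', 'once');
```

Here onHit and offHit hold the full input lines, while onMiss is empty because the third number of the "off" line is 0.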
  3 Comments
Walter Roberson 2013-11-22
Edited: Walter Roberson 2013-11-22
Note: use \. to indicate a literal period.
Could you show your modified regular expressions?
John 2013-11-22
This is the code after I added the \d* after the [^0], and changed the (.) to \. The first one still does not work.
on=regexp(myLines,'(/keylog/midi)\s\d+\s\d+\s[^0]\d*\s\d+\.\d+', 'match');
off=regexp(myLines, '(/keylog/midi)\s\d+\s\d+\s0\s\d+\.\d+', 'match');
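
Judging only from the loop posted in the question, one possible reason the first expression still returns nothing is the loop itself rather than the pattern: the line read by the first fgetl is overwritten inside the loop before it is tested, and on and off are overwritten on every iteration. The sketch below tests each line before reading the next and collects the matches. The temporary file and the shortened sample data are assumptions added so the example is self-contained.

```matlab
% write sample data to a temporary file so the sketch runs on its own
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', '/keylog/midi 144 60 72 1.001300', ...
    '/keylog/oscp 144 60 0.006736 1.030209', ...
    '/keylog/midi 144 60 0 2.870843');
fclose(fid);

on = {}; off = {};              % accumulate matches instead of overwriting
fid = fopen(fname);
myLine = fgetl(fid);            % read the first line ...
while ischar(myLine)            % ... and test it before reading the next
    m = regexp(myLine, '/keylog/midi\s\d+\s\d+\s[^0]\d*\s\d+\.\d+', 'match', 'once');
    if ~isempty(m), on{end+1} = m; end
    m = regexp(myLine, '/keylog/midi\s\d+\s\d+\s0\s\d+\.\d+', 'match', 'once');
    if ~isempty(m), off{end+1} = m; end
    myLine = fgetl(fid);        % fgetl returns -1 (not char) at end of file
end
fclose(fid);
delete(fname);
```

With this structure, on and off each end up as a cell array holding one matching line from the sample data.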



Yamoussa SANOGO 2019-10-15
Edited: Yamoussa SANOGO 2019-10-15
Hi there, I know this question has been around for a while, but I would like to add my suggestion in case somebody else has the same problem. My approach would be a simple lookbehind like this:
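% build the sample text line by line (a quoted literal cannot span several lines)
text = sprintf(['/keylog/midi 144 60 72 1.001300\n', ...
                '/keylog/oscp 144 60 0.006736 1.030209\n', ...
                '/keylog/oscp 144 60 0.000000 2.852801\n', ...
                '/keylog/oscp 144 60 0.000000 2.869148\n', ...
                '/keylog/midi 144 60 0 2.870843']);
% (?<=...) is a lookbehind: match only what follows '/keylog/midi'
rule = '(?<=\/keylog\/midi)(\s*\d*\s*\d*\s*\d*\.?\d*\s*\d*\.?\d*)' ;
matched_data = regexp(text,rule, 'match');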
Then concatenate the matched data into a single string:
matched_data = [matched_data{:}];
This approach can be generalized by making the prefix ('/oscp' or '/midi') a string variable and concatenating it with the rest of the matching rule.
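
A minimal sketch of that generalization might look like the following. It assumes the full prefix (for example '/keylog/oscp') is stored in a variable named prefix, which is a name chosen for this illustration. It uses regexptranslate to escape the prefix before it is placed inside the lookbehind, and the sample text is shortened to three lines.

```matlab
text = sprintf('%s\n', '/keylog/midi 144 60 72 1.001300', ...
    '/keylog/oscp 144 60 0.006736 1.030209', ...
    '/keylog/midi 144 60 0 2.870843');

prefix = '/keylog/oscp';        % swap in '/keylog/midi' for the other lines
% regexptranslate escapes any regexp metacharacters in the prefix
rule = ['(?<=' regexptranslate('escape', prefix) ')' ...
        '(\s*\d*\s*\d*\s*\d*\.?\d*\s*\d*\.?\d*)'];
matched_data = regexp(text, rule, 'match');
```

Each match keeps the leading space that follows the prefix, so strtrim(matched_data) can be applied if only the numbers are wanted.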
