Regexp: different behavior for the same type of expressions

I want to capture everything except the tokens:
name, '_' and '.iv2'
name =
A7122
>> filename'
ans =
'A7122_60a.iv2'
'A7122_60b.iv2'
'A7122_70a.iv2'
'A7122_70b.iv2'
'A7122_90a.iv2'
'A7122_90b.iv2'
'A7122_100.iv2'
'A7122_120.iv2'
I do this:
str=regexp(filename, [ '(?:[^' name '_])\w*(?:[^.iv2])' ], 'match');
And the answer is the following!
>> celldisp(str)
str{1}{1} =
60a
str{2}{1} =
60b
str{3}{1} =
0a
str{4}{1} =
0b
str{5}{1} =
90a
str{6}{1} =
90b
str{7}{1} =
00
str{8} =
{}
I don't understand why regexp behaves differently for, e.g., filename(1) and filename(3).

2 Comments

Your mistake is that [^A7122_] doesn't stand for "any six-character expression that is not 'A7122_'", but for "any single character that is not in the set of literals {'A', '7', '1', '2', '_'}". The same applies to [^.iv2]. This is why, for example, 70a and 70b are not matched in full: the leading 7 is excluded by the first class, so you get 0a and 0b instead.
Thank you very much Cedric! Very nice explanation!
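
A short sketch of the point above, run on four of the file names from the question (the variable names first and str are only illustrative). It shows which character the first class actually picks up, and then what the original pattern returns when the 'once' option is used to get one match per name:

```matlab
% A bracket expression [^...] matches ONE character that is not in the set;
% it does not mean "anything except this substring".
name = 'A7122';
filename = {'A7122_60a.iv2', 'A7122_70a.iv2', 'A7122_100.iv2', 'A7122_120.iv2'};

% First character of each name that is not one of A, 7, 1, 2 or _
first = regexp(filename, ['[^' name '_]'], 'match', 'once');
% first is {'6', '0', '0', '0'}: the 7 in '70a' and the 1 in '100' are skipped

% The pattern from the question, unchanged
str = regexp(filename, ['(?:[^' name '_])\w*(?:[^.iv2])'], 'match', 'once');
% str{1} is '60a', str{2} is '0a', str{3} is '00', str{4} is empty
```

The last name gives no match because, once 1 and 2 are excluded, only the single 0 is left before the dot, while the pattern needs at least two characters: one for each bracket expression.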


Accepted Answer

You can use:
regexprep( filename, [ name, '_|\.iv2'], '')
Also
regexp( filename, [ name, '_(\w*)\.iv2'], 'tokens')
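
The 'tokens' variant returns nested cells, so a flattening step is usually needed. Below is a minimal sketch of one way to do that, assuming every file name follows the name_xxx.iv2 layout. The ^ and $ anchors, the escaped dot, the call to regexptranslate and the variable names prefix, pat, tok and out are additions for illustration, not part of the answer:

```matlab
name = 'A7122';
filename = {'A7122_60a.iv2', 'A7122_70b.iv2', 'A7122_100.iv2', 'A7122_120.iv2'};

% Escape the prefix in case it ever contains regex metacharacters
prefix = regexptranslate('escape', name);

% ^ and $ anchor the pattern to the whole file name; \. is a literal dot
pat = ['^' prefix '_(\w*)\.iv2$'];

% With 'once', each element of tok is a 1-by-1 cell holding the captured text
tok = regexp(filename, pat, 'tokens', 'once');

% Flatten into a plain cell array of character vectors
out = [tok{:}];
% out is {'60a', '70b', '100', '120'}
```

Capturing the part you want with a group avoids describing what to exclude, which is where the bracket expressions in the question went wrong.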

1 Comment

Thank you Vishal! Much more compact syntax! However, do you know why regexp has this behavior?


More Answers (1)

str = {'A7122_60a.iv2'
'A7122_60b.iv2'
'A7122_70a.iv2'
'A7122_70b.iv2'
'A7122_90a.iv2'
'A7122_90b.iv2'
'A7122_100.iv2'
'A7122_120.iv2'}
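% Match the word characters between the underscore and the dot
cl = regexp(str,'(?<=_)\w*(?=\.)','match');
% Each cell of cl is a 1-by-1 cell; stack them into one 8-by-1 cell array
out = cat(1,cl{:});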

2 Comments

Thank you Andrei! Do you know why this is happening?
Please read the regexp documentation, in particular the "Regular expression" section on:
- Metacharacters ( \w ),
- Quantifiers ( expr* ),
- Lookaround Assertions ( expr(?=test) and (?<=test)expr )
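
As a rough illustration of the three constructs listed above, here is each one applied to a single file name from the question (the variable names s, ch, word and mid are only illustrative):

```matlab
s = 'A7122_60a.iv2';

% Metacharacter: \w matches one letter, digit or underscore
ch = regexp(s, '\w', 'match', 'once');                  % 'A'

% Quantifier: \w* extends the match over as many such characters as possible
word = regexp(s, '\w*', 'match', 'once');               % 'A7122_60a'

% Lookarounds: (?<=_) requires an underscore just before the match and
% (?=\.) a dot just after it; neither becomes part of the match
mid = regexp(s, '(?<=_)\w*(?=\.)', 'match', 'once');    % '60a'
```

Note that \w also matches the underscore, which is why the second call runs straight through 'A7122_60a' and only the lookbehind pins the start of the match to the character after '_'.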
