Regular expressions: is nesting of group operators supported?

Regarding Grouping Operators, the function regexp doesn't behave the way I expected.
>> cac = regexp( 'ABC', '((A)(B(C)))', 'tokens' );
>> cac{1}(:)
ans =
1×1 cell array
{'ABC'}
regexp returns one token without any complaint about my parentheses. I expected four: 'ABC', 'A', 'BC' and 'C', because most other flavors of regular expressions would have returned four tokens. The Java documentation on Capturing Groups, for example, says:
In the expression ((A)(B(C))), for example, there are four such groups:
  1. ((A)(B(C)))
  2. (A)
  3. (B(C))
  4. (C)
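For comparison, here is a minimal Java sketch of the numbering quoted above. The class name `NestedGroups` and the helper `captures` are made up for illustration; only `java.util.regex.Pattern` and `Matcher` come from the standard library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NestedGroups {
    // Hypothetical helper: returns the text captured by groups 1..groupCount()
    // for the first match, or an empty list if the pattern does not match.
    static List<String> captures(String pattern, String input) {
        Matcher m = Pattern.compile(pattern).matcher(input);
        List<String> out = new ArrayList<>();
        if (m.find()) {
            // Groups are numbered by the position of their opening parenthesis.
            for (int i = 1; i <= m.groupCount(); i++) {
                out.add(m.group(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [ABC, A, BC, C]
        System.out.println(captures("((A)(B(C)))", "ABC"));
    }
}
```

This is the four-token result I expected from regexp as well.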
A couple more examples:
>> cac = regexp( 'ABC', '(A)(B(C))', 'tokens' );
>> cac{1}(:)
ans =
2×1 cell array
{'A' }
{'BC'}
>> cac = regexprep( 'ABC', '((A)(B(C)))', ' --- $1 ---' )
cac =
' --- ABC ---'
>> cac = regexprep( 'ABC', '((A)(B(C)))', ' --- $2 ---' )
cac =
' --- $2 ---'
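To show what I expected from the replacement, the same substitution can be sketched in Java, where `$2` refers to the nested group `(A)`. The class name `NestedGroupReplace` and the helper `wrapGroup` are hypothetical; `String.replaceAll` is the standard method.

```java
class NestedGroupReplace {
    // Hypothetical helper: replaces each match of the pattern with the text of
    // the given capture group, wrapped in the same decoration used above.
    static String wrapGroup(String input, String pattern, int group) {
        // In a Java replacement string, $n is a back-reference to group n.
        return input.replaceAll(pattern, " --- $" + group + " ---");
    }

    public static void main(String[] args) {
        System.out.println(wrapGroup("ABC", "((A)(B(C)))", 1)); // " --- ABC ---"
        System.out.println(wrapGroup("ABC", "((A)(B(C)))", 2)); // " --- A ---"
    }
}
```

In Java the second call yields the inner token, whereas regexprep above leaves `$2` in the output unchanged.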
The documentation on Grouping Operators is terse and has only a few examples. I've found nothing on "groups inside groups".
Question:
Is nesting of group operators supported, or am I a victim of wishful thinking?

Accepted Answer

Sean de Wolski 2018-12-17
Edited: Sean de Wolski 2018-12-17
The note below "Named Token Operator" in the documentation indicates that only the outermost set of parentheses is captured, hence the single token 'ABC'.
Note
If an expression has nested parentheses, MATLAB® captures tokens that correspond to the outermost set of parentheses. For example, given the search pattern '(and(y|rew))', MATLAB creates a token for 'andrew' but not for 'y' or 'rew'.
With string arrays, I'd recommend just creating an array of acceptable tokens:
cac = regexp("ABC", ["(ABC)","(A)", "(BC)", "(C)"], 'tokens' );
cac{:}
ans =
1×1 cell array
{["ABC"]}
ans =
1×1 cell array
{["A"]}
ans =
1×1 cell array
{["BC"]}
ans =
1×1 cell array
{["C"]}

Release

R2018b
