Regular expression: is nesting of group operators supported?

Regarding grouping operators, the function regexp doesn't behave the way I expected.
>> cac = regexp( 'ABC', '((A)(B(C)))', 'tokens' );
>> cac{1}(:)
ans =
1×1 cell array
{'ABC'}
regexp returns one token without any complaint about my parentheses. I expected four: 'ABC', 'A', 'BC' and 'C', because most other flavors of regular expressions would have returned four tokens. The Java documentation (Java: Capturing Groups), for example, states:
In the expression ((A)(B(C))), for example, there are four such groups:
  1. ((A)(B(C)))
  2. (A)
  3. (B(C))
  4. (C)
A couple more examples:
>> cac = regexp( 'ABC', '(A)(B(C))', 'tokens' );
>> cac{1}(:)
ans =
2×1 cell array
{'A' }
{'BC'}
>> cac = regexprep( 'ABC', '((A)(B(C)))', ' --- $1 ---' )
cac =
' --- ABC ---'
>> cac = regexprep( 'ABC', '((A)(B(C)))', ' --- $2 ---' )
cac =
' --- $2 ---'
The documentation on Grouping Operators is terse and has only a few examples. I've found nothing on "groups inside groups".
Question:
Is nesting of group operators supported, or am I a victim of wishful thinking?

Accepted Answer

Sean de Wolski 2018-12-17
Edited: Sean de Wolski 2018-12-17
The Note below "Named Token Operator" in the documentation indicates that only the outermost set of parentheses is captured, hence the single token 'ABC'.
Note
If an expression has nested parentheses, MATLAB® captures tokens that correspond to the outermost set of parentheses. For example, given the search pattern '(and(y|rew))', MATLAB creates a token for 'andrew' but not for 'y' or 'rew'.
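The behavior described in the Note can be checked with a minimal sketch like the one below. It reuses the pattern from the Note; the input 'andrew' and the variable names outerTok and innerTok are chosen here only for illustration.

```matlab
% Nested parentheses: only the outermost group yields a token.
outerTok = regexp('andrew', '(and(y|rew))', 'tokens', 'once');
disp(outerTok)   % a single token, 'andrew'

% Without the outer parentheses, the former inner group is now the
% outermost one, so it is captured instead.
innerTok = regexp('andrew', 'and(y|rew)', 'tokens', 'once');
disp(innerTok)   % a single token, 'rew'
```

With the 'once' option, regexp returns the tokens of the first match directly as a cell array rather than as a cell array of cell arrays.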
With string arrays, I'd recommend just creating an array of acceptable tokens:
cac = regexp("ABC", ["(ABC)","(A)", "(BC)", "(C)"], 'tokens' );
cac{:}
ans =
1×1 cell array
{["ABC"]}
ans =
1×1 cell array
{["A"]}
ans =
1×1 cell array
{["BC"]}
ans =
1×1 cell array
{["C"]}
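If the inner tokens must come from the same match rather than from independent patterns, one possible workaround is to peel the nesting one level at a time, re-scanning each captured token with the next inner pattern. This is only a sketch under the assumption that the nesting structure is known in advance; the variable names (str, t1, t2, t3, allTokens) are illustrative.

```matlab
str = 'ABC';

% Pass 1: the outermost group captures the whole match.
t1 = regexp(str, '((A)(B(C)))', 'tokens', 'once');    % {'ABC'}

% Pass 2: with the outer parentheses removed, (A) and (B(C)) are outermost.
t2 = regexp(t1{1}, '(A)(B(C))', 'tokens', 'once');    % {'A', 'BC'}

% Pass 3: re-scan the second token to reach the innermost group.
t3 = regexp(t2{2}, 'B(C)', 'tokens', 'once');         % {'C'}

% Collect the tokens in the order other regex flavors would number them.
allTokens = [t1, t2, t3];                             % {'ABC', 'A', 'BC', 'C'}
```

This costs one regexp call per nesting level, so it suits shallow, fixed patterns better than deeply nested ones.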

Release

R2018b
