Efficient way to use regexp and contains and matching

Hello!
I am using the following code to match two cell array contents. But it takes way too long to process. Can anybody suggest a better way to code the same thing?
validVar = {};
str = {'abc=='; 'bac[2]'; 'fuh[2]'; 'fgh'};
list = {'abc(1)'; 'cde'; 'fgh'};
for x = 1:numel(list)
    expression = sprintf('%s..', list{x});
    for y = 1:numel(str)
        if ~isempty(regexp(str{y}, expression, 'match')) || contains(str{y}, list{x})
            validVar = [validVar; list{x}];
        end
    end
end
Also, the result gives me validVar with only 'fgh', but I want 'abc(1)' in the list as well, since it is a part of str{1}. Is there a way to match the entries of list against str so that an entry is listed under validVar even if only part of it matches part of a str entry?
  3 comments
Tiasa Ghosh 2018-9-7
My mistake. I have edited the question with examples and more details. I am using regexp as one of the search patterns so that the expression can include extra trailing characters, and the condition turns true even if only part of the string matches part of the input.
Greg 2018-9-8
First, your performance is suffering because you're looping over both lists. Both regexp and contains will work on a vector with a scalar, removing one of the loops.
Second, if you know how to use regexp expertly (this is not a dig - regexp is extremely powerful but even more difficult to master), you could do all of your checking with one expression.
Finally, your requirements are very ill-formulated. What in the world does "... even if a part of list entry matches part of str entry" mean? A part of bac[2] matches a part of cde - the c character. I'm sure this isn't what you had in mind, so you need more explicit rules for validVar.
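
As a rough illustration of the first point, here is a minimal sketch that drops the inner loop by passing the whole cell array to contains. It uses the sample data from the question; the variable name isValid is made up for this example, and it only covers the plain substring test, not the regexp part.

```matlab
% Sample data from the question.
str  = {'abc=='; 'bac[2]'; 'fuh[2]'; 'fgh'};
list = {'abc(1)'; 'cde'; 'fgh'};

% Logical mask (hypothetical name), preallocated instead of growing
% validVar inside the loop.
isValid = false(size(list));
for x = 1:numel(list)
    % contains accepts the whole cell array and returns one logical per
    % element of str, so no inner loop over str is needed.
    isValid(x) = any(contains(str, list{x}));
end
validVar = list(isValid);
```

With this data only 'fgh' ends up in validVar, and each list entry is added at most once even if it matches several entries of str.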

Accepted Answer

Guillaume 2018-9-10
Right now, your double loops can be simplified to:
validVar = {};
for x = 1:numel(list)
    % match either list{x} or list{x} followed by any two characters
    if ~isempty(cell2mat(regexp(str, [list{x}, '(..)?'], 'once')))
        validVar = [validVar; list{x}];
    end
end
which should be a lot faster.
However, I don't think that's exactly what you want. I agree with Greg that it's not really clear what exactly you want. We need a very clear rule for which patterns the regex should and should not match.
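
One detail that may matter once the rule is pinned down: when list{x} is used directly as a regular expression, the parentheses in 'abc(1)' act as a grouping operator, so the pattern matches the text 'abc1' rather than the literal 'abc(1)'. The sketch below assumes the entries of list are meant as literal text and escapes them with regexptranslate first; isValid and startIdx are names made up for this example.

```matlab
% Sample data from the question.
str  = {'abc=='; 'bac[2]'; 'fuh[2]'; 'fgh'};
list = {'abc(1)'; 'cde'; 'fgh'};

isValid = false(size(list));   % hypothetical helper variable
for x = 1:numel(list)
    % Escape regex metacharacters so that '(' and ')' match literally.
    expression = regexptranslate('escape', list{x});
    % On a cell array, regexp returns a cell of start indices,
    % one per entry of str (empty where there is no match).
    startIdx = regexp(str, expression, 'once');
    isValid(x) = any(~cellfun(@isempty, startIdx));
end
validVar = list(isValid);
```

Even with literal matching, 'abc(1)' does not occur inside 'abc==', so it is still not listed; getting it into validVar needs the explicit matching rule asked for above.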
  1 comment
Tiasa Ghosh 2018-9-12
I realised where the question went vague, and somehow the question isn't relevant to me now. Anyhow, thank you for your time and answer. Hope it helps somebody else in the future. :)
