Efficient identification of quoted substrings in a substring
I'm looking for help from the MATLAB string parsing experts out there to help me come up with a computationally efficient way (perhaps using regular expressions) to identify the quoted parts of a string from arbitrary sources of text (e.g. a journal article). The method needs to work regardless of whether the quoted substrings are contained inside single or double quotes. Further, the text may contain apostrophes either inside or outside of the quoted substrings.
For example, in this sentence:
Sally said "It's a wonderful life" when she heard Molly's sister proclaim "It's a great day".
I would like to identify "It's a wonderful life" and "It's a great day", while in this text:
The attributes of the <table> tag were 'width=80%' and 'align="center"'.
I would like to identify 'width=80%' and 'align="center"'. [Note: I purposely did not show the example sentences above as MATLAB code, but only as free text, so as not to confuse my question with how to properly capture such sentences in a MATLAB variable.]
I recognize these examples are a bit pedantic, but since the code won't be able to control the source of the text it is searching, it needs to be robust across these cases.
I have been able to do this with a "brute force" linear search through the text, but it's pretty inefficient and complex. I am not enough of a regexp expert to figure out a way to do this with regular expressions, but I've seen such experts come up with pretty elegant and efficient solutions to such problems. Hence, I was hoping my case might be tantalizing to one of those experts in this community. Thanks for any suggestions.
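One possible regexp-based approach, offered only as a sketch and not as a complete solution: match double-quoted spans directly, and accept a single quote as a delimiter only when it is not glued to a word character on the relevant side. The pattern below and the variable names (pat, s1, s2, q1, q2) are illustrative assumptions. It relies on the heuristic that an apostrophe inside a word (It's, Molly's) always has a letter or digit immediately before it, so it cannot open a quote, and a closing single quote is never immediately followed by a letter or digit. Text that breaks this heuristic, such as plural possessives (the dogs' bowl) or leading apostrophes ('tis), may be matched incorrectly.

```matlab
% Sketch: extract quoted substrings while treating in-word apostrophes as text.
% Assumption: an opening single quote is never preceded by a word character,
% and a closing single quote is never followed by one.
%   "[^"]*"              a double-quoted span
%   (?<!\w)'.*?'(?!\w)   a single-quoted span (shortest one that closes cleanly)
pat = '"[^"]*"|(?<!\w)''.*?''(?!\w)';

% The two example sentences from the question, written as char vectors
% (a literal single quote is doubled inside a MATLAB char vector).
s1 = 'Sally said "It''s a wonderful life" when she heard Molly''s sister proclaim "It''s a great day".';
s2 = 'The attributes of the <table> tag were ''width=80%'' and ''align="center"''.';

q1 = regexp(s1, pat, 'match');   % cell array of the quoted substrings in s1
q2 = regexp(s2, pat, 'match');   % cell array of the quoted substrings in s2

fprintf('%s\n', q1{:});
fprintf('%s\n', q2{:});
```

For the two example sentences this should return "It's a wonderful life" and "It's a great day" for the first, and 'width=80%' and 'align="center"' for the second, with the delimiting quotes included. Because regexp scans left to right and consumes each match, a double quote nested inside a single-quoted span (or the reverse) is treated as ordinary text. Adding the 'start' and 'end' options to the regexp call also returns the positions of each match, if those are needed.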