Strfind to contain complex pattern

I have written the following program to split a text into sentences. I want to keep only the sentences that begin with a vowel.
a = 'John played volleyball. I love Anna. Are you there?'
b = strfind(a,'.')
y{1} = a(1:b(1))
for i=2:length(b)
    y{i} = a(b(i-1)+1:b(i))
end
y{i+1} = a((b(i)+1):end)

Accepted Answer

Cedric, 2018-1-3
Edited: Cedric, 2018-1-3
Here is an approach:
str = 'John played volleyball. I love Anna. Are you there?' ;
buffer = strtrim( strsplit( str, {'.', '?', '!'} )) ;
for k = numel( buffer ) : -1 : 1
    if isempty( buffer{k} ) || ~any( upper(buffer{k}(1)) == 'AEIOUY' )
        buffer(k) = [] ;
    end
end
Running this, you get:
>> buffer
buffer =
1×2 cell array
{'I love Anna'} {'Are you there'}
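For comparison, here is a minimal sketch of the same filter written with a logical mask instead of deleting entries inside the loop. The variable name keep is only illustrative; the vowel set (including Y) and the delimiters are the ones used above.

```matlab
str = 'John played volleyball. I love Anna. Are you there?';
buffer = strtrim(strsplit(str, {'.', '?', '!'}));

% Mark the entries to keep first, then index once. Nothing is deleted
% while iterating, so the loop can run in either direction.
keep = false(size(buffer));
for k = 1:numel(buffer)
    keep(k) = ~isempty(buffer{k}) && any(upper(buffer{k}(1)) == 'AEIOUY');
end
buffer = buffer(keep);
```

With the example string this should leave the same two sentences as the loop above.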
If any of this is unclear, evaluate the following expressions one at a time and analyze their output:
strsplit( 'John played volleyball. I love Anna. Are you there?', {'.', '?', '!'} )
buffer = strtrim( strsplit( 'John played volleyball. I love Anna. Are you there?', {'.', '?', '!'} ))
upper(buffer{1}(1))
upper(buffer{1}(1)) == 'AEIOUY'
any( upper(buffer{1}(1)) == 'AEIOUY' )
upper(buffer{2}(1))
upper(buffer{2}(1)) == 'AEIOUY'
any( upper(buffer{2}(1)) == 'AEIOUY' )
PS: this could also be done using regular expressions, but more classic approaches (like the above) should be understood first.
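As a rough idea of what a regular-expression version could look like (a sketch under my own assumptions, not necessarily the pattern the PS has in mind): split on the sentence-ending characters with REGEXP, then test the first character with a case-insensitive pattern. The names parts, startsWithVowel and result are illustrative.

```matlab
str = 'John played volleyball. I love Anna. Are you there?';

% Split on sentence-ending punctuation, then trim surrounding spaces.
parts = strtrim(regexp(str, '[.?!]', 'split'));

% Case-insensitive test for a leading vowel (Y included, as above).
% With 'once', each cell holds the match start index, or [] if no match.
hits = regexpi(parts, '^[aeiouy]', 'once');
startsWithVowel = ~cellfun(@isempty, hits);

result = parts(startsWithVowel);
```

Empty pieces (such as the one after the final '?') never match the pattern, so they are dropped without a separate ISEMPTY test.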
PS2: Your first attempt is actually good. You were essentially implementing STRSPLIT yourself, and it works to some extent; it was good training, but it would have to be extended to support multiple delimiters. If you run the expressions above, you will see that STRSPLIT does the split and outputs a cell array. You may need another approach if you have to keep the delimiters (.?!), though (see PS3). STRSPLIT leaves leading and trailing spaces in place, which is a problem when you want to test the first character, hence the call to STRTRIM. The output of STRTRIM is a cell array of "clean" sentences. The loop then removes every entry that does not start with a vowel. It has to run backwards from the end of the array: deleting an element shifts all later elements down by one, so a forward loop would skip entries and eventually index past the end of the array.
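To illustrate the "multiple delimiters" remark, here is one possible way to extend the original STRFIND-based attempt. It is only a sketch: ISMEMBER replaces STRFIND so that '.', '?' and '!' are all found, the name starts is illustrative, and any text after the last delimiter is ignored.

```matlab
a = 'John played volleyball. I love Anna. Are you there?';

% Positions of every sentence-ending character, not only '.'.
b = find(ismember(a, '.?!'));

% Each sentence starts right after the previous delimiter.
starts = [1, b(1:end-1) + 1];

y = cell(1, numel(b));
for i = 1:numel(b)
    % Slice up to and including the delimiter, then trim the spaces.
    y{i} = strtrim(a(starts(i):b(i)));
end
```

Unlike STRSPLIT, this keeps the punctuation at the end of each sentence.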
PS3: If you need to keep the punctuation, replace the call to STRSPLIT with a call to REGEXP, as follows:
buffer = strtrim( regexp( str, '.*?[\.!\?]', 'match' )) ;
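Combined with the vowel test, the punctuation-preserving variant might look like the sketch below. Every match ends with a punctuation character, so no entry is empty after STRTRIM and the ISEMPTY check is not needed here; firstChars and keep are illustrative names.

```matlab
str = 'John played volleyball. I love Anna. Are you there?';

% Each match is a sentence including its final '.', '!' or '?'.
buffer = strtrim(regexp(str, '.*?[\.!\?]', 'match'));

% First character of each sentence, upper-cased, as a char row vector.
firstChars = cellfun(@(s) upper(s(1)), buffer);

% Keep the sentences whose first character is a vowel (Y included).
keep = ismember(firstChars, 'AEIOUY');
buffer = buffer(keep);
```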
