How to take the first character if a name starts with a specified string

Hi,
I have the cell array below:
STA.TH01E2RT00
STA.J01E2RT00R99
STA.TB01E2RT00
Deafult.TY04KT00
Cond.TH09K12E3T00R15
STA.TH01E2RT00
PAR_STA.TH01E2RT00
I want to extract the first English letter if the name starts with "STA.". If that first character is "T", then take the next character instead.
My desired output:
H
J
B
Deafult.TY04KT00
Cond.TH09K12E3T00R15
H
PAR_STA.TH01E2RT00
  2 comments
John D'Errico 2018-4-27
Time to learn tools like strtok. Test for the substring 'STA' using isequal. It's just a couple of tests, really. Write a function that will extract the desired result from any string. Then apply it to each cell of the array, so cellfun would suffice.
Why not try it?
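A minimal sketch of the strtok and isequal tests described in this comment, assuming '.' is always the separator after the prefix. A plain loop stands in for the function-plus-cellfun arrangement so that the example fits in one script; the variable names are illustrative only.

```matlab
C = {'STA.TH01E2RT00'; 'STA.J01E2RT00R99'; 'Deafult.TY04KT00'; 'PAR_STA.TH01E2RT00'};
Z = C;                                  % names that fail the test stay unchanged
for k = 1:numel(C)
    [tok, rest] = strtok(C{k}, '.');    % tok: text before the first '.', rest: from that '.' onward
    if isequal(tok, 'STA') && numel(rest) > 1
        if rest(2) == 'T' && numel(rest) > 2
            Z{k} = rest(3);             % skip the leading 'T'
        else
            Z{k} = rest(2);             % first character after 'STA.'
        end
    end
end
```

Note that strtok ignores leading delimiters, so a name such as '.STA.X' would also pass this test; whether that matters depends on the data.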
Wick 2018-5-1
Edited: Wick 2018-5-1
You can also use sscanf in a nested manner. Something like:
Y = sscanf(string_to_check,'STA.%s')
Y will be empty if the string doesn't start with STA.
If Y(1) is 'T', then run another sscanf on the rest of the string and grab the first letter after that.
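A rough sketch of the sscanf idea applied to a whole cell array. Plain indexing is used in place of the second, nested sscanf call, and the loop and variable names are assumptions made for illustration.

```matlab
C = {'STA.TH01E2RT00'; 'STA.J01E2RT00R99'; 'Deafult.TY04KT00'};
Z = C;                                % names without the prefix stay unchanged
for k = 1:numel(C)
    Y = sscanf(C{k}, 'STA.%s');       % text after 'STA.', or empty if the prefix is absent
    if isempty(Y)
        continue
    end
    if Y(1) == 'T' && numel(Y) > 1
        Z{k} = Y(2);                  % first character is 'T': take the next one
    else
        Z{k} = Y(1);                  % otherwise take the first character
    end
end
```

Because %s stops at whitespace, this sketch assumes the names contain no spaces.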


Answers (3)

Stephen23 2018-5-2
Edited: Stephen23 2018-5-2
One solution:
C = {...
'STA.TH01E2RT00'
'STA.J01E2RT00R99'
'STA.TB01E2RT00'
'Deafult.TY04KT00'
'Cond.TH09K12E3T00R15'
'STA.TH01E2RT00'
'PAR_STA.TH01E2RT00'}
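rgx = '(?<=^STA\.)T?\w';              % after a leading 'STA.': an optional T plus one word character
D = regexp(C,rgx,'match','once');     % one match per cell, empty where nothing matched
Z = C;                                % non-matching names are kept unchanged
X = cellfun('isempty',D);             % true where nothing matched
Z(~X) = cellfun(@(s){s(end)},D(~X))   % keep only the last matched character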
Giving:
>> Z{:}
ans = H
ans = J
ans = B
ans = Deafult.TY04KT00
ans = Cond.TH09K12E3T00R15
ans = H
ans = PAR_STA.TH01E2RT00
There might be a way to get rid of the second cellfun, with the right regular expression.
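One possible way to do this, sketched here as an alternative and not as the answer's own method, is to let regexprep replace the whole name with a captured token. The pattern below is an assumption built from the rule in the question: an optional T after 'STA.', then one word character to keep.

```matlab
C = {'STA.TH01E2RT00'
     'STA.J01E2RT00R99'
     'STA.TB01E2RT00'
     'Deafult.TY04KT00'
     'Cond.TH09K12E3T00R15'
     'STA.TH01E2RT00'
     'PAR_STA.TH01E2RT00'};
% '^STA\.' anchors the prefix, 'T?' skips an optional T, '(\w)' captures the
% character to keep, and '.*$' consumes the rest so the whole name is replaced.
Z = regexprep(C, '^STA\.T?(\w).*$', '$1');
```

Names that do not match the pattern are returned unchanged, so no cellfun call or logical indexing is needed.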
  2 comments
Bohan Liu 2018-5-3
Hey dude, I have a question regarding the option 'once'. It is pretty cool that with 'once' specified additionally, the output is a cell array of character vectors, yet all matches are found. In my case (without 'once'), however, the output is a nested cell array. So in essence the 'once' is not affecting the number of matches but the data type of the output? I am wondering why.
Stephen23 2018-5-4
Edited: Stephen23 2018-5-4
"So in essence the 'once' is not affecting the number of matches but the data type of the output? I am wondering why."
Actually the option 'once' does change how many times the regular expression is matched, exactly as the regexp documentation describes: "Match the expression as many times as possible (default), or only once.". You can test this quite easily:
>> C = regexp('ab12cd34ed','\d+','match','once')
C = 12
>> C = regexp('ab12cd34ed','\d+','match');
>> C{:}
ans = 12
ans = 34
The default option is 'all': because multiple matches are possible (but not necessary) with this option all outputs are nested into cell arrays. The exact nesting of the output also depends on whether the input was a string array, a char vector or a cell array, so to know what to expect you should read the documentation.
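A small sketch of the nesting difference for a cell array input; the sample strings are made up for illustration.

```matlab
C = {'ab12cd34', 'xyz'};
A = regexp(C, '\d+', 'match');           % default 'all': each cell holds a cell array of matches
B = regexp(C, '\d+', 'match', 'once');   % 'once': each cell holds a single char vector
nestedAll  = iscell(A{1});               % true, A{1} is itself a cell array with two matches
nestedOnce = iscell(B{1});               % false, B{1} is the char vector '12'
```

For the second string, which contains no digits, both A{2} and B{2} are empty.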



Bohan Liu 2018-5-2
Hi, using the regexp function with lookaround assertions is a feasible workaround for loops. Funnily enough, there is a glitch in the expression: the program should look behind any char vector that starts with STA., with either one T appended or none at all (specified by ?). I am kind of baffled why the result includes T as well.
exp='(?<=^STA\.T?)\w';
CellArrayNew=regexp(CellArray,exp,'match');
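One possible reading of this behaviour, offered as a sketch and not as a definitive diagnosis: because T? may match nothing, the lookbehind is already satisfied directly after 'STA.', so \w is free to match the T itself. Moving the optional T out of the lookbehind and keeping only the last matched character avoids that. The variable names below are illustrative.

```matlab
str = 'STA.TH01E2RT00';
% Optional T inside the lookbehind: both 'STA.' and 'STA.T' count as valid left context.
M1 = regexp(str, '(?<=^STA\.T?)\w', 'match');
% Optional T inside the match itself, then keep only the last matched character.
M2 = regexp(str, '(?<=^STA\.)T?\w', 'match', 'once');
lastChar = M2(end);
```

Here lastChar is the letter after the T, while M1 still contains the T as one of its matches.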
  2 comments
Wick 2018-5-2
I think the OP wanted the entire string if there was a failed match. As slick as it is to extract all the strings in one shot, I'm not sure your bit of code would do that.
Bohan Liu 2018-5-2
Yep, right. I will try to work on this and update the answer if I come up with something new. Thanks for the info.



Jan 2018-5-2
While regexp is very powerful, strncmp is usually faster.
C = {'STA.TH01E2RT00'
'STA.J01E2RT00R99'
'STA.TB01E2RT00'
'Deafult.TY04KT00'
'Cond.TH09K12E3T00R15'
'STA.TH01E2RT00'
'PAR_STA.TH01E2RT00'};
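index = strncmp(C, 'STA.T', 5);   % names starting with 'STA.T'
C(index) = cellfun(@(c) c(6), C(index), 'UniformOutput', false);   % take the character after the T
index = strncmp(C, 'STA.', 4);    % remaining names starting with 'STA.'
C(index) = cellfun(@(c) c(5), C(index), 'UniformOutput', false)    % take the character after 'STA.'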
  2 comments
Mekala balaji 2018-5-4
Sir,
It works, but it is not certain that the letter is at the 5th or 4th position (index = strncmp(C, 'STA.T', 5); index = strncmp(C, 'STA.', 4);). The only certainty is that the name starts with "STA.". Take the first two letters after "STA."; if the first one is T, then take the second letter, otherwise take the first one.
Jan 2018-5-4
I don't understand this comment. My code does:
  1. If the string starts with 'STA.' followed by a 'T', the 6th character is chosen. In other words: when it starts with 'STA.T'.
  2. If it starts with 'STA.' otherwise, the 5th character is chosen.
This reproduces the example output from your original question. Do you want something else?

