How to read first 5 characters of string

I have the following table and want to process it as described below.
My input table:
Production.K02L260E4Y00 5
Production.K07L890E4Y00 8
Production.K09L780E4Y00U09 12
Heating.Cr_Cleaning 34
Heating.Cr_67kev_Cleaning 23
Top_OFF.IT4_Production.K57LUY00 56
Production.K09L180E9Y00U09 11
Top_OFF.IT4_Production.K57LUY01_Try4 5
DummyTest.G078K342E5T6 8
1. If an entry in the first column of my table starts with "Production.", remove "Production." and store the first 4 characters of the remaining string.
2. If the entry does not start with "Production.", store it as it is.
For example:
The first row of the first column is "Production.K02L260E4Y00", which starts with "Production.". It is first reduced to K02L260E4Y00 (by removing "Production."), and then the first column is stored as "K02L". Row 4 is "Heating.Cr_Cleaning", which does not start with "Production.", so it is kept unchanged in the output.
My output should be as below:
K02L 5
K07L 8
K09L 12
Heating.Cr_Cleaning 34
Heating.Cr_67kev_Cleaning 23
Top_OFF.IT4_K57LUY00 56
K09L 11
Top_OFF.IT4_K57LUY01_Try4 5
DummyTest.G078K342E5T6 8
Many thanks,

Answers (2)

Sebastian Castro 2016-3-12
Edited: Sebastian Castro 2016-3-12
I would do that with regexprep (REGular EXPression REPlace), which is not the easiest thing to pick up. Definitely look at the documentation for that function if you're not familiar.
My command for a single entry would be the following:
>> regexprep('Production.K07L890E4Y00','(Production.)(\w{4})(\w*)','$2')
ans = K07L
What this is saying is I'm partitioning my search text into three expressions:
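  • (Production.) -- the text "Production" followed by any one character, since an unescaped . matches any character; write \. to match only a literal period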
  • (\w{4}) -- four alphanumeric/underscore characters, i.e., the ones I want to keep
  • (\w*) -- the remaining alphanumeric/underscore characters, i.e., the ones I can throw away
By choosing $2 as the third argument, I'm saying replace all the strings that match my above criteria with the second expression, which is those four alphanumeric characters after the period (\w{4}).
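To see what each group captures, a small sketch can pull the tokens out with regexp before doing the replacement. The variable names str, pat, tok, and out are illustrative only and do not come from the answer above.

```matlab
% Illustrative sketch; str, pat, tok and out are hypothetical names.
str = 'Production.K07L890E4Y00';
pat = '(Production.)(\w{4})(\w*)';

% 'tokens' returns the text captured by each parenthesized group;
% 'once' limits the result to the first match (a 1-by-3 cell array here).
tok = regexp(str, pat, 'tokens', 'once');

% $2 in the replacement refers to the second group, i.e. tok{2}.
out = regexprep(str, pat, '$2');
```

Here tok{1}, tok{2}, and tok{3} hold the three pieces described above, and out is the same as tok{2} because the pattern matches the whole string.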
Doing this on a table variable of character arrays would then be:
>> regexprep(myTable.VarName1,'(Production.)(\w{4})(\w*)','$2')
ans =
'K02L'
'K07L'
'K09L'
'Heating.Cr_Cleaning'
'Heating.Cr_67kev_Cleaning'
'Top_OFF.IT4_K57L'
'K09L'
'Top_OFF.IT4_K57L'
'DummyTest.G078K342E5T6'
Now the problem with this logic is that Top_OFF.IT4_Production.K57LUY01_Try4 became Top_OFF.IT4_K57L and didn't retain the _Try4 at the end.
You can get around this with more complex logic that stops the replacement at any underscores which may be there. I will defer to the documentation to explain how this one works :)
>> regexprep(myTable.VarName1,'(Production.)(\w{4})([a-zA-Z0-9]*)(\_?)','$2$4')
ans =
'K02L'
'K07L'
'K09L'
'Heating.Cr_Cleaning'
'Heating.Cr_67kev_Cleaning'
'Top_OFF.IT4_K57L'
'K09L'
'Top_OFF.IT4_K57L_Try4'
'DummyTest.G078K342E5T6'
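As a rough breakdown of the second pattern: the third group uses [a-zA-Z0-9]* instead of \w*, so it stops at the first underscore, and the fourth group optionally captures that underscore so that $2$4 can put it back. The sketch below applies it to one entry; str, pat, tok, and out are illustrative names, not part of the original answer.

```matlab
% Illustrative sketch; str, pat, tok and out are hypothetical names.
str = 'Top_OFF.IT4_Production.K57LUY01_Try4';
pat = '(Production.)(\w{4})([a-zA-Z0-9]*)(\_?)';

% Group 2 holds the four characters to keep, group 3 the characters
% to discard (letters and digits only), group 4 the underscore if present.
tok = regexp(str, pat, 'tokens', 'once');

% Only the matched part is replaced; the text before "Production."
% and the text after the underscore are left untouched.
out = regexprep(str, pat, '$2$4');
```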
- Sebastian
  2 comments
Mekala balaji 2016-3-13
Sir,
It works, but I only want to treat entries that start with "Production.". I do not want to act on entries where "Production." appears elsewhere. For example, "Top_OFF.IT4_Production.K57LUY01_Try4" contains "Production.", but not at the beginning, so I take no action on it and keep it as is (Top_OFF.IT4_Production.K57LUY01_Try4). Please help.
Sebastian Castro 2016-3-14
Yes, you can add a "beginning of word" operator ( \< ) in front of your expression so it's looking only at words that start with "Production".
>> regexprep(myTable.VarName1,'\<(Production.)(\w{4})(\w*)','$2')
Again, this is all in the documentation, so please give it a look if you want to customize the logic.
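Note that \< matches at the start of any word, so an entry such as "Heating.Production.K57L..." would presumably still be rewritten, because the period before "Production" is not a word character. If the prefix must sit at the very start of the entry, one option is the ^ anchor together with an escaped period. The following is a sketch under that assumption; names and out are illustrative variables, and the cell array stands in for myTable.VarName1.

```matlab
% Illustrative sketch; names stands in for myTable.VarName1.
names = {'Production.K02L260E4Y00'
         'Heating.Cr_Cleaning'
         'Top_OFF.IT4_Production.K57LUY01_Try4'
         'DummyTest.G078K342E5T6'};

% ^ anchors the match to the start of each entry, \. matches a literal
% period, and the single group keeps the four characters that follow.
out = regexprep(names, '^Production\.(\w{4})\w*', '$1');
```

Entries that do not begin with "Production." do not match at all, so regexprep returns them unchanged.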
- Sebastian



KSSV 2016-3-12
clc; clear all
% Read the spreadsheet: text column and numeric column of Sheet1
data = importdata('5CodeInput.xlsx') ;
data1 = data.textdata.Sheet1 ;
data2 = data.data.Sheet1 ;
str = 'Production.' ;
n = length(str) ;
iwant = cell(length(data1),1) ;
for i = 1:length(data1)
    temp = data1{i} ;
    if strncmp(temp,str,n)
        % Starts with "Production.": keep the 4 characters after the prefix
        iwant{i} = temp(n+1:n+4) ;
    else
        % Otherwise keep the entry unchanged
        iwant{i} = temp ;
    end
end
% Print each stored name next to its numeric value
for i = 1:length(data1)
    fprintf('%s : %s\n',iwant{i},num2str(data2(i))) ;
end
Hope the above logic helps you. You may further refine it accordingly.
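The prefix check and the substring extraction can also be done without an explicit loop. The sketch below is one possible way, assuming every "Production." entry has at least four characters after the prefix; names, prefix, mask, and out are illustrative variables, and the cell array stands in for the text column read from the spreadsheet.

```matlab
% Illustrative sketch; names stands in for the text column of the input.
names = {'Production.K02L260E4Y00'
         'Heating.Cr_Cleaning'
         'Top_OFF.IT4_Production.K57LUY00'
         'Production.K09L180E9Y00U09'};

prefix = 'Production.';
n = numel(prefix);

% strncmp accepts a cell array and returns one logical value per entry:
% true where the first n characters equal the prefix.
mask = strncmp(names, prefix, n);

% Start from a copy, then overwrite only the entries with the prefix,
% keeping the four characters that follow it.
out = names;
out(mask) = cellfun(@(s) s(n+1:n+4), names(mask), 'UniformOutput', false);
```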
  1 comment
Mekala balaji 2016-3-19
Sir/Madam, the above code helps me and it works, but I face one more issue. The input is like this:
Production.K02L260E4Y00 5
Production.K07L890E4Y00 8
Production.K09L780E4Y00U09 12
Production.ZA09L780E4Y0U09 34
Heating.Cr_Cleaning 34
Heating.Cr_67kev_Cleaning 23
Top_OFF.IT4_Production.K57LUY00 56
Production.K09L180E9Y00U09 11
Top_OFF.IT4_Production.K57LUY01_Try4 5
DummyTest.G078K342E5T6 8
Production.ZF09L780E4Y0U09 4
Production.VA09L780E4Y0U09 4
After removing "Production.", the first two characters are sometimes both letters, and sometimes a letter followed by a digit. If the first two are both letters, I want to keep both. If the first is a letter and the second is a digit, I want to keep only the first one (the letter) and ignore the digit.
Currently I get:
K0
K2
K1
ZA
Heating.Cr_Cleaning
Heating.Cr_67kev_Cleaning
Top_OFF.IT4_Production.K57LUY00
K4
Top_OFF.IT4_Production.K57LUY01_Try4
DummyTest.G078K342E5T6
ZF
VA
But I want my output as below:
K
K
K
ZA
Heating.Cr_Cleaning
Heating.Cr_67kev_Cleaning
Top_OFF.IT4_Production.K57LUY00
K
Top_OFF.IT4_Production.K57LUY01_Try4
DummyTest.G078K342E5T6
ZF
VA
Please help. How can I do this? Many thanks in advance.
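One possible approach to this follow-up, offered as a sketch rather than a tested solution for the real spreadsheet, is a single regexprep call that keeps one or two leading letters after the prefix. The variables names and out are illustrative, and the cell array simply repeats the input listed above.

```matlab
% Illustrative sketch; names repeats the first column of the input above.
names = {'Production.K02L260E4Y00'
         'Production.K07L890E4Y00'
         'Production.K09L780E4Y00U09'
         'Production.ZA09L780E4Y0U09'
         'Heating.Cr_Cleaning'
         'Heating.Cr_67kev_Cleaning'
         'Top_OFF.IT4_Production.K57LUY00'
         'Production.K09L180E9Y00U09'
         'Top_OFF.IT4_Production.K57LUY01_Try4'
         'DummyTest.G078K342E5T6'
         'Production.ZF09L780E4Y0U09'
         'Production.VA09L780E4Y0U09'};

% ^Production\.   the prefix, only at the start of the entry
% ([A-Za-z]{1,2}) one or two letters; a digit in second place ends the group
% .*              the rest of the entry, which is discarded
out = regexprep(names, '^Production\.([A-Za-z]{1,2}).*', '$1');
```

Entries that do not start with "Production." do not match and are returned unchanged. An entry whose first character after the prefix is not a letter would also be left unchanged, which may or may not be what is wanted.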
