How to read first 5 characters of string

Question

Mekala balaji 2016-3-12

https://ww2.mathworks.cn/matlabcentral/answers/272973-how-to-read-first-5-characters-of-string

Commented: Mekala balaji 2016-3-19

5CodeInput.xlsx

I have the following table and want to process its first column as described below.

My input table:

Production.K02L260E4Y00                  5
Production.K07L890E4Y00                  8
Production.K09L780E4Y00U09               12
Heating.Cr_Cleaning                      34
Heating.Cr_67kev_Cleaning                23
Top_OFF.IT4_Production.K57LUY00          56
Production.K09L180E9Y00U09               11
Top_OFF.IT4_Production.K57LUY01_Try4     5
DummyTest.G078K342E5T6                   8

1. If an entry in the first column starts with "Production.", remove "Production." and store the first 4 characters of the remaining string.

2. If the entry does not start with "Production.", store it as it is.

For example:

The first row of the first column is "Production.K02L260E4Y00", which starts with "Production.". It is first reduced to K02L260E4Y00 (by removing "Production."), and the stored output is "K02L". Row 4 is "Heating.Cr_Cleaning", which does not start with "Production.", so it is kept unchanged in the output.

My output should be as below:

K02L                          5
K07L                          8
K09L                          12
Heating.Cr_Cleaning           34
Heating.Cr_67kev_Cleaning     23
Top_OFF.IT4_K57LUY00          56
K09L                          11
Top_OFF.IT4_K57LUY01_Try4     5
DummyTest.G078K342E5T6        8

Many thanks,

Answer 1

Sebastian Castro 2016-3-12

https://ww2.mathworks.cn/matlabcentral/answers/272973-how-to-read-first-5-characters-of-string#answer_213312

Edited: Sebastian Castro 2016-3-12

I would do that with regexprep (REGular EXPression REPlace), which is not the easiest thing to pick up. Definitely look at the documentation for that function if you're not familiar.

My command for a single entry would be the following:

>> regexprep('Production.K07L890E4Y00','(Production.)(\w{4})(\w*)','$2')
ans = K07L

What this is saying is that I'm partitioning my search pattern into three groups (tokens):

(Production.) -- the text "Production" plus one more character. Note that an unescaped . matches any character; write Production\. to require a literal period.
(\w{4}) -- four alphanumeric/underscore characters, i.e., the ones I want to keep
(\w*) -- the remaining alphanumeric/underscore characters, i.e., the ones I can throw away

By choosing $2 as the third argument, I'm saying: replace every piece of text that matches the whole pattern with the second group, which is those four characters after the period (\w{4}).
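
To see what each group captures, the tokens can be pulled out directly with regexp instead of being replaced. This is only a small sketch on the single entry above; the variable names str and tok are made up for illustration, and the period is escaped (\.) so that it matches a literal dot.

```matlab
% Capture the three groups from one entry instead of replacing them.
str = 'Production.K07L890E4Y00';
tok = regexp(str, '(Production\.)(\w{4})(\w*)', 'tokens', 'once');

disp(tok{1})   % text matched by the first group
disp(tok{2})   % the four characters that $2 refers to
disp(tok{3})   % the remainder that the replacement discards
```

With 'once', tok is a 1-by-3 cell array, one cell per group, in the order the groups appear in the pattern.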

Doing this on a table variable of character arrays would then be:

>> regexprep(myTable.VarName1,'(Production.)(\w{4})(\w*)','$2')
ans = 
      'K02L'
      'K07L'
      'K09L'
      'Heating.Cr_Cleaning'
      'Heating.Cr_67kev_Cleaning'
      'Top_OFF.IT4_K57L'
      'K09L'
      'Top_OFF.IT4_K57L'
      'DummyTest.G078K342E5T6'

Now the problem with this logic is that Top_OFF.IT4_Production.K57LUY01_Try4 became Top_OFF.IT4_K57L and didn't retain the _Try4 at the end.

You can get around this with more complex logic that stops the replacement at any underscores which may be there. I will defer to the documentation to explain how this one works :)

>> regexprep(myTable.VarName1,'(Production.)(\w{4})([a-zA-Z0-9]*)(\_?)','$2$4')
ans = 
      'K02L'
      'K07L'
      'K09L'
      'Heating.Cr_Cleaning'
      'Heating.Cr_67kev_Cleaning'
      'Top_OFF.IT4_K57L'
      'K09L'
      'Top_OFF.IT4_K57L_Try4'
      'DummyTest.G078K342E5T6'
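
As a rough walk-through of that pattern on the one entry that changed: [a-zA-Z0-9]* excludes the underscore, so it stops in front of _Try4, and the optional fourth group picks up the underscore so that $2$4 can put it back. The variable names below are illustrative only.

```matlab
str = 'Top_OFF.IT4_Production.K57LUY01_Try4';
pat = '(Production.)(\w{4})([a-zA-Z0-9]*)(\_?)';

% 'match' is the span that gets replaced; 'tokens' are the four groups.
[m, tok] = regexp(str, pat, 'match', 'tokens', 'once');
disp(m)      % Production.K57LUY01_
disp(tok)    % 'Production.'  'K57L'  'UY01'  '_'

% Replacing the match with groups 2 and 4 leaves the rest of the text alone.
out = regexprep(str, pat, '$2$4');
disp(out)    % Top_OFF.IT4_K57L_Try4
```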

- Sebastian

Mekala balaji 2016-3-13

Sir,

It works, but I only want to treat the entries that start with "Production.". If "Production." appears anywhere else in the string, I do not want to change it. For example, "Top_OFF.IT4_Production.K57LUY01_Try4" contains "Production.", but not at the beginning, so no action should be taken and it should be retained as it is (Top_OFF.IT4_Production.K57LUY01_Try4). Please help.

Sebastian Castro 2016-3-14

Yes, you can add a "beginning of word" operator ( \< ) in front of your expression so it's looking only at words that start with "Production".

>> regexprep(myTable.VarName1,'\<(Production.)(\w{4})(\w*)','$2')

Again, this is all in the documentation, so please give it a look if you want to customize the logic.
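
One caveat worth illustrating: \< marks the start of a word, not the start of the whole string. It skips Top_OFF.IT4_Production... only because the underscore in front of Production counts as a word character. If the requirement is strictly "the string begins with Production.", anchoring with ^ may be the safer choice. The sketch below compares the two; the entry Test.Production.K02L260E4Y00 is a made-up example and not part of the original data.

```matlab
names = {'Production.K02L260E4Y00'
         'Top_OFF.IT4_Production.K57LUY01_Try4'
         'Test.Production.K02L260E4Y00'};   % hypothetical entry

% Word-start anchor: also fires after the '.' in the hypothetical entry.
wordStart = regexprep(names, '\<(Production\.)(\w{4})(\w*)', '$2');

% String-start anchor: only entries that begin with "Production." change.
strStart = regexprep(names, '^Production\.(\w{4}).*$', '$1');

disp(wordStart)
disp(strStart)
```

On these three entries, both versions turn the first into K02L and leave the second alone, but only the ^ version leaves the hypothetical third entry unchanged.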

- Sebastian

Answer 2

KSSV 2016-3-12

https://ww2.mathworks.cn/matlabcentral/answers/272973-how-to-read-first-5-characters-of-string#answer_213311

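clc; clear
% Read the attached spreadsheet: text goes to textdata, numbers to data.
data = importdata('5CodeInput.xlsx') ;
data1 = data.textdata.Sheet1 ;   % first column: names
data2 = data.data.Sheet1 ;       % second column: counts
prefix = 'Production.' ;
n = length(prefix) ;
iwant = cell(size(data1)) ;      % preallocate the output
for i = 1:length(data1)
    temp = data1{i} ;
    % strncmp is true only when the prefix sits at the very beginning
    if strncmp(temp,prefix,n)
        iwant{i} = temp(n+1:n+4) ;   % the 4 characters after the prefix
    else
        iwant{i} = temp ;            % everything else is kept unchanged
    end
end
for i = 1:length(data1)
    fprintf('%s : %s\n',iwant{i},num2str(data2(i))) ;
end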

Hope the above logic helps you. You may further refine it accordingly.
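
The same logic can also be sketched without a loop. The version below assumes the first column is already available as a cell array of character vectors (typed in by hand here from three rows of the question, not read from the spreadsheet) and uses strncmp so that only a leading "Production." counts. The names isProd and n are illustrative.

```matlab
names  = {'Production.K02L260E4Y00'
          'Heating.Cr_Cleaning'
          'Top_OFF.IT4_Production.K57LUY00'};
prefix = 'Production.';
n      = numel(prefix);

% Logical mask of the entries that begin with the prefix.
isProd = strncmp(names, prefix, n);

% Keep the four characters after the prefix; copy everything else as is.
iwant         = names;
iwant(isProd) = cellfun(@(s) s(n+1:n+4), names(isProd), 'UniformOutput', false);
disp(iwant)
```

Like the loop, this assumes every matching entry has at least four characters after the prefix.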

Mekala balaji 2016-3-19

Sir/Madam, the above code helps me and it works. But I face one more issue. The input is like this:

Production.K02L260E4Y00                  5
Production.K07L890E4Y00                  8
Production.K09L780E4Y00U09               12
Production.ZA09L780E4Y0U09               34
Heating.Cr_Cleaning                      34
Heating.Cr_67kev_Cleaning                23
Top_OFF.IT4_Production.K57LUY00          56
Production.K09L180E9Y00U09               11
Top_OFF.IT4_Production.K57LUY01_Try4     5
DummyTest.G078K342E5T6                   8
Production.ZF09L780E4Y0U09               4
Production.VA09L780E4Y0U09               4

After breaking at "Production.", the first two characters are sometimes both letters, and sometimes a letter followed by a digit. If the first two are both letters, I want to keep both. If the first is a letter and the second is a digit, I want to keep only the first one (i.e., only the letter, ignoring the digit).

K0
K2
K1
ZA
Heating.Cr_Cleaning
Heating.Cr_67kev_Cleaning
Top_OFF.IT4_Production.K57LUY00
K4
Top_OFF.IT4_Production.K57LUY01_Try4
DummyTest.G078K342E5T6
ZF
VA

But I want my output as below:

K
K
K
ZA
Heating.Cr_Cleaning
Heating.Cr_67kev_Cleaning
Top_OFF.IT4_Production.K57LUY00
K
Top_OFF.IT4_Production.K57LUY01_Try4
DummyTest.G078K342E5T6
ZF
VA
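
One possible way to get that output, offered only as a sketch: anchor the pattern at the start of the string and capture one or two letters right after "Production.". Because [A-Za-z]{1,2} is greedy, it takes two letters when both are letters and stops after the first when a digit follows. It assumes the code always begins with a letter; an entry whose code starts with a digit would be left unchanged. The cell array below is typed in by hand from a few rows of the input above.

```matlab
names = {'Production.K02L260E4Y00'
         'Production.ZA09L780E4Y0U09'
         'Heating.Cr_Cleaning'
         'Top_OFF.IT4_Production.K57LUY01_Try4'
         'Production.VA09L780E4Y0U09'};

% Replace the whole string with the leading letter(s) of the code.
out = regexprep(names, '^Production\.([A-Za-z]{1,2}).*$', '$1');
disp(out)
```

Entries that do not begin with "Production." never match the anchored pattern, so regexprep returns them untouched.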