Problem with regexp 'split' and picking out certain cells

I have the following input:
>> data(1).Header
ans =
AF051909 |392-397:CAGCTG| |413-418:CAGGTG|
I need to store these values in a cell array as {'392-397', 'CAGCTG'; '413-418', 'CAGGTG'}.
So I used regexp with the following code:
struKm(1).trueBinding = regexp(data(1).Header,'\s\||\:|\|','split');
This returns:
>> struKm(1).trueBinding
ans =
'AF051909' '392-397' 'CAGCTG' '' '413-418' 'CAGGTG' ''
As you can see, there are two empty cells. I tried to find out why they are there but could not.
I also tried ignoring them and moved on to picking out the cells I need for the rest of my code, which are 'CAGCTG' and 'CAGGTG'. I use this code to pick them out:
[r1,r2] = ismember(struKm(1).trueBinding,set)
It returns all zeros.
Can someone help with these two issues, please?
Regards, A.

Accepted Answer

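The empty cells appear because the header contains adjacent delimiters ('|' followed by ' |') and ends with a delimiter, so 'split' returns the empty strings between them. You can keep your code and add one line to remove the empty elements: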
data = 'AF051909 |392-397:CAGCTG| |413-418:CAGGTG|';
s = regexp(data, '\s\||\:|\|', 'split');
s(cellfun(@isempty, s)) = []   % remove the empty elements
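
As an alternative to splitting and then cleaning up, the position/motif pairs could be captured directly with the 'tokens' option of regexp, which also gives the 2-column layout asked for in the question. This is only a sketch: the pattern assumes every site looks like digits-digits:letters, and knownSites is a hypothetical stand-in for the question's set variable, whose contents were not shown (it is renamed here because set is also a built-in MATLAB function). Without seeing that variable it is hard to say why ismember returned zeros; note that for cell arrays of character vectors the comparison is exact and case-sensitive.

```matlab
header = 'AF051909 |392-397:CAGCTG| |413-418:CAGGTG|';

% Each match captures two tokens: the position range and the motif.
tok = regexp(header, '(\d+-\d+):([A-Za-z]+)', 'tokens');

% tok is a 1-by-N cell of 1-by-2 cells; stack them into an N-by-2 cell array.
trueBinding = vertcat(tok{:});

% Hypothetical list of motifs to look for (an assumption, not from the post).
knownSites = {'CAGCTG', 'CAGGTG', 'CAACTG'};

% Exact, case-sensitive membership test on the motif column only.
[tf, loc] = ismember(trueBinding(:,2), knownSites);
```

Here trueBinding(:,1) holds the position ranges and trueBinding(:,2) the motifs, so the sequence name and empty strings never enter the membership test.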

More Answers (1)

Thank you for your reply.
I solved those issues, but another one has appeared.
Now I have:
struKm(i).seqNam = cellstr(regexp(data(i).Header, '\s\||\:|\|','split')); % determine the sequence name headers
struKm(i).seqNam(cellfun(@(x) isempty(x),struKm(i).seqNam))=[];
This code is inside a for loop.
The result of this code is:
ans =
'AF051909' '392-397' 'CAGCTG' '413-418' 'CAGGTG'
Some seqNam entries contain only one binding site (e.g. CAGCTG). For example:
ans =
'M13483' '445-450' 'CAACTG'
Now I want to pick out only the binding sites (CAGCTG, CAGGTG, CAACTG, etc.).
I have another for loop that does this. The code:
struSize = length(struKm);
tempcell = cell(1,1);
for m = 1:struSize
    if (length(struKm(m).seqNam) == 3)
        % one binding site: name, position, site
        resultsk.BS{m} = struKm(m).seqNam(3);
        disp(m);
    end
    if (length(struKm(m).seqNam) == 5)
        % two binding sites: name, position, site, position, site
        resultsk.BS{m} = cellstr(struKm(m).seqNam([3,5]));
        %tempcell = struKm(m).seqNam([3,5]); resultsk.BS{m} = cellstr(tempcell);
        disp(m);
    end
end
And the result of this code:
>> resultsk.BS{:}
ans =
'CAGCTG' 'CAGGTG'
ans =
'CAACTG'
ans =
'CAACTG'
The problem is with the entries that have two binding sites: both sites end up side by side inside one cell, so the result is a cell within a cell.
I need all of the binding sites in one row. I am still struggling with this. Can you please help?
Thank you, A
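
One possible way to get every binding site into a single row is to concatenate the nested cells. The sketch below uses a literal BS that mirrors the output shown above instead of the real resultsk.BS, and it assumes every element is itself a cell array of character vectors, which is what the loop stores for headers with one or two sites.

```matlab
% Sample data mirroring the resultsk.BS output shown above (nested cells).
BS = {{'CAGCTG', 'CAGGTG'}, {'CAACTG'}, {'CAACTG'}};

% BS{:} expands to a comma-separated list of cell arrays; the square
% brackets concatenate them horizontally into one 1-by-N cell array.
allSites = [BS{:}];

disp(allSites)
```

With the real data the same idea would read allSites = [resultsk.BS{:}];, under the assumption stated above.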

2 Comments

>AF051909 |392-397:CAGCTG| |413-418:CAGGTG|
tgccgctcagaaaaaaacgatctttggtgaacagtaggagccatctgagcggtgcgacgcattgtgctcccattccacacgctgcggcggccctCAGCTGtcatgcctggaaCAGGTGgtgtaaggcaatccctgggcagccgtgctccccgcccccccccgggccgaccttaaaggcgctgcgtgtgccctggctcctc
>M13483 |445-450:CAACTG|
ccttacatggtctgggggctccctggctgatcctctcccctgcccttggctccatgaatggcctcggcagtcctagcgggtgcgaaggggaccaaataaggcaaggtggcagaccgggccccccacccctgcccccggctgctcCAACTGaccctgtccatcagcgttctataaagcggccctcctggagccagccaccc
>M26773 |446-451:CAACTG|
cttacatggtctgggagccccctggctgatcctctaccctgcccttggctccaagaatggcctcagcggtcctagatggtgctaaggcgaccaaataaggcaaggtggcagatcaggggccccccacccctgcccccggctgctcCAACTGaccccgtccatcagagagctataaagctgcgctccaggcgactgacacc
>M86232 |447-452:CACTTG|
ctgtgctattctggtttggatgtgactcagaacacagttgaacattatttgaactcacagagcttgccattctggaagcacagccttatatgtagtgtccatgggcagtcctattatgggaaagcaacttgagagaaaaggcgggtCACTTGcttgtgcgcaggtcctggaatttgaaatatccagaggcctctacagaa
>M86233 |447-452:CACTTG|
ctgtgctattctagtttggatgtgactcaggacagagttgaacattatttgaattcacagagcttgccatgctggaagcacagccttatatgtagtgtccatgggcagtcctattatggcaaagcaacttgagagaaaaggcgggtCACTTGcttgtgcgcaggtcctggaatttgaaatatccagaggccctacagaat
>X00371 |326-331:CACCTG|
gagctgtcctgcctcgccacaatggCACCTGccctaaaatagcttcccatgtgagggctagagaaaggaaaagattagaccctccctggatgagagagagaaagtgaaggagggcaggggagggggacagcgagccattgagcgatctttgtcaagcatcccagaaggtataaaaacgcccttgggaccaggcagcctca
>X53154 |440-445:CAGCTG|
cgaaggattggtaggcttgccgtcacaggacccccgctggctgactcaggggcgcaggctcttgcgggggagctggcctcccgcccccacggccacgggccctttcctggcaggacagcgggatcttgCAGCTGtcaggggaggggatgacgggggactgatgtcaggaggggatacaaatagtgccgacggctaggggg
>X59034 |442-447:CAGCTG| |461-466:CAGGTG|
accaaacacaatgacaagcctctgactcatgatctatgtagactctcagacactttacatctagtaagagtatagcgatcatgttaagcaaggcacgtctgtggccacagaaggccccaagctttgaggctgtgggcagctCAGCTGtcatgcgggcacaCAGGTGatgtaagacaatagctgtggagtcagctggcttc
