How do I make a cell with the following contents?
1 次查看(过去 30 天)
显示 更早的评论
I am making a program that basically takes a string s as a single strand of DNA and returns the amino acid sequence of the longest gene it finds. Whereby, a gene is defined as a reading frame that: starts with AUG codon, ends with one of UAA,UAG, or UGA codon.
I tried making a cell of different "frames" but since they are not the same length I can't put them into an array. How do i work around this? Here's my code:
function [ptn]=Seq_transcribe2(x)
y=seq_transcribe1(x);
frames={};
frames={x(1:end) x(2:end) x(3:end) y(1:end) y(2:end)
y(3:end)};
starts=[];
stops=[];
allorfs={};
for i=1:3:numel(frames)-2
codon= frames([i i+1 i+2])
if codon=='AUG'
starts(end+1)=codon;
if strcmp(codon,'UAA') || strcmp(codon,'UAG') || strcmp(codon,'UGA')
stops(end+1)=codon;
end
stops= find(stops>starts,1)
lengthofthisstart=stops-starts
allorfs{end+1}=frame(starts:stops-1)
2 个评论
采纳的回答
Guillaume
2017-1-15
If I understood correctly, a simple way to find all genes would be:
[genesequences, starts, stops] = regexp(x, 'AUG.*?(UAA|UAG|UGA)', 'match', 'start', 'end');
And the longest sequence is of course:
[~, longestidx] = max(stops - starts);
longestsequence = genesequences{longestidx}
2 个评论
Arthur Goldsipe
2017-1-16
I think you need a slight change to account for the fact that all codons are 3 characters long:
[genesequences, starts, stops] = regexp(x, 'AUG(...)*?(UAA|UAG|UGA)', 'match', 'start', 'end');
Guillaume
2017-1-16
Oh yes, as I know nothing about genes and codons, I didn't know that the number of characters between the start and end codon must be a multiple of three, but I should have inferred that from the original code.
Thanks.
更多回答(1 个)
Niels
2017-1-15
if i understood you right your problem is in one of the following lines:
frames={x(1:end) x(2:end) x(3:end) y(1:end) y(2:end)
allorfs{end+1}=frame(starts:stops-1)
if so, i cant replicate your problem, in cell arrays the length of the elements is irrelevant
>> a=1:3;
>> b=1:4;
>> c=1:5;
>> cell={a b c}
cell =
1×3 cell array
[1×3 double] [1×4 double] [1×5 double]
%=================================
>> a={}
a =
0×0 empty cell array
>> a{end+1}=1
a =
cell
[1]
>> a{end+1}=2
a =
1×2 cell array
[1] [2]
>> a{end+1}=[2 1]
a =
1×3 cell array
[1] [2] [1×2 double]
2 个评论
Guillaume
2017-1-15
If the following line
frames={x(1:end) x(2:end) x(3:end) y(1:end) y(2:end)
y(3:end)};
is indeed written on two lines, then yes matlab is going to issue a concatenation error since the line return is interpreted as a vertical concatenation.
frames={x(1:end) x(2:end) x(3:end) y(1:end) y(2:end) y(3:end)};
or
frames={x(1:end) x(2:end) x(3:end) y(1:end) y(2:end) ...
y(3:end)};
would fix the error
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Genomics and Next Generation Sequencing 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!