How can I compute the amino acid composition of my protein sequence data?
How can I compute the amino acid composition of my protein sequences so that I can use it to train my SVM classifier?
For example, suppose I have the following sequence as one of my samples:
'AEYDDSLIDEEEDDEDLDEFKPIVQYDNFQDEENIGIYKELEDLIEKNE'
Accepted Answer
Tommy
2020-4-23
Edited: Tommy
2020-4-23
allAA = sort('ARNDCQEGHILKMFPSTWYV');   % the 20 standard amino acids, in alphabetical order
seq = 'AEYDDSLIDEEEDDEDLDEFKPIVQYDNFQDEENIGIYKELEDLIEKNE';
% Count occurrences of each residue. Note: histc bins by range, so a
% nonstandard letter such as 'B' would be counted under 'A'.
counts = histc(seq, allAA);
freq = counts/numel(seq);               % fraction of the sequence made up by each residue
for aa = allAA
    % Multiply by 100 so the printed value really is a percentage
    fprintf('%c: %d/%d (%.2f%%)\n', aa, counts(allAA==aa), numel(seq), 100*freq(allAA==aa));
end
%{
prints:
A: 1/49 (2.04%)
C: 0/49 (0.00%)
D: 10/49 (20.41%)
E: 12/49 (24.49%)
F: 2/49 (4.08%)
G: 1/49 (2.04%)
H: 0/49 (0.00%)
I: 5/49 (10.20%)
K: 3/49 (6.12%)
L: 4/49 (8.16%)
M: 0/49 (0.00%)
N: 3/49 (6.12%)
P: 1/49 (2.04%)
Q: 2/49 (4.08%)
R: 0/49 (0.00%)
S: 1/49 (2.04%)
T: 0/49 (0.00%)
V: 1/49 (2.04%)
W: 0/49 (0.00%)
Y: 3/49 (6.12%)
%}
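Since the question asks about training an SVM, here is a minimal sketch of how the same idea might be extended to several sequences at once, producing one feature row per sequence. The names `seqs` and `X` are assumptions for illustration, and the second sequence is made up; only the first comes from the question.

```matlab
% Hypothetical input: a cell array of protein sequences, one per sample.
seqs = {'AEYDDSLIDEEEDDEDLDEFKPIVQYDNFQDEENIGIYKELEDLIEKNE', ...
        'MKVLAAGIVG'};                  % second sequence is invented for illustration
allAA = sort('ARNDCQEGHILKMFPSTWYV');   % 20 standard residues, alphabetical order

% X(k, j) = fraction of sequence k made up by residue allAA(j)
X = zeros(numel(seqs), numel(allAA));
for k = 1:numel(seqs)
    s = upper(seqs{k});                 % tolerate lowercase input
    for j = 1:numel(allAA)
        % Exact comparison, so nonstandard letters (e.g. 'X', 'B') are ignored
        X(k, j) = sum(s == allAA(j)) / numel(s);
    end
end
```

Each row of `X` sums to 1 only when the sequence contains nothing but the 20 standard letters. Given a label vector `Y` with one entry per sequence (not shown here), `X` should be usable as the predictor matrix for something like `fitcsvm(X, Y)` from the Statistics and Machine Learning Toolbox.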
More Answers (1)
Tim DeFreitas
2020-4-23
If you have the Bioinformatics Toolbox, there's also the aacount function: https://www.mathworks.com/help/bioinfo/ref/aacount.html
seq = 'AEYDDSLIDEEEDDEDLDEFKPIVQYDNFQDEENIGIYKELEDLIEKNE';
counts = aacount(seq)
% Optional: plotting included
aacount(seq, 'chart', 'bar')