If I have a primer with redundant bases, how do I generate all associated primer combinations?

As an example, if I have the following primer:
primer = 'AGCTYRSWKMACGT';
And these are the options for each redundant base:
options = {'C', 'T'; ... % Y
'A', 'G'; ... % R
'C', 'G'; ... % S
'A', 'T'; ... % W
'G', 'T'; ... % K
'A', 'C';}; % M
How do I generate all primer combinations, e.g. 1) AGCTCACAGAACGT, 2) AGCTTACAGAACGT, 3) AGCTCGCAGAACGT, etc.? There should be 64 (2^6) primer combinations for the above example. Thanks!

Accepted Answer

Tim DeFreitas 2023-5-2
Per your last comment, here's a longer but more robust approach that works regardless of where the ambiguous bases are in the primer sequence:
primer = 'AWCTARCTAMGT';
allPrimers = char.empty(1,0);
for b = 1:numel(primer)
    base = primer(b);
    switch base
        case 'Y'
            nextBases = 'CT';
        case 'R'
            nextBases = 'AG';
        case 'S'
            nextBases = 'CG';
        case 'W'
            nextBases = 'AT';
        case 'K'
            nextBases = 'GT';
        case 'M'
            nextBases = 'AC';
        otherwise
            nextBases = base; % Unambiguous base
    end
    % Extend allPrimers by the first (and possibly only) candidate base
    allPrimers(:, end+1) = nextBases(1);
    if numel(nextBases) > 1
        % Make a copy of all current primers and change the trailing base to the
        % other candidate for the ambiguous base
        alternatePrimers = allPrimers;
        alternatePrimers(:, end) = nextBases(2);
        allPrimers = [allPrimers; alternatePrimers];
    end
end
allPrimers
allPrimers = 8×12 char array
    'AACTAACTAAGT'
    'ATCTAACTAAGT'
    'AACTAGCTAAGT'
    'ATCTAGCTAAGT'
    'AACTAACTACGT'
    'ATCTAACTACGT'
    'AACTAGCTACGT'
    'ATCTAGCTACGT'
If you want to automate against a bunch of primers, I'd suggest turning the above script into a function with the primer sequence as the input.
Hope this helps,
-Tim
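The suggestion above, wrapping the script in a function, could look roughly like the sketch below. The function name expandPrimer and the lookup-table layout are assumptions for illustration, not part of the original answer, and like the script it handles only the two-base codes Y, R, S, W, K and M.

```matlab
function allPrimers = expandPrimer(primer)
% expandPrimer  List every unambiguous sequence encoded by a degenerate primer.
%   Hypothetical helper (name and interface are assumptions). Only the
%   two-base ambiguity codes Y, R, S, W, K and M are expanded; any other
%   character is copied through unchanged.
codes   = 'YRSWKM';
choices = ['CT'; 'AG'; 'CG'; 'AT'; 'GT'; 'AC'];   % row k lists the bases for codes(k)

allPrimers = char.empty(1, 0);
for b = 1:numel(primer)
    k = find(codes == primer(b), 1);
    if isempty(k)
        nextBases = primer(b);        % unambiguous base
    else
        nextBases = choices(k, :);    % the two candidate bases
    end
    % Append the first (and possibly only) candidate to every primer so far
    allPrimers(:, end+1) = nextBases(1);
    if numel(nextBases) > 1
        % Duplicate the current primers, swap in the second candidate,
        % and stack the copies underneath
        alternatePrimers = allPrimers;
        alternatePrimers(:, end) = nextBases(2);
        allPrimers = [allPrimers; alternatePrimers];
    end
end
end
```

Saved as expandPrimer.m, calling expandPrimer('AWCTARCTAMGT') should return the same 8×12 char array shown above, and expandPrimer('AGCTYRSWKMACGT') should return the 64 rows expected for the primer in the question.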

More Answers (1)

Tim DeFreitas 2023-5-1
Here's one way to do it:
options = ['CT' 'AG' 'CG' 'AT' 'GT' 'AC'];
% Enumerate indices into options producing valid primers
base = 1:2:11;
offsets = dec2bin(0:63) == '1';
allPrimers = cell(1,64);
for p = 1:64
    allPrimers{p} = ['AGCT', options(base + offsets(p, :)), 'ACGT'];
end
This works by arranging the options string so that each pair of candidate bases occupies two adjacent positions: an odd index selects the first candidate of a pair, and the following even index selects the second. For instance,
options([1, 3, 5, 7, 9, 11])
ans = 'CACAGA'
selects entirely from your first column, and
options([2, 4, 6, 8, 10, 12])
ans = 'TGGTTC'
selects entirely from your second column. If we then enumerate all possible ways to index into this string while choosing exactly one element from each pair, we produce every possible primer. Because there are 2 choices for each of the 6 ambiguous bases, the offsets are simply the binary digits of the integers 0 through 2^6-1, which dec2bin provides.
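To see the indexing trick at a smaller scale, here is a hedged sketch with only two ambiguous bases (Y followed by R). The variable combos is introduced purely for illustration.

```matlab
options = ['CT' 'AG'];            % candidates for Y, then for R
base    = 1:2:3;                  % index of the first candidate in each pair
offsets = dec2bin(0:3) == '1';    % rows correspond to 00, 01, 10, 11

combos = repmat(' ', 4, 2);       % preallocate a 4-by-2 char array
for p = 1:4
    % Adding 0 keeps the first candidate of a pair, adding 1 picks the second
    combos(p, :) = options(base + offsets(p, :));
end
disp(combos)
```

Each row of offsets is one way of choosing "first or second" for every pair, so the four rows of combos come out as CA, CG, TA and TG.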
1 Comment
Vishwaratn Asthana
This is an interesting approach @Tim DeFreitas! One concern I have is how I would go about automating the above code. Specifically, it appears I need to manually set the non-redundant bases in the following portion of the code:
allPrimers{p} = ['AGCT', options(base + offsets(p, :)), 'ACGT'];
This works for 'AGCTYRSWKMACGT', but what if the primer sequence was 'AWCTARCTAMGT'?
Again, your help is greatly appreciated!
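One possible way to lift that restriction while keeping the dec2bin idea is to locate the ambiguous positions automatically and overwrite only those columns. The following is a sketch under assumptions rather than the answerer's code: the names codes, choices, pos and codeIdx are made up for illustration, and only the two-base codes from the question are handled.

```matlab
primer  = 'AWCTARCTAMGT';
codes   = 'YRSWKM';
choices = ['CT'; 'AG'; 'CG'; 'AT'; 'GT'; 'AC'];   % row k lists the bases for codes(k)

% Find which positions are ambiguous and which code each one uses
[isAmbig, codeIdx] = ismember(primer, codes);
pos = find(isAmbig);
n   = numel(pos);

% One row per combination; column j says whether ambiguous position j
% takes its first (false) or second (true) candidate base
offsets = dec2bin(0:2^n - 1, n) == '1';

% Start from 2^n copies of the primer and overwrite the ambiguous columns
allPrimers = repmat(primer, 2^n, 1);
for j = 1:n
    pair = choices(codeIdx(pos(j)), :);
    allPrimers(:, pos(j)) = pair(offsets(:, j).' + 1).';
end
```

For 'AWCTARCTAMGT' this should give the same eight sequences as the accepted answer, although the rows come out in a different order. Memory grows as 2^n, so primers with many ambiguous positions may need a different strategy.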
