Sequence Distance

Talha 2011-7-20
I am somewhat confused about how MATLAB computes its answers for the various distance methods. My boss wants to know how MATLAB arrives at them.
I set MATLAB to display results as fractions. When I analyze two sequences of the same length, the denominator of the fraction is the length of the sequences (for example, if both amino acid sequences have a length of 327, the answer has a denominator of 327). That made sense until I analyzed two amino acid sequences of different lengths, one 369 amino acids long and the other 379. The answer was 209/398, and I don't understand where the denominator of 398 comes from (I specifically asked for the p-distance). Typing "help seqpdist" does not give a very clear explanation of how the p-distance works.
Can someone please help me out? I would greatly appreciate it!

Answers (1)

Lucio Cetto 2011-7-20
When you are comparing sequences, it is common to first align them using a dynamic programming algorithm. SEQPDIST uses NWALIGN to align every pair of sequences and then takes the distance measure from each pairwise alignment.
Consider:
seqpdist({'AACGT','AAGT','AAT'},'alpha','nt','square',1,'method','p-dist')
The alignment between 1 and 2 is 'AACGT' and 'AA-GT' => 1/5
The alignment between 1 and 3 is 'AACGT' and 'AA--T' => 2/5
The alignment between 2 and 3 is 'AAGT' and 'AA-T' => 1/4
HTH
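
The fractions in the example above can be checked by hand without the toolbox. The following is only a minimal sketch, not how SEQPDIST is implemented: it takes the three aligned pairs exactly as written above, assumes that a gap column counts as a difference (as those fractions imply), and divides by the number of alignment columns. The variable names (pairs, mismatches, alignedLen, pDistance) are made up for illustration.

```matlab
% Aligned pairs copied from the example above ('-' marks a gap).
% Row 1: sequences 1 and 2, row 2: sequences 1 and 3, row 3: sequences 2 and 3.
pairs = {'AACGT', 'AA-GT'; ...
         'AACGT', 'AA--T'; ...
         'AAGT',  'AA-T'};

nPairs = size(pairs, 1);
mismatches = zeros(nPairs, 1);
alignedLen = zeros(nPairs, 1);

for k = 1:nPairs
    a = pairs{k, 1};
    b = pairs{k, 2};
    % Assumption: a column with a gap is counted as a difference.
    mismatches(k) = sum(a ~= b);
    % The denominator is the number of columns in the alignment,
    % not the length of either original sequence.
    alignedLen(k) = numel(a);
    fprintf('%s vs %s: %d/%d\n', a, b, mismatches(k), alignedLen(k));
end

pDistance = mismatches ./ alignedLen;
```

This prints 1/5, 2/5 and 1/4. Under the same assumption, a denominator of 398 for sequences of length 369 and 379 would mean the pairwise alignment has 398 columns, that is, 29 gap positions in the shorter sequence and 19 in the longer one.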
