align2cigar

Convert aligned sequences to corresponding signatures in CIGAR format

Description

Cigars = align2cigar(Alignment,Ref) converts the aligned sequences in Alignment into CIGAR-formatted strings, Cigars, using the reference sequence specified by Ref.

[Cigars,Starts] = align2cigar(Alignment,Ref) also returns Starts, a vector of integers indicating the start position of each aligned sequence with respect to the ungapped reference sequence.

Examples

This example shows how to convert aligned sequences to CIGAR strings.

Create a character array of aligned sequences, create a character vector specifying a reference sequence, and then convert the alignment to CIGAR strings:

aln = ['ACG-ATGC'; 'ACGT-TGC'; '  GTAT-C']
aln = 3x8 char array
    'ACG-ATGC'
    'ACGT-TGC'
    '  GTAT-C'

ref =  'ACGTATGC';
[cigar, start] = align2cigar(aln, ref)
cigar = 1x3 cell
    {'3=1D4='}    {'4=1D3='}    {'4=1D1='}

start = 1×3

     1     1     3
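
To read the result, it can help to split a CIGAR string into its length/operation pairs. The following is a minimal sketch that uses only base MATLAB (regexp and str2double), not a toolbox function. The variable names are illustrative, and the set of reference-consuming operations (M, D, N, =, X) follows the usual SAM convention rather than anything stated on this page.

```matlab
% Split a CIGAR string from the example into (length, operation) pairs.
cigar = '3=1D4=';
tok = regexp(cigar, '(\d+)([MIDNSHP=X])', 'tokens');  % 1x3 cell of {len, op}
tok = vertcat(tok{:});                                % 3x2 cell array
len = str2double(tok(:,1));                           % [3; 1; 4]
op  = [tok{:,2}];                                     % '=D='

% Number of reference positions covered by the alignment
% (assumes M, D, N, =, and X are the operations that consume the reference).
refSpan = sum(len(ismember(op, 'MDN=X')));            % 3 + 1 + 4 = 8
```

Here the first sequence covers all 8 reference positions. Applying the same steps to '4=1D1=' gives a span of 6, which together with its start position of 3 covers reference positions 3 through 8.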

Input Arguments

Alignment

Aligned sequences, specified as a cell array of aligned character vectors, a string vector, or a character array. Soft clippings are assumed to be represented by lowercase letters in the aligned sequences. Skipped positions are assumed to be represented by a period (.) in the aligned sequences.

Data Types: char | string | cell
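
As a minimal sketch of the three accepted input forms, the alignment from the example can be written as a character array, a cell array of character vectors, or a string vector. The variable names are illustrative, and the comparison assumes that the three forms are treated interchangeably, which this page implies but does not state outright.

```matlab
ref = 'ACGTATGC';

alnChar = ['ACG-ATGC'; 'ACGT-TGC'; '  GTAT-C'];   % character array, one row per sequence
alnCell = {'ACG-ATGC', 'ACGT-TGC', '  GTAT-C'};   % cell array of character vectors
alnStr  = ["ACG-ATGC", "ACGT-TGC", "  GTAT-C"];   % string vector

cChar = align2cigar(alnChar, ref);
cCell = align2cigar(alnCell, ref);
cStr  = align2cigar(alnStr, ref);

% Compare as columns so that the orientation of the outputs does not matter.
sameResult = isequal(cChar(:), cCell(:), cStr(:))
```

In a character array every row must have the same length, so shorter sequences need padding, as in the third row above. The cell array and string forms are written with the same padding here so that each entry matches the length of ref.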

Ref

Aligned reference sequence, specified as a character vector or string. The length of Ref must equal the number of columns in Alignment.

Data Types: char | string

Output Arguments

Cigars

Converted sequences in CIGAR format, returned as a cell array of character vectors. Each entry in Cigars corresponds to one entry in Alignment.

Starts

Start position of each aligned sequence, returned as an integer vector. The start positions are with respect to the ungapped reference sequence specified by Ref.
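
One way to picture "with respect to the ungapped reference" is to count the reference bases up to the first column where the aligned sequence has data. The sketch below is a hand computation in base MATLAB, not the function's internal method. It assumes that leading padding is a space and that gaps are written as '-', as in the example, and the gapped reference ref2 is a made-up illustration.

```matlab
% Third sequence from the example: starts in column 3 of the alignment.
ref = 'ACGTATGC';
seq = '  GTAT-C';
firstCol = find(seq ~= ' ', 1);             % first column holding sequence data (3)
startPos = nnz(ref(1:firstCol) ~= '-');     % reference bases up to that column (3)

% Hypothetical gapped reference: the gap column does not count as a position.
ref2 = 'AC-GTATGC';
seq2 = '   GTAT-C';
firstCol2 = find(seq2 ~= ' ', 1);           % column 4
startPos2 = nnz(ref2(1:firstCol2) ~= '-');  % 3, because column 3 of ref2 is a gap
```

The first result agrees with the value 3 returned for that sequence in the example. The second case only illustrates the counting rule and is not output taken from align2cigar.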

Version History

Introduced in R2010b