basecount
Count nucleotides in sequence
Description
Examples
Count the bases in a DNA sequence and return the results in a structure.
bases = basecount('TAGCTGGCCAAGCGAGCTTG')
bases = struct with fields:
A: 4
C: 5
G: 7
T: 4
Get the number of adenine (A) bases.
bases.A
ans = 4
Create a bar graph comparing the number of each nucleotide.
basecount('TAGCTGGCCAAGCGAGCTTG',Chart="bar")
ans = struct with fields:
A: 4
C: 5
G: 7
T: 4
Count the bases in a DNA sequence containing ambiguous characters (R, Y, K, M, S, W, B, D, H, V, or N), listing each of them in a separate field.
basecount('ABCDGGCCAAGCGAGCTTG',Ambiguous="individual")
ans = struct with fields:
A: 4
C: 5
G: 6
T: 2
R: 0
Y: 0
K: 0
M: 0
S: 0
W: 0
B: 1
D: 1
H: 0
V: 0
N: 0
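The returned structure can feed further calculations. As a minimal sketch, the following computes the GC fraction of the first example sequence from the A, C, G, and T fields. The variable names total and gcFraction are illustrative and not part of basecount. The sketch assumes the sequence contains no ambiguous characters or gaps, so the four fields account for every position.

```matlab
% Count the bases in the sequence from the first example.
bases = basecount('TAGCTGGCCAAGCGAGCTTG');

% Total number of counted bases (illustrative variable name).
total = bases.A + bases.C + bases.G + bases.T;

% Fraction of G and C bases among all counted bases.
gcFraction = (bases.G + bases.C) / total
```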
Input Arguments
Nucleotide sequence, specified as one of the following.
- Character vector or string scalar consisting of the characters A, C, G, T, and U, and the ambiguous characters R, Y, K, M, S, W, B, D, H, V, and N.
- Row vector of integers specifying a nucleotide sequence. For information on valid integers, see Mapping Nucleotide Integers to Letter Codes.
- Structure that contains a nucleotide sequence in the Sequence field. The fastaread, fastqread, emblread, getembl, genbankread, and getgenbank functions return structures with a Sequence field.
Example: NTStruct = basecount('CGACTT') counts the number of times each nucleotide occurs in the sequence.
Data Types: double | char | string | struct
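The three accepted input forms can be compared side by side. The following is a sketch under stated assumptions: it uses the Bioinformatics Toolbox function nt2int to produce the integer codes, converts them with double() to match the listed data type, and builds the structure by hand rather than reading it from a file. The variable names are illustrative.

```matlab
seq = 'CGACTT';

% 1. Character vector (a string scalar such as "CGACTT" is also accepted).
fromChar = basecount(seq);

% 2. Row vector of integers. nt2int converts letter codes to integer codes;
%    double() matches the documented data type for this argument.
fromInt = basecount(double(nt2int(seq)));

% 3. Structure with a Sequence field, built by hand here for illustration.
s = struct('Sequence', seq);
fromStruct = basecount(s);

% All three calls describe the same sequence, so the counts should agree.
sameCounts = isequal(fromChar, fromInt, fromStruct)
```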
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Example: NTStruct = basecount("ACGGTC",Ambiguous="individual")
Ambiguous
Method for counting ambiguous nucleotide characters (R, Y, K, M, S, W, B, D, H, V, and N), specified as one of the following.
- "ignore": basecount skips ambiguous characters.
- "bundle": basecount counts ambiguous characters and reports the total count in the Ambiguous field.
- "prorate": basecount counts ambiguous characters and distributes the count of each one evenly among its possible unambiguous nucleotide fields. For example, the count for the character R is distributed evenly between the A and G fields.
- "individual": basecount counts ambiguous characters and reports them in individual fields.
- "warn": basecount skips ambiguous characters and displays a warning.
Example: NTStruct = basecount("CGRTTMSA",Ambiguous="bundle") reports the total number of ambiguous characters in the Ambiguous field of NTStruct.
Data Types: char | string
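To make the "prorate" option concrete, the following sketch reimplements the idea in base MATLAB without calling basecount. It is an illustration only, not the toolbox implementation. The lookup structure iupac and the variable names are assumptions introduced here, and the expansions follow the standard IUPAC nucleotide codes.

```matlab
% Illustrative sketch of prorating; this is NOT the basecount source code.
seq = 'CGRTTMSA';

% Standard IUPAC expansions of the ambiguity codes (hypothetical lookup).
iupac = struct('R','AG','Y','CT','K','GT','M','AC','S','CG','W','AT', ...
               'B','CGT','D','AGT','H','ACT','V','ACG','N','ACGT');

counts = struct('A',0,'C',0,'G',0,'T',0);
for ch = upper(seq)
    if any(ch == 'ACGT')
        % Unambiguous base: add one to its own field.
        counts.(ch) = counts.(ch) + 1;
    elseif ch == 'U'
        % Uracil is added to the T field.
        counts.T = counts.T + 1;
    elseif isfield(iupac, ch)
        % Ambiguous base: split one count evenly among the possible bases.
        targets = iupac.(ch);
        for t = targets
            counts.(t) = counts.(t) + 1/numel(targets);
        end
    end
end
disp(counts)
```

In this sequence, R contributes 0.5 each to A and G, M contributes 0.5 each to A and C, and S contributes 0.5 each to C and G, so the prorated counts still sum to the sequence length.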
Gaps
Flag to count or ignore gaps, specified as true or false. Gaps are indicated by a hyphen (-). If you set this option to true, then basecount counts the gaps and reports the total count in the Gaps field.
Data Types: logical
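As a brief sketch of gap counting, the following call counts the hyphens in a short aligned fragment. The sequence and the variable names are made up for illustration.

```matlab
% Hypothetical aligned fragment containing three gap characters.
aligned = 'ACG--TT-A';

% Request gap counting; the result gains a Gaps field.
withGaps = basecount(aligned, Gaps=true);

% Number of hyphens found in the sequence.
nGaps = withGaps.Gaps
```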
Chart
Type of chart to display the proportions of nucleotides, specified as "pie" or "bar".
Data Types: char | string
Output Arguments
Nucleotide counts, returned as a structure containing the fields A, C, G, and T. Uracil nucleotides (U) are added to the T field. Additional fields can be present, depending on the values of Ambiguous and Gaps.
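The handling of uracil can be checked with a short RNA sequence. This is a minimal sketch with an invented sequence and illustrative variable names. It assumes the call returns only the A, C, G, and T fields, as in the first example on this page, so summing the fields gives the sequence length.

```matlab
% Hypothetical RNA fragment: three A, three U, two G, one C.
rna = 'AUGGCUUAA';
counts = basecount(rna);

% Uracil has no field of its own; its count appears under T.
uracilAsT = counts.T

% Sum all fields of the structure to recover the number of counted bases.
total = sum(cell2mat(struct2cell(counts)))
```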
Version History
Introduced before R2006a
See Also
aacount | baselookup | codoncount | cpgisland | dimercount | nmercount | ntdensity