bioinfo.blastplus.BLASTPOptions
Description
Creation
Syntax
Description
optionsObj = bioinfo.blastplus.BLASTPOptions creates a BLASTPOptions object with default property values.
Alternatively, you can use the blastplusoptions function to create the object.
optionsObj = bioinfo.blastplus.BLASTPOptions(Name=Value) sets the object properties using one or more name-value arguments. Name is the property name and Value is the property value. For example, set ExpectValue=0.01 to use an expect value of 0.01.
optionsObj = bioinfo.blastplus.BLASTPOptions(S) specifies optional parameters using a string or character vector S. S must be in the native syntax (prefixed by one dash). For example, optionsObj = bioinfo.blastplus.BLASTPOptions("-dbsize 50") sets the effective database size to 50.
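The creation syntaxes described above can be sketched as follows. This is a minimal illustration, assuming Bioinformatics Toolbox with BLAST+ support is installed. The variable names are placeholders, and passing "blastp" to blastplusoptions is an assumption made by analogy with the "blastn" call in the example later on this page.

```matlab
% Default options object. Properties left unset defer to BLAST+ defaults.
optsDefault = bioinfo.blastplus.BLASTPOptions;

% Name-value syntax: use a stricter expect value threshold.
optsNameValue = bioinfo.blastplus.BLASTPOptions(ExpectValue=0.01);

% Native syntax: pass the flag as the BLAST+ command line expects it.
optsNative = bioinfo.blastplus.BLASTPOptions("-dbsize 50");

% Factory function (assumed to accept "blastp" as the program name).
optsFactory = blastplusoptions("blastp");
```

The name-value form is usually easier to read and validate, while the native form is convenient when you already have a working BLAST+ command line.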
Properties
DatabaseSize
— Effective database size
bioinfo.blastplus.Default (default) | nonnegative integer
Effective database size, specified as a nonnegative integer. The default value is a bioinfo.blastplus.Default object, which means that the corresponding BLAST task or query program sets the default value.
Data Types: double
ExpectValue
— Expect value for saving hits
10 (default) | positive scalar
Expect value for saving hits, specified as a positive scalar.
This value describes the expected number of hits you might get when searching a database. The lower the expect value, the more significant the match is. You could use this value to create a significance threshold for reporting results. For details, see this FAQ page.
Data Types: double
ExtraCommand
— Additional commands
"" (default) | character vector | string scalar
Additional commands, specified as a character vector or string scalar.
The commands must be in the native syntax (prefixed by one dash). Use this option to apply undocumented flags and flags without corresponding MATLAB® properties.
Example: "-lcase_masking"
Data Types: char | string
GapExtendPenalty
— Cost to extend gap
bioinfo.blastplus.Default (default) | nonnegative integer
Cost to extend a gap, specified as a nonnegative integer. The default value is a bioinfo.blastplus.Default object, which means that the corresponding BLAST task or query program sets the default value.
Data Types: double
GapOpenPenalty
— Cost to open gap
bioinfo.blastplus.Default (default) | nonnegative integer
Cost to open a gap, specified as a nonnegative integer. The default value is a bioinfo.blastplus.Default object, which means that the corresponding BLAST task or query program sets the default value.
Data Types: double
GappedAlignment
— Flag to perform gapped alignment
true or 1 (default) | false or 0
Flag to perform a gapped alignment, specified as a numeric or logical 1 (true) or 0 (false). To perform an ungapped alignment, set GappedAlignment=false.
Data Types: double | logical
IncludeAll
— Flag to include all object properties
false or 0 (default) | true or 1
Flag to include all object properties with their corresponding default values when converting to the original option syntax, specified as a numeric or logical 1 (true) or 0 (false). You can convert properties to the original syntax prefixed by a dash (such as -dbtype nucl) by using the getCommand function.
When IncludeAll=false and you call getCommand(optionsObject), the software converts only the specified properties. If the value is true, getCommand converts all available properties, using default values for unspecified properties, to the original syntax.
Note
If you set IncludeAll to true, the software translates all available properties, with default values for unspecified properties. The only exception is that when the default value of a property is NaN, Inf, [], '', or "", the software does not translate the corresponding property.
Example: true
Data Types: logical
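The effect of IncludeAll on getCommand can be sketched as follows. This is a minimal illustration under the behavior described above. The variable names are placeholders, and the exact text that getCommand returns depends on the installed BLAST+ version, so no output is shown.

```matlab
% Start from an object with one explicitly set property.
opts = bioinfo.blastplus.BLASTPOptions(ExpectValue=0.01);

% With IncludeAll=false, only the specified properties are translated.
opts.IncludeAll = false;
cmdSpecified = getCommand(opts)

% With IncludeAll=true, unspecified properties are translated too,
% using their default values (except NaN, Inf, [], '', or "" defaults).
opts.IncludeAll = true;
cmdAll = getCommand(opts)
```

Comparing the two results is a quick way to check which flags the search would receive before you run it.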
LineLength
— Line length for formatting alignments in report
80 (default) | positive integer
Line length for formatting alignments in the report containing the search results, specified as a positive integer.
This option is not applicable for ReportFormat > 4.
Data Types: double
MaxHighScoringPairs
— Maximum number of high-scoring segment pairs
bioinfo.blastplus.Default (default) | positive integer
Maximum number of high-scoring segment pairs, specified as a positive integer. The default value is a bioinfo.blastplus.Default object, which means that the corresponding BLAST task or query program sets the default value.
Data Types: double
MaxTargetSequences
— Maximum number of aligned sequences to keep
500 (default) | positive integer
Maximum number of aligned sequences to keep, specified as a positive integer.
Data Types: double
NumAlignments
— Number of sequences to show alignments for
250 (default) | nonnegative integer
Number of database sequences to show alignments for, specified as a nonnegative integer.
Data Types: double
NumDescriptions
— Number of sequences to show one-line description for
500 (default) | nonnegative integer
Number of database sequences to show a one-line description for, specified as a nonnegative integer.
Data Types: double
NumThreads
— Number of parallel threads
1 (default) | positive integer
Number of parallel threads to use, specified as a positive integer. The software runs threads on separate processors or cores. Increasing the number of threads generally improves the runtime significantly, but also increases the memory footprint.
Data Types: double
QueryLocation
— Location on query sequence
[1 Inf] (default) | two-element vector of positive integers
Location on the query sequence where you want the BLAST search to focus, specified as a two-element vector of positive integers. The first element must be smaller than the second.
For example, if a query protein sequence is 200 amino acids long, and you are interested in the region from amino acid 50 to 100, set the value to [50 100].
Data Types: double
ReportFormat
— Format of BLAST report
"Pairwise" or 0 (default) | nonnegative integer between 0 and 18 | string scalar | character vector
Format of the BLAST report, specified as one of the following.
Format | Corresponding Number | Description |
---|---|---|
"Pairwise" | 0 | Traditional BLAST pairwise format. This format presents each query-subject pair alignment in detail, including alignment scores, e-values, and sequence alignments. |
"QueryAnchored" | 1 | Query-anchored format showing identities, that is, matching bases or amino acids. This format is more compact than the default and emphasizes the identical matches. |
"QueryAnchoredNoIdentities" | 2 | Query-anchored format with no identities. In this format, the query sequence is fixed and the database hit sequences are aligned to it without showing the identities, that is, matching bases or amino acids. This format is less detailed than "QueryAnchored" but is also less cluttered. |
"FlatQuery" | 3 | Flat query-anchored format showing identities. Although this format is similar to "QueryAnchored", the alignments might be condensed to save space and reduce redundancy. |
"FlatQueryNoIdentities" | 4 | Flat query-anchored format with no identities. Although this format is similar to "QueryAnchoredNoIdentities", the alignments might be condensed to save space and reduce redundancy. |
"BLASTXML" | 5 | XML BLAST output format. The XML report contains information about the query sequences, database hits, alignments, scores, and statistical significance. |
"Tabular" | 6 | Tabular format. This tab-delimited format provides a concise summary using a default set of columns in a fixed order. |
"TabularCommented" | 7 | Tabular format with comment lines. This format is the same as "Tabular" with the addition of comment lines that start with a hash sign (#). Comment lines include metadata, such as the BLAST version, reference, database name, query ID, and names of columns included in the report. |
"SeqalignText" | 8 | Text ASN.1 format. NCBI uses the Abstract Syntax Notation One data representation format for the storage and retrieval of data, such as nucleotide and protein sequences. For details, see Protein Domains and Macromolecular Structures. |
"SeqalignBinary" | 9 | Binary ASN.1 format. NCBI uses the Abstract Syntax Notation One data representation format for the storage and retrieval of data, such as nucleotide and protein sequences. For details, see Protein Domains and Macromolecular Structures. |
"CommaSeparated" | 10 | Comma-separated values (CSV) format. This format is the same as the "Tabular" format except that it uses commas to separate values. |
"BLASTArchive" | 11 | BLAST archive format (ASN.1). This format is a compact and complete record of the search. The format is useful for saving the results of a BLAST search for later reanalysis or for use with other NCBI tools without having to rerun the search. |
"SeqalignJSON" | 12 | Seqalign (JSON) format. This format provides easy readability for many tools. For details, see BLAST database metadata. |
"MultiBLASTJSON" | 13 | Multiple-file BLAST JSON format. For this format, the BLAST search generates multiple JSON files with the search results. One file contains a list of all the generated JSON files. For each query sequence, the search returns one JSON file specific to that query sequence, even when the search contains no hits for the query. |
"MultiBLASTXML2" | 14 | Multiple-file BLAST XML2 format. For this format, the BLAST search generates multiple XML files with the search results. One file contains a list of all the generated XML files. For each query sequence, the search returns one XML file specific to that query sequence, even when the search contains no hits for the query. |
"SingleBLASTJSON" | 15 | Single-file BLAST JSON format. This format returns a single JSON file with all the search results. |
"SingleBLASTXML2" | 16 | Single-file BLAST XML2 format. This format returns a single XML file with all the search results. |
"SAM" | 17 | Sequence alignment/map (SAM) format. For details about this format, see Sequence Alignment/Map Format Specification. |
"OrganismReport" | 18 | Organism report format. This report is a BLAST taxonomy report that sorts the hits according to the species of the target sequence, so that all the hits to the same organism appear together. For details, see Taxonomy BLAST Help. |
Data Types: double | char | string
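As the table above indicates, you can select a report format either by name or by its corresponding number. The following is a minimal sketch. The variable names are placeholders, and the file and database names in the commented-out call are hypothetical.

```matlab
% Select the tabular report by name.
optsByName = bioinfo.blastplus.BLASTPOptions;
optsByName.ReportFormat = "Tabular";

% Select the same report by its corresponding number.
optsByNumber = bioinfo.blastplus.BLASTPOptions(ReportFormat=6);

% Hypothetical use with a protein query file and an existing protein database:
% blastplus("blastp","myQuery.fasta","my_protein_db","report_tab",optsByName);
```

Using the name rather than the number tends to make scripts easier to read when you revisit them.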
ScoringMatrix
— Scoring matrix name
"BLOSUM62" (default) | "BLOSUM90" | ...
Scoring matrix name, specified as one of the following: "BLOSUM90", "BLOSUM80", "BLOSUM62", "BLOSUM50", "BLOSUM45", "PAM250", "PAM70", "PAM30", or "IDENTITY".
Tip
To generate the BLOSUM matrices in MATLAB, use the blosum function.
Data Types: char | string
SearchSpaceLength
— Effective length of search space
bioinfo.blastplus.Default (default) | nonnegative integer
Effective length of the search space, specified as a nonnegative integer. The default value is a bioinfo.blastplus.Default object, which means that the corresponding BLAST task or query program sets the default value.
The search space is the theoretical size of all possible alignments between the query sequence and the database sequences. The search space is a parameter used in the calculation of the statistical significance (E-values) of the BLAST hits.
Data Types: double
Task
— Task name
"blastp" (default) | "blastp-fast" | "blastp-short"
Task name, specified as one of the following:
- "blastp": Traditional BLASTP to compare a protein query to a protein database
- "blastp-fast": Faster version that uses a larger word size per [3]
- "blastp-short": BLASTP optimized for queries shorter than 30 residues
For details, see here.
Data Types: char | string
Version
— Supported version
string scalar
This property is read-only.
Supported version of the original BLAST+ software, specified as a string scalar.
Data Types: string
WindowSize
— Multiple hits window size
bioinfo.blastplus.Default (default) | nonnegative integer
Multiple hits window size, specified as a nonnegative integer. The default value is a bioinfo.blastplus.Default object, which means that the corresponding BLAST task or query program sets the default value.
A larger window size increases the sensitivity of the search to detect more divergent sequences, but might also increase the noise in the search results.
A smaller window size decreases the search sensitivity, which might cause missed alignments in sequences with larger gaps or more divergent regions. However, a smaller window size might decrease the noise and make the significant alignments more apparent.
Data Types: double
WordSize
— Word size for initial match
bioinfo.blastplus.Default (default) | positive integer
Word size for an initial match, specified as a positive integer. The default value is a bioinfo.blastplus.Default object, which means that the corresponding BLAST task or query program sets the default value.
A larger word size decreases the search sensitivity because BLAST requires longer exact matches to seed an alignment, and such matches are less likely in regions that are not highly conserved. However, a larger word size might speed up the search and reduce noise in the search results.
A smaller word size increases the search sensitivity because BLAST is more likely to detect alignments, including those with more distant or weak similarities. The search might become slower due to the increased number of initial matches. Also, the search noise might increase.
Data Types: double
WordThreshold
— Minimum score required to add word to BLAST lookup table
bioinfo.blastplus.Default (default) | nonnegative scalar
Minimum score required to add a word to the BLAST lookup table, specified as a nonnegative scalar. The default value is a bioinfo.blastplus.Default object, which means that the corresponding BLAST task or query program sets the default value.
Data Types: double
Object Functions
getCommand | Translate object properties to original options syntax |
getOptionsTable | Return table with all properties and equivalent options in original syntax |
reset | Reset BLAST database options to default values |
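The object functions listed above can be combined to inspect and restore an options object. This is a minimal sketch with placeholder variable names. It assumes that reset returns the reset object, as the MakeDatabaseOptions example on this page does.

```matlab
% Create an options object with two modified properties.
opts = bioinfo.blastplus.BLASTPOptions(ExpectValue=1e-5,NumThreads=4);

% List every property next to its equivalent option in the original syntax.
optionsTable = getOptionsTable(opts)

% Restore the default values. ExpectValue returns to its default of 10.
optsReset = reset(opts);
```

Reviewing the table is a convenient way to map MATLAB property names to the flags that appear in BLAST+ documentation.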
Examples
Create and Search Local BLAST+ Database
Download some paired-end sequencing data in the FASTA format using the run accession number SRR26273031.
databaseFasta = srafasterqdump("SRR26273031",FastaOutput=true)
Create a local nucleotide database using the downloaded FASTA file. Specify "SRR26273031_nucl_db" as the base name of the output database. When creating the database, the function also generates multiple index files with the same base name. The blastplus function uses these index files automatically when you search the database later in this example.
blastplusdatabase("nucleotide","SRR26273031.fasta","SRR26273031_nucl_db");
You can also specify additional database creation options using a MakeDatabaseOptions object. For instance, specify the title of the database.
dbopts = bioinfo.blastplus.MakeDatabaseOptions;
dbopts.Title = "SRR26273031_Nucleotide_DB"
dbopts = 
  MakeDatabaseOptions with properties:

   Default properties:
        ExtraCommand: ""
          IncludeAll: 0
           InputType: "fasta"
    ParseSequenceIDs: 0
             Version: "2.14.0"

   Modified properties:
               Title: "SRR26273031_Nucleotide_DB"
You can then use the options object to make the database.
blastplusdatabase("nucleotide","SRR26273031.fasta","SRR26273031_nucl_db",dbopts);
Alternatively, you can specify options, such as the title of the database, by using name-value arguments. For example:
blastplusdatabase("nucleotide","SRR26273031.fasta","SRR26273031_nucl_db",Title="SRR26273031_Nucleotide_DB");
To reset the property values to their default values, use the reset function.
dopts2 = reset(dbopts)
dopts2 = 
  MakeDatabaseOptions with properties:

   Default properties:
        ExtraCommand: ""
          IncludeAll: 0
           InputType: "fasta"
    ParseSequenceIDs: 0
               Title: [1×0 string]
             Version: "2.14.0"

   Modified properties:
    No properties.
Search the database using the FASTA file queryFile.fasta, which contains two nucleotide query sequences. This file is provided with the toolbox. Use the blastn query program, which lets you search nucleotide queries against a nucleotide database. Specify "search1" as the name of the output report file. By default, the report file format is the traditional BLAST pairwise format. This format presents each query-subject pair alignment in detail.
blastplus("blastn","queryFile.fasta","SRR26273031_nucl_db","search1");
Open the file to review the search results. The first query sequence returns no hits, while the second query sequence returns multiple hits.
open search1;
You can also modify search options by creating a corresponding options object for the blastn query program. Use blastplusoptions or bioinfo.blastplus.*Options to create the options object. For instance, change the report format to an XML format.
bnopts = blastplusoptions("blastn"); % Or use bioinfo.blastplus.BLASTNOptions
bnopts.ReportFormat = "BLASTXML";
blastplus("blastn","queryFile.fasta","SRR26273031_nucl_db","search2_xml",bnopts);
open search2_xml;
Alternatively, you can set the value of a property of the options object, such as ReportFormat, using name-value argument syntax. For example:
blastplus("blastn","queryFile.fasta","SRR26273031_nucl_db","search2_xml",ReportFormat="BLASTXML");
You can use other query programs to search the database. For instance, use tblastx to search translated nucleotide queries against a translated nucleotide database. Both query sequences return hits for this search. Use the compact tabular format for the report. For details about the generated columns and other report formats, see ReportFormat.
blastplus("tblastx","queryFile.fasta","SRR26273031_nucl_db","search3_tab",ReportFormat="Tabular");
open search3_tab;
Delete the reports and downloaded FASTA file.
delete search1 search2_xml search3_tab SRR26273031.fasta
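The example above searches a nucleotide database. A protein search that uses a BLASTPOptions object might follow the same pattern, as in this minimal sketch. The file names, database name, and report name are hypothetical, and the use of "protein" as the database type and "blastp" as the query program is an assumption based on the nucleotide workflow shown above. The search runs only if the hypothetical input files exist.

```matlab
% Hypothetical input files (not shipped with the toolbox).
queryFile = "proteinQuery.fasta";
subjectFile = "proteinSubjects.fasta";

% Configure a short-query protein search with a tabular report.
popts = bioinfo.blastplus.BLASTPOptions(Task="blastp-short", ...
    ExpectValue=0.01,ReportFormat="Tabular");

if isfile(queryFile) && isfile(subjectFile)
    % Assumed database type and program names for a protein search.
    blastplusdatabase("protein",subjectFile,"protein_db");
    blastplus("blastp",queryFile,"protein_db","protein_search_tab",popts);
end
```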
References
[1] Camacho, Christiam, George Coulouris, Vahram Avagyan, Ning Ma, Jason Papadopoulos, Kevin Bealer, and Thomas L Madden. “BLAST+: Architecture and Applications.” BMC Bioinformatics 10, no. 1 (December 2009): 421.
[2] “BLAST: Basic Local Alignment Search Tool.” https://blast.ncbi.nlm.nih.gov/Blast.cgi.
[3] Shiryev, Sergey A., Jason S. Papadopoulos, Alejandro A. Schäffer, and Richa Agarwala. “Improved BLAST Searches Using Longer Words for Protein Seeding.” Bioinformatics 23, no. 21 (November 1, 2007): 2949–51.
Version History
Introduced in R2024a