SRASAMDumpOptions
Description
An SRASAMDumpOptions object contains options for the srasamdump function, which you use to download files from SRA (Sequence Read Archive) [1].
Creation
Description
sraOpt = SRASAMDumpOptions creates an SRASAMDumpOptions object with default property values.
SRASAMDumpOptions requires the SRA Toolkit for Bioinformatics Toolbox™. If this support package is not installed, then the function provides a download link. For details, see Bioinformatics Toolbox Software Support Packages.
sraOpt = SRASAMDumpOptions(Name=Value) sets the object properties using one or more name-value arguments. For example, when creating the SRASAMDumpOptions object, specify BZip2=true to set the value of the BZip2 property to true, so that the output files are compressed using bzip2.
sraOpt = SRASAMDumpOptions(S) specifies optional parameters using a string scalar or character vector S.
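The three creation syntaxes above can be sketched as follows. This is a minimal illustration that assumes the SRA Toolkit support package is installed; the variable names optDefault, optNamed, and optNative are hypothetical, and the option string is the example value given for S in this page.

```matlab
% Syntax 1: create an options object with default property values.
optDefault = SRASAMDumpOptions;

% Syntax 2: set properties with name-value arguments at creation.
optNamed = SRASAMDumpOptions(BZip2=true);

% Syntax 3: pass options in the original sam-dump syntax
% (prefixed by one or two dashes) as a character vector or string scalar.
optNative = SRASAMDumpOptions('--aligned-region chr20:2500000-2600000');
```

The name-value form is usually easier to read when a MATLAB property exists for the option; the string form is useful when you already have a sam-dump command line.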
Input Arguments
S
— srasamdump options
character vector | string scalar
srasamdump options, specified as a character vector or string scalar. S must be in the original sam-dump option syntax (prefixed by one or two dashes).
Example: '--aligned-region chr20:2500000-2600000'
Data Types: char | string
Properties
BZip2
— Flag to compress output files using bzip2
false or 0 (default) | true or 1
Flag to compress the output files using bzip2, specified as a numeric or logical 1 (true) or 0 (false).
Data Types: double | logical
ExtraCommand
— Additional commands
"" (default) | character vector | string scalar
Additional commands, specified as a character vector or string scalar.
The commands must be in the native syntax (prefixed by one or two dashes). Use this option to apply undocumented flags and flags without corresponding MATLAB® properties.
Example: ExtraCommand="--aligned-region chr20:2500000-2600000"
Data Types: char | string
FastaOutput
— Flag to produce FASTA-formatted output files
false or 0 (default) | true or 1
Flag to produce FASTA-formatted output files, specified as a numeric or logical 1 (true) or 0 (false).
Data Types: double | logical
FastqOutput
— Flag to produce FASTQ-formatted output files
false or 0 (default) | true or 1
Flag to produce FASTQ-formatted output files, specified as a numeric or logical 1 (true) or 0 (false).
Data Types: double | logical
GZip
— Flag to compress output files using gzip
false or 0 (default) | true or 1
Flag to compress the output files using gzip, specified as a numeric or logical 1 (true) or 0 (false).
Data Types: double | logical
HideIdentical
— Flag to use '=' if base is identical to reference
false or 0 (default) | true or 1
Flag to use '=' in the output if a base is identical to the reference, specified as a numeric or logical 1 (true) or 0 (false).
Data Types: double | logical
IncludeAll
— Flag to include all object properties
false or 0 (default) | true or 1
Flag to include all object properties with corresponding default values when converting properties to the original option syntax, specified as a numeric or logical 1 (true) or 0 (false). You can convert properties to the original syntax prefixed by one or two dashes (such as '--aligned-region chr20:2500000-2600000') by using the getCommand function.
When IncludeAll=false and you call getCommand(optionsObject), the software converts only the specified properties. If the value is true, getCommand converts all available properties, using default values for unspecified properties, to the original syntax.
Note
Even when IncludeAll is true, the software does not translate a property whose default value is NaN, Inf, [], '', or "".
Data Types: logical
MinMapQuality
— Minimum mapping quality
0 (default) | nonnegative scalar
Minimum mapping quality required for an alignment to be included in the output, specified as a nonnegative scalar.
Data Types: double
OutputFileName
— Output filename
empty string array (default) | character vector | string scalar
Output filename, specified as a character vector or string scalar.
Data Types: char | string
OutputPrimary
— Flag to output primary alignments only
false or 0 (default) | true or 1
Flag to output primary alignments only, specified as a numeric or logical 1 (true) or 0 (false).
Data Types: double | logical
OutputUnaligned
— Flag to output unaligned reads with aligned reads
false or 0 (default) | true or 1
Flag to output the unaligned reads with the aligned reads, specified as a numeric or logical 1 (true) or 0 (false).
Data Types: double | logical
Version
— Supported version
string scalar
This property is read-only.
Supported version of the original sam-dump software, returned as a string scalar.
Data Types: string
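As a rough sketch of how the properties above interact with IncludeAll, the following code changes two properties and translates the object with getCommand under both settings. It assumes the SRA Toolkit support package is installed; the variable names and the chosen values are illustrative only, and no output is shown because the exact translated text depends on the installed version.

```matlab
% Start from the defaults and change only a few properties.
opt = SRASAMDumpOptions;
opt.MinMapQuality = 20;     % require a mapping quality of at least 20
opt.OutputPrimary = true;   % output primary alignments only

% With IncludeAll=false (the default), getCommand converts only
% the properties that were specified.
cmdModified = getCommand(opt)

% With IncludeAll=true, getCommand also converts properties that
% were left at their default values.
opt.IncludeAll = true;
cmdAll = getCommand(opt)
```

Comparing cmdModified with cmdAll is a quick way to see which sam-dump options a given object would pass along.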
Object Functions
getCommand | Translate object properties to original options syntax
getOptionsTable | Return table with all properties and equivalent options in original syntax
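A short sketch of the second object function follows. It assumes the SRA Toolkit support package is installed and that getOptionsTable accepts the options object as its only input; the variable names are hypothetical.

```matlab
% Create an options object with one modified property.
opt = SRASAMDumpOptions(GZip=true);

% Return a table that lists the object properties together with
% the equivalent options in the original sam-dump syntax.
optTable = getOptionsTable(opt)
```

The table is a convenient reference when you need to map a MATLAB property name back to the corresponding sam-dump flag.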
Examples
Download NGS Data from SRA
Download paired-end sequencing data in FASTQ format using the run accession number SRR11846824, which has two reads per spot and no unaligned reads. Downloading the data may take a few minutes.
tbl = srafasterqdump("SRR11846824")
tbl=1×2 table
                          Reads_1                  Reads_2
                   _____________________    _____________________
    SRR11846824    "SRR11846824_1.fastq"    "SRR11846824_2.fastq"
By default, the function uses the SplitType="SplitThree" option and downloads only biological reads. Specifically, the function splits spots into reads. For spots having two reads, the function produces *_1.fastq and *_2.fastq, represented by the Reads_1 and Reads_2 columns. If there are any unaligned reads, the function saves unaligned reads in a *.fastq file, which would be represented by the Reads column. Because there are no unaligned reads within this accession, the function did not produce a *.fastq file, and the output table has no Reads column. For details, see SplitType.
You can also specify other download options using SRAFasterqDumpOptions. For instance, use FastaOutput=true to get the FASTA-formatted file.
sraopt = SRAFasterqDumpOptions;
sraopt.FastaOutput = true;
tbl2 = srafasterqdump("SRR11846824",sraopt);
Alternatively, you can specify the options as name-value arguments instead of using the options object.
tbl2 = srafasterqdump("SRR11846824",FastaOutput=true);
You can also download the data in SAM format using srasamdump.
samFile = srasamdump("SRR11846824")
samFile = "SRR11846824.sam"
Specify the download options using an SRASAMDumpOptions object. For instance, specify the output file name and compress the output file using bzip2.
samdumpopt = SRASAMDumpOptions;
samdumpopt.OutputFileName = "SRR11846824.sam.bz2";
samdumpopt.BZip2 = true
samdumpopt =
  SRASAMDumpOptions with properties:
   Default properties:
       ExtraCommand: ""
        FastaOutput: 0
        FastqOutput: 0
               GZip: 0
      HideIdentical: 0
         IncludeAll: 0
      MinMapQuality: 0
      OutputPrimary: 0
    OutputUnaligned: 0
            Version: "3.0.6"
   Modified properties:
     OutputFileName: "SRR11846824.sam.bz2"
              BZip2: 1
bzFile = srasamdump("SRR11846824",samdumpopt)
bzFile = "SRR11846824.sam.bz2"
After downloading the SAM file, you can use it for downstream analyses. For instance, you can use bowtie2 to map the reads to the reference sequence.
First, download the C. elegans reference sequence.
celegans_refseq = fastaread("https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/ce11/ce11.fa");
Save Chromosome 3 reference data in a FASTA file.
celegans_chr3 = celegans_refseq(3).Sequence;
warnState = warning;
warning('off','Bioinfo:fastawrite:AppendToFile');
fastawrite("celegans_chr3.fa",celegans_chr3);
warning(warnState);
Build a set of index files using bowtie2build. The status value of 0 means that the build was successful.
status = bowtie2build("celegans_chr3.fa","celegans_chr3_index");
Align read data to the reference. This may take a few minutes.
bowtie2("celegans_chr3_index","SRR11846824_1.fastq","SRR11846824_2.fastq","SRR11846824_mapped.sam");
Create a quality control plot for the SAM file. Note that, for this particular experiment, most of the reads happen to have the same quality score of 30.
seqqcplot("SRR11846824_mapped.sam");
Convert the SAM file to a BAM file. Suppress two informational warnings that are issued while creating a BioMap object.
w = warning;
warning("off","bioinfo:BioMap:BioMap:UnsortedReadsInSAMFile");
warning("off","bioinfo:saminfo:InvalidTagField");
bmObj = BioMap("SRR11846824_mapped.sam");
write(bmObj,"SRR11846824_mapped.bam",Format="BAM");
warning(w);
Visualize the alignment data in the Genomics Viewer app. The corresponding cytoband file is provided with the toolbox.
gv = genomicsViewer(ReferenceFile="celegans_chr3.fa",CytoBand="celegans_cytoBandIdeo.txt.gz");
addTracks(gv,"SRR11846824_mapped.bam");
Use the zoom slider to zoom in and see the features. Or you can enter the following in the search text box: Generated:3,711,861-3,711,940.
You may delete the downloaded files, such as the reference sequence file.
delete celegans_chr3.fa
Close the app.
close(gv);
References
[1] SRA Toolkit Development Team. https://github.com/ncbi/sra-tools/wiki/01.-Downloading-SRA-Toolkit
Version History
Introduced in R2024a