bioinfo.pipeline.block.CuffDiff

Bioinformatics pipeline block to identify significant changes in transcript expression

Since R2023a

Description

A CuffDiff block enables you to identify significant changes in transcript expression between samples.

The block requires the Cufflinks Support Package for the Bioinformatics Toolbox™. If the support package is not installed, then a download link is provided. For details, see Bioinformatics Toolbox Software Support Packages.

Creation

Syntax

b = bioinfo.pipeline.block.CuffDiff

b = bioinfo.pipeline.block.CuffDiff(options)

b = bioinfo.pipeline.block.CuffDiff(Name=Value)

Description

b = bioinfo.pipeline.block.CuffDiff creates a CuffDiff block.

b = bioinfo.pipeline.block.CuffDiff(options) creates a CuffDiff block and specifies additional options using options.

b = bioinfo.pipeline.block.CuffDiff(Name=Value) specifies additional options as the property names and values of a CuffDiffOptions object. This object is set as the value of the Options property of the block.

Input Arguments

`options` — CuffDiff options
`CuffDiffOptions` | string | character vector

CuffDiff options, specified as a CuffDiffOptions object, string, or character vector.

If you are specifying a string or character vector, it must be in the CuffDiff native syntax (prefixed by one or two dashes) [1].
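As a rough illustration of the two ways to pass options, the following sketch creates one block from a native-syntax string and another from name-value arguments. It assumes that your installed cuffdiff accepts the `--FDR` and `--num-threads` flags and that `FalseDiscoveryRate` and `NumThreads` are the corresponding CuffDiffOptions property names. Check the CuffDiffOptions reference before relying on either.

```matlab
import bioinfo.pipeline.block.CuffDiff

% Options in native cuffdiff syntax (flags prefixed by one or two dashes).
b1 = CuffDiff("--FDR 0.01 --num-threads 4");

% The same settings as name-value arguments.
% ASSUMPTION: FalseDiscoveryRate and NumThreads are CuffDiffOptions properties.
b2 = CuffDiff(FalseDiscoveryRate=0.01, NumThreads=4);

% Name-value settings are stored in the Options property of the block.
disp(b2.Options)
```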

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Note

The following list of arguments is a partial list. For the complete list, refer to the properties of the CuffDiffOptions object.

`ConditionLabels` — Sample labels
string | string vector | character vector | cell array of character vectors

Sample labels, specified as a string, string vector, character vector, or cell array of character vectors. The number of labels must equal the number of samples, or the value must be empty ([]).

Example: ["Control","Mutant1","Mutant2"]

Data Types: string | char | cell

`ContrastFile` — Contrast file name
string | character vector

Contrast file name, specified as a string or character vector. The file must be a two-column tab-delimited text file, where each line indicates two conditions to compare using cuffdiff. The condition labels in the file must match either the labels specified for ConditionLabels or the sample names. The file must have a single header line as the first line, followed by one line for each contrast. An example of the contrast file format follows.

condition_A	condition_B
Control	Mutant1
Control	Mutant2

If you do not provide this file, cuffdiff compares every pair of input conditions, which can impact performance.

Example: "contrast.txt"

Data Types: char | string
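The contrast file described above can be written with standard file I/O. The following is a minimal sketch, not a prescribed workflow: the file name `contrast.txt` and the three labels are taken from the examples in this section, and the variable names are illustrative.

```matlab
% Contrasts to test: each row is one pair of condition labels.
contrasts = ["Control" "Mutant1"; "Control" "Mutant2"];

% Write the single header line, then one tab-delimited line per contrast.
fid = fopen("contrast.txt","w");
assert(fid ~= -1,"Could not open contrast.txt for writing.");
fprintf(fid,"condition_A\tcondition_B\n");
for k = 1:size(contrasts,1)
    fprintf(fid,"%s\t%s\n",contrasts(k,1),contrasts(k,2));
end
fclose(fid);

% Pass labels that match the file, along with the file name, to the block.
b = bioinfo.pipeline.block.CuffDiff( ...
    ConditionLabels=["Control","Mutant1","Mutant2"], ...
    ContrastFile="contrast.txt");
```

With this file, cuffdiff would compare only Control against each mutant rather than every pair of conditions.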

Properties

`ErrorHandler` — Function to handle errors from `run` method
function handle

Function to handle errors from the run method of the block, specified as a function handle. The handle specifies the function to call if the run method encounters an error within a pipeline. For the pipeline to continue after a block fails, ErrorHandler must return a structure that is compatible with the output ports of the block. The error handling function is called with the following two inputs:

First input: a structure with these fields:

Field	Description
identifier	Identifier of the error that occurred
message	Text of the error message
index	Linear index indicating which block process failed in the parallel run. By default, the index is 1 because there is only one run per block. For details on how block inputs can be split across different dimensions for multiple run calls, see Bioinformatics Pipeline SplitDimension.

Second input: the input structure that was passed to the run method when it failed.

Data Types: function_handle
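To make the handler signature concrete, the following sketch builds a function handle that takes the two inputs described above and returns a structure with one field per CuffDiff output port. The placeholder value is an assumption: this page does not state which values count as compatible with the output ports, so treat the empty strings as illustrative only.

```matlab
% Names of the CuffDiff output ports, as listed under the Outputs property.
portNames = {'IsoformDiffFile','GeneDiffFile','TSSDiffFile', ...
    'CDSExpDiffFile','SplicingDiffFile','CDSDiffFile','PromotersDiffFile'};

% ASSUMPTION: an empty string is an acceptable placeholder for each port.
placeholders = repmat({""},numel(portNames),1);

% errInfo has the fields identifier, message, and index.
% blockInputs is the structure that was passed to the run method.
handler = @(errInfo,blockInputs) cell2struct(placeholders,portNames(:),1);

% Attach the handler to a block.
CD = bioinfo.pipeline.block.CuffDiff;
CD.ErrorHandler = handler;
```

A real handler would typically also log `errInfo.message` before returning, so that the failure is not silently ignored.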

`Inputs` — Input ports
Read-only: structure

This property is read-only.

Input ports of the block, specified as a structure. The field names of the structure are the names of the block input ports, and the field values are bioinfo.pipeline.Input objects. These objects describe the input port behaviors. The input port names are the expected field names of the input structure that you pass to the block run method.

The CuffDiff block Inputs structure has the following fields:

GenomicAnnotationFile — Name of the transcript annotation file. The file can be a GTF or GFF file produced by Cufflinks, CuffCompare, or another source of GTF annotations. This input is required.
GenomicAlignmentFiles — Names of SAM, BAM, or CXB files containing alignment records for each sample. This input is required.

The default value for each input field is a bioinfo.pipeline.datatypes.Unset object, which means that the input value is not set yet.

Data Types: struct
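The port names can be inspected programmatically. The following sketch lists the input port names and creates an unfilled input structure using the emptyInputs function listed under Object Functions. It does not show which value types each field accepts, which this page does not specify.

```matlab
CD = bioinfo.pipeline.block.CuffDiff;

% The input port names are the field names of the Inputs property.
disp(fieldnames(CD.Inputs))

% emptyInputs returns a structure with one field per input port,
% to be filled in before calling the run method of the block.
in = emptyInputs(CD);
disp(fieldnames(in))
```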

`Outputs` — Output ports
Read-only: structure

This property is read-only.

Output ports of the block, specified as a structure. The field names of the structure are the names of the block output ports, and the field values are bioinfo.pipeline.Output objects. These objects describe the output port behaviors. The field names of the output structure returned by the block run method are the same as the output port names.

The CuffDiff block Outputs structure has the following fields:

IsoformDiffFile — Name of a file containing transcript-level differential expression results.
GeneDiffFile — Name of a file containing gene-level differential expression results.
TSSDiffFile — Name of a file containing primary transcript differential expression results.
CDSExpDiffFile — Name of a file containing coding sequence differential expression results.
SplicingDiffFile — Name of a file containing differential splicing results for isoforms.
CDSDiffFile — Name of a file containing differential coding sequence output.
PromotersDiffFile — Name of a file containing information on differential promoter use that exists between samples.

Tip

To see the actual location of these files, first get the results of the block. Then use the unwrap method, as shown in the Examples section.

Data Types: struct

`Options` — CuffDiff options
`CuffDiffOptions` object (default)

CuffDiff options, specified as a CuffDiffOptions object. The default value is a default CuffDiffOptions object.

Object Functions

`compile`	Perform block-specific additional checks and validations
`copy`	Copy array of handle objects
`emptyInputs`	Create input structure for use with `run` method
`eval`	Evaluate block object
`run`	Run block object

Examples

Use `CuffDiff` Block to Perform Differential Expression

Perform differential expression testing of transcripts using the provided SAM files, which contain aligned reads from Mycoplasma pneumoniae from two samples.

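import bioinfo.pipeline.block.*
import bioinfo.pipeline.Pipeline

% Select the annotation file and the two SAM files, and create the CuffDiff block.
FC1 = FileChooser(which("gyrAB.gtf"));
samFiles = {which("Myco_1_1.sam"),which("Myco_1_2.sam")};
FC2 = FileChooser(samFiles);
CD = CuffDiff;

% Build the pipeline and connect the file choosers to the CuffDiff input ports.
P = Pipeline;
addBlock(P,[FC1,FC2,CD]);
connect(P,FC1,CD,["Files","GenomicAnnotationFile"]);
connect(P,FC2,CD,["Files","GenomicAlignmentFiles"]);

% Run the pipeline and get the results of the CuffDiff block.
run(P);
R = results(P,CD)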

R = 

  struct with fields:

      IsoformDiffFile: [1×1 bioinfo.pipeline.datatypes.File]
         GeneDiffFile: [1×1 bioinfo.pipeline.datatypes.File]
          TSSDiffFile: [1×1 bioinfo.pipeline.datatypes.File]
       CDSExpDiffFile: [1×1 bioinfo.pipeline.datatypes.File]
     SplicingDiffFile: [1×1 bioinfo.pipeline.datatypes.File]
          CDSDiffFile: [1×1 bioinfo.pipeline.datatypes.File]
    PromotersDiffFile: [1×1 bioinfo.pipeline.datatypes.File]

Call unwrap on each field of the result structure R to see the location of each output file. For example, to see the location of IsoformDiffFile, enter the following.

unwrap(R.IsoformDiffFile)

ans = 

    "C:\PipelineResults\CuffDiff_1\1\isoform_exp.diff"

References

[1] Trapnell, Cole, Brian A Williams, Geo Pertea, Ali Mortazavi, Gordon Kwan, Marijke J van Baren, Steven L Salzberg, Barbara J Wold, and Lior Pachter. “Transcript Assembly and Quantification by RNA-Seq Reveals Unannotated Transcripts and Isoform Switching during Cell Differentiation.” Nature Biotechnology 28, no. 5 (May 2010): 511–15.

Version History

Introduced in R2023a
