mattest

Two-sample t-test to evaluate differential expression of genes from two experimental conditions or phenotypes

Description

pvalues = mattest(xdata,ydata) performs a standard two-tailed, unpaired two-sample t-test for differential expression on every gene in xdata and ydata, and returns a p-value for each gene.

xdata contains data from one experimental condition and ydata contains data from another experimental condition. For example, xdata could be expression values from cancer cells, and ydata could be expression values from normal cells.

xdata and ydata must have the same number of rows and are assumed to be normally distributed in each class with equal variances.
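To make the assumptions above concrete, here is a minimal sketch of the textbook equal-variance (pooled) two-sample t-test computed row by row in base MATLAB. It is not taken from the mattest source, and the data values are hypothetical. The two-tailed p-value uses the standard identity between the t distribution and the regularized incomplete beta function, so no toolbox is required.

```matlab
% Hypothetical toy data: 3 genes (rows), 4 replicates per condition (columns).
xdata = [5.1 4.9 5.3 5.0; 7.2 7.0 7.4 7.1; 2.0 2.1 1.9 2.2];
ydata = [5.0 5.2 4.8 5.1; 8.3 8.1 8.5 8.2; 2.1 2.0 2.2 1.9];

nx = size(xdata,2);   % replicates in condition 1
ny = size(ydata,2);   % replicates in condition 2

% Pooled variance and t statistic, one value per gene (row).
df  = nx + ny - 2;
sp2 = ((nx-1)*var(xdata,0,2) + (ny-1)*var(ydata,0,2)) / df;
t   = (mean(xdata,2) - mean(ydata,2)) ./ sqrt(sp2*(1/nx + 1/ny));

% Two-tailed p-value: P(|T| > |t|) = I_{df/(df+t^2)}(df/2, 1/2).
p = betainc(df ./ (df + t.^2), df/2, 0.5);

disp([t p])
```

With these values, the second gene has a much larger difference in means than the first, so its p-value is smaller.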

pvalues = mattest(xdata,ydata,Name=Value) specifies options using one or more name-value arguments in addition to the input arguments in the previous syntax. For example, to display a normal t-score quantile plot, set ShowPlot to true.

[pvalues,tscores] = mattest(___) also returns a t-score for each gene in xdata and ydata. Specify any of the input argument combinations in the previous syntaxes.

[pvalues,tscores,dfs] = mattest(___) also returns the degrees of freedom for each gene in xdata and ydata. Specify any of the input argument combinations in the previous syntaxes.
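As a quick illustration of the three-output syntax, the following sketch calls mattest on two small matrices (three genes by four replicates). The values are made up for illustration, and the code assumes that Bioinformatics Toolbox is installed.

```matlab
% Hypothetical toy data: rows are genes, columns are replicates.
xdata = [5.1 4.9 5.3 5.0; 7.2 7.0 7.4 7.1; 2.0 2.1 1.9 2.2];
ydata = [5.0 5.2 4.8 5.1; 8.3 8.1 8.5 8.2; 2.1 2.0 2.2 1.9];

% Request p-values, t-scores, and degrees of freedom in one call.
[pvalues,tscores,dfs] = mattest(xdata,ydata);

% Both inputs are plain matrices, so each output is a column vector
% with one entry per gene; collect them side by side for inspection.
results = [pvalues tscores dfs];
disp(results)
```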

Examples

Load the Affymetrix® data from a prostate cancer study. The data is probe intensity data from Affymetrix HG-U133A GeneChip® arrays. A Bioinformatics Toolbox™ MAT file named prostatecancerexpdata contains the data.

load prostatecancerexpdata

Variables named dependentData and independentData provide two matrices of gene expression values from two experimental conditions.

Calculate the p-values and t-scores for the gene expression values in the two matrices. Display a normal t-score quantile plot.

[pv,ts] = mattest(dependentData,independentData,ShowPlot=true);

[Figure: Normal Quantile Plot of t. x-axis: Theoretical quantile; y-axis: Sample quantile. Plotted series: Quantile, Significant, Diagonal.]

Calculate the p-values and t-scores using permutation tests. Setting Permute to true runs 1000 permutations. Display histograms of the t-score and p-value distributions.

[permpv,permts] = mattest(dependentData,independentData,...
    Permute=true,ShowHist=true);

[Figure: two histograms. First panel: t-scores (x-axis: t-score; y-axis: Frequency). Second panel: p-values (x-axis: p-value; y-axis: Frequency).]

Calculate the p-values and t-scores using 2000 bootstrap tests. Display histograms of t-score and p-value distributions.

[bootpv,bootts] = mattest(dependentData,independentData,...
    Bootstrap=2000,ShowHist=true);

[Figure: two histograms. First panel: t-scores (x-axis: t-score; y-axis: Frequency). Second panel: p-values (x-axis: p-value; y-axis: Frequency).]

The prostatecancerexpdata.mat file used in this example contains data from Best et al., 2005 [2].

Input Arguments

xdata

Gene expression data from one experimental condition, specified as a DataMatrix object or a matrix of gene expression values where each row corresponds to a gene and each column corresponds to a replicate.

xdata and ydata must have the same number of rows and are assumed to be normally distributed in each class with equal variances.

ydata

Gene expression data from another experimental condition, specified as a DataMatrix object or a matrix of gene expression values where each row corresponds to a gene and each column corresponds to a replicate.

xdata and ydata must have the same number of rows and are assumed to be normally distributed in each class with equal variances.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: mattest(xdata,ydata,Bootstrap=2000,ShowHist=true);

VarType

Variance type of the test, specified as "unequal" or "equal".

  • "unequal" — Perform the test assuming the two samples have unknown and unequal variances.

  • "equal" — Perform the test assuming the two samples have equal variances.
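For the unequal-variance case, the textbook approach is Welch's t-test with the Welch-Satterthwaite approximation for the degrees of freedom, which is one reason the degrees of freedom can differ from gene to gene. The following base MATLAB sketch shows that calculation on hypothetical data. It illustrates the standard formulas and is not taken from the mattest source.

```matlab
% Hypothetical toy data: 2 genes (rows); condition 2 is noisier than condition 1.
xdata = [5.1 4.9 5.3 5.0; 7.2 7.0 7.4 7.1];
ydata = [5.9 4.2 6.3 4.6 5.5; 8.9 7.4 9.6 7.0 8.1];

nx = size(xdata,2);
ny = size(ydata,2);

% Per-gene squared standard errors of each sample mean.
vx = var(xdata,0,2) / nx;
vy = var(ydata,0,2) / ny;

% Welch t statistic and Welch-Satterthwaite degrees of freedom.
t  = (mean(xdata,2) - mean(ydata,2)) ./ sqrt(vx + vy);
df = (vx + vy).^2 ./ (vx.^2/(nx-1) + vy.^2/(ny-1));

% Two-tailed p-value via the regularized incomplete beta function.
p = betainc(df ./ (df + t.^2), df/2, 0.5);

disp([t df p])
```

The approximate degrees of freedom always fall between min(nx,ny)-1 and nx+ny-2, and they are generally not integers.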

Permute

Permutation testing, specified as true, false, or an integer greater than 2.

When set to true, the number of permutations is 1000.

To specify how many permutation tests run, specify an integer greater than 2.
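The general idea of a permutation test can be sketched in base MATLAB for a single gene: shuffle the condition labels many times, recompute the t statistic each time, and report how often the shuffled statistic is at least as extreme as the observed one. This is a generic illustration on hypothetical data, not the algorithm that mattest uses internally. The helper name tstat and the (count+1)/(nperm+1) estimator are choices made for this sketch.

```matlab
rng(0);  % fix the random seed so the sketch is repeatable

% Hypothetical expression values for one gene in two conditions.
x = [7.2 7.0 7.4 7.1];
y = [8.3 8.1 8.5 8.2];
nx = numel(x);
pooled = [x y];

% Equal-variance two-sample t statistic (hypothetical helper).
tstat = @(a,b) (mean(a) - mean(b)) / ...
    sqrt(((numel(a)-1)*var(a) + (numel(b)-1)*var(b)) / ...
    (numel(a) + numel(b) - 2) * (1/numel(a) + 1/numel(b)));

tobs  = tstat(x,y);
nperm = 1000;
count = 0;
for k = 1:nperm
    idx = randperm(numel(pooled));            % shuffle the labels
    tk  = tstat(pooled(idx(1:nx)), pooled(idx(nx+1:end)));
    % Small tolerance so reorderings of the observed split count as ties.
    count = count + (abs(tk) >= abs(tobs) - 1e-12);
end

% Two-sided permutation p-value; the +1 terms avoid reporting exactly zero.
p = (count + 1) / (nperm + 1);
disp(p)
```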

Bootstrap

Bootstrap testing, specified as true, false, or an integer greater than 2.

When set to true, the number of bootstrap tests is 1000.

To specify how many bootstrap tests run, specify an integer greater than 2.
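A bootstrap test differs from a permutation test mainly in that it resamples with replacement. The following base MATLAB sketch shows one common variant for a single gene: draw both groups with replacement from the pooled data, which mimics the null hypothesis of no difference, and compare the resampled t statistics with the observed one. The data, the helper name tstat, and the choice of pooled resampling are assumptions of this sketch and are not taken from mattest.

```matlab
rng(0);  % fix the random seed so the sketch is repeatable

% Hypothetical expression values for one gene in two conditions.
x = [7.2 7.0 7.4 7.1 6.9 7.3];
y = [8.3 8.1 8.5 8.2 8.0 8.4];
nx = numel(x);
pooled = [x y];
n = numel(pooled);

% Equal-variance two-sample t statistic (hypothetical helper).
tstat = @(a,b) (mean(a) - mean(b)) / ...
    sqrt(((numel(a)-1)*var(a) + (numel(b)-1)*var(b)) / ...
    (numel(a) + numel(b) - 2) * (1/numel(a) + 1/numel(b)));

tobs  = tstat(x,y);
nboot = 2000;
count = 0;
for k = 1:nboot
    idx = randi(n, 1, n);                     % sample indices with replacement
    tk  = tstat(pooled(idx(1:nx)), pooled(idx(nx+1:end)));
    % A NaN statistic (possible if a resample has zero variance) counts as not extreme.
    count = count + (abs(tk) >= abs(tobs));
end

% Two-sided bootstrap p-value estimate.
p = (count + 1) / (nboot + 1);
disp(p)
```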

ShowHist

Histogram of t-score and p-value distributions, specified as true or false.

To display a histogram, set this argument to true.

[Figure: Example histogram plots of t-scores and p-values]

ShowPlot

Normal t-score quantile plot, specified as true or false.

The black diagonal line represents the sample quantile being equal to the theoretical quantile.

Data points of genes considered to be differentially expressed lie farther away from this line. Specifically, data points with t-scores greater than (1-1/(2N)) or less than 1/(2N) display with red circles. N is the total number of genes.

[Figure: Example normal t-score quantile plot]

Labels

Labels such as gene names or probe set IDs, specified as a cell array of character vectors or a string vector with one element for each row in xdata and ydata. To display a label, click a data point in the t-score quantile plot.

Output Arguments

P-values for each gene, returned as a DataMatrix object or column vector.

  • When at least one input is a DataMatrix object, the output is a DataMatrix object with row names the same as the first input DataMatrix object and a column name of p-values.

  • When both inputs are matrices, the output is a column vector of p-values for each gene in xdata and ydata.

T-scores for each gene in xdata and ydata, returned as a column vector.

Degrees of freedom for each gene in xdata and ydata, returned as a column vector.

References

[1] Huber, W., von Heydebreck, A., Sültmann, H., Poustka, A., and Vingron, M. (2002). Variance stabilization applied to microarray data calibration and to the quantification of differential expression. Bioinformatics 18 (Suppl. 1), S96–S104.

[2] Best, C.J.M., Gillespie, J.W., Yi, Y., Chandramouli, G.V.R., Perlmutter, M.A., Gathright, Y., Erickson, H.S., Georgevich, L., Tangrea, M.A., Duray, P.H., Gonzalez, S., Velasco, A., Linehan, W.M., Matusik, R.J., Price, D.K., Figg, W.D., Emmert-Buck, M.R., and Chuaqui, R.F. (2005). Molecular alterations in primary prostate cancer after androgen ablation therapy. Clinical Cancer Research 11, 6823–6834.

Version History

Introduced in R2006a