summary
(Not Recommended) Print summary of dataset array
The dataset data type is not recommended. To work with heterogeneous data, use the MATLAB® table data type instead. See MATLAB table documentation for more information.
Syntax
summary(A)
s = summary(A)
Description
summary(A) prints a summary of the dataset array A and the variables that it contains.
s = summary(A) returns a scalar structure s that contains a summary of the dataset array A and the variables that A contains. For more information on the fields in s, see Output Arguments.
Summary information depends on the type of the variables in the dataset array:
For numeric variables, summary computes a five-number summary of the data, giving the minimum, the first quartile, the median, the third quartile, and the maximum.
For logical variables, summary counts the number of trues and falses in the data.
For categorical variables, summary counts the number of observations at each level.
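The per-type rules above can be approximated by hand. The following is a minimal sketch, not the implementation that summary uses internally. The sample values are made up for illustration, and the choice of quantile, nnz, getlabels, and levelcounts is an assumption about how comparable numbers could be obtained.

```matlab
% Hand-rolled approximations of the per-type summaries (illustrative only).
% quantile, nominal, getlabels, and levelcounts require
% Statistics and Machine Learning Toolbox.

% Numeric: five-number summary via quantiles at 0, 0.25, 0.5, 0.75, and 1.
x = [2 4 4 5 7 9 10 12];                     % made-up sample data
fiveNum = quantile(x, [0 0.25 0.5 0.75 1]);  % [min, 1st Q, median, 3rd Q, max]

% Logical: count the true and false entries.
tf = [true false true true];                 % made-up sample data
tfCounts = [nnz(tf) nnz(~tf)];               % [number of trues, number of falses]

% Categorical: count the observations at each level.
g = nominal({'a'; 'b'; 'a'});                % made-up sample data
levelNames = getlabels(g);                   % level labels, in level order
perLevel = levelcounts(g);                   % counts, in the same order as levelNames
```

Calling summary on a dataset array remains the simpler route. The sketch only makes explicit what each kind of summary measures.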
Output Arguments
The following list describes the fields in the structure s:
Description: A character array containing the dataset description.
Variables: A structure array with one element for each dataset variable in A. Each element has the following fields:
    Name: A character vector containing the name of the variable.
    Description: A character vector containing the variable's description.
    Units: A character vector containing the variable's units.
    Size: A numeric vector containing the size of the variable.
    Class: A character vector containing the class of the variable.
    Data: A scalar structure containing the following fields.
        For numeric variables:
            Probabilities: A numeric vector containing the probabilities [0.0 .25 .50 .75 1.0] and NaN (if any are present in the corresponding dataset variable).
            Quantiles: A numeric vector containing the values that correspond to 'Probabilities' for the corresponding dataset variable, and a count of NaNs (if any are present).
        For logical variables:
            Values: The logical vector [true false].
            Counts: A numeric vector of counts for each logical value.
        For categorical variables:
            Levels: A cell array containing the labels for each level of the corresponding dataset variable.
            Counts: A numeric vector of counts for each level.
        'Data' is empty if the variable is not numeric, categorical, or logical. If a dataset variable has more than one column, then the corresponding 'Quantiles' or 'Counts' field is a matrix or an array.
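As a rough sketch of how the fields described above might be read programmatically, the following builds the iris dataset array used in the Examples section and inspects the returned structure. Looking variables up by name with strcmp is an illustrative choice, and the variable names 'species' and 'meas' are assumed from the printed summary in that example.

```matlab
% Requires Statistics and Machine Learning Toolbox (dataset, nominal).
load fisheriris                      % provides meas and species
species = nominal(species);
data = dataset(species, meas);

s = summary(data);                   % return the summary as a structure

% List the name and class of every variable in the dataset array.
for k = 1:numel(s.Variables)
    fprintf('%s (%s)\n', s.Variables(k).Name, s.Variables(k).Class);
end

% Numeric variable: probabilities and the matching quantiles.
isMeas = strcmp({s.Variables.Name}, 'meas');
measInfo = s.Variables(isMeas).Data;
disp(measInfo.Probabilities)
disp(measInfo.Quantiles)

% Categorical variable: level labels and the count at each level.
isSpecies = strcmp({s.Variables.Name}, 'species');
speciesInfo = s.Variables(isSpecies).Data;
disp(speciesInfo.Levels)
disp(speciesInfo.Counts)
```

Because meas has four columns, its Quantiles field is a matrix, not a vector.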
Examples
Summarize Fisher's iris data:
load fisheriris
species = nominal(species);
data = dataset(species,meas);
summary(data)

species: [150x1 nominal]
    setosa    versicolor    virginica
        50            50           50

meas: [150x4 double]
    min       4.3000         2         1    0.1000
    1st Q     5.1000    2.8000    1.6000    0.3000
    median    5.8000         3    4.3500    1.3000
    3rd Q     6.4000    3.3000    5.1000    1.8000
    max       7.9000    4.4000    6.9000    2.5000
Summarize the data in hospital.mat:
load hospital
summary(hospital)

Dataset array created from the data file hospital.dat.
The first column of the file ("id") is used for observation names. Other columns ("sex" and "smoke") have been converted from their original coded values into categorical and logical variables. Two sets of columns ("sys" and "dia", "trial1" through "trial4") have been combined into single variables with multivariate observations. Column headers have been replaced with more descriptive variable names. Units have been added where appropriate.

LastName: [100x1 cell array of character vectors]

Sex: [100x1 nominal]
    Female    Male
        53      47

Age: [100x1 double, Units = Yrs]
    min    1st Q    median    3rd Q    max
     25       32        39       44     50

Weight: [100x1 double, Units = Lbs]
    min    1st Q       median      3rd Q       max
    111    130.5000    142.5000    180.5000    202

Smoker: [100x1 logical]
    true    false
      34       66

BloodPressure: [100x2 double, Units = mm Hg]
Systolic/Diastolic
    min            109         68
    1st Q     117.5000    77.5000
    median         122    81.5000
    3rd Q     127.5000         89
    max            138         99

Trials: [100x1 cell, Units = Counts]
From zero to four measurement trials performed