extract

Extract audio features

Syntax

features = extract(aFE,audioIn)

features = extract(aFE,ds)

features = extract(aFE,ds,Name=Value)

Description

features = extract(aFE,audioIn) returns an array containing features of the audio input.

features = extract(aFE,ds) extracts features from all of the audio files in the audioDatastore object ds.

features = extract(aFE,ds,Name=Value) specifies options using one or more name-value arguments. For example, extract(aFE,ds,UseParallel=true) reads the data and extracts features in parallel.

Examples

Extract and Normalize Audio Features

Read in an audio signal.

[audioIn,fs] = audioread("Counting-16-44p1-mono-15secs.wav");

Create an audioFeatureExtractor to extract the centroid of the Bark spectrum, the kurtosis of the Bark spectrum, and the pitch of an audio signal.

aFE = audioFeatureExtractor("SampleRate",fs, ...
    "SpectralDescriptorInput","barkSpectrum", ...
    "spectralCentroid",true, ...
    "spectralKurtosis",true, ...
    "pitch",true)

aFE = 
  audioFeatureExtractor with properties:

   Properties
                     Window: [1024×1 double]
              OverlapLength: 512
                 SampleRate: 44100
                  FFTLength: []
    SpectralDescriptorInput: 'barkSpectrum'
        FeatureVectorLength: 3

   Enabled Features
     spectralCentroid, spectralKurtosis, pitch

   Disabled Features
     linearSpectrum, melSpectrum, barkSpectrum, erbSpectrum, mfcc, mfccDelta
     mfccDeltaDelta, gtcc, gtccDelta, gtccDeltaDelta, spectralCrest, spectralDecrease
     spectralEntropy, spectralFlatness, spectralFlux, spectralRolloffPoint, spectralSkewness, spectralSlope
     spectralSpread, harmonicRatio, zerocrossrate, shortTimeEnergy


   To extract a feature, set the corresponding property to true.
   For example, obj.mfcc = true, adds mfcc to the list of enabled features.

Call extract to extract the features from the audio signal. Normalize the features by their mean and standard deviation.

features = extract(aFE,audioIn);
features = (features - mean(features,1))./std(features,[],1);
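The same z-score normalization can be cross-checked against the built-in normalize function. This is a minimal sketch on a synthetic matrix: X is a hypothetical stand-in for the features array, not data from this example. It assumes the default "zscore" method of normalize, which centers each column to zero mean and scales it to unit standard deviation.

```matlab
% Synthetic stand-in for a numHops-by-numFeatures feature matrix
% (hypothetical data, with very different scales per column).
rng(0)
X = randn(100,3).*[1 10 100] + [5 50 500];

% Manual z-score along the first dimension, as in the example code.
zManual = (X - mean(X,1))./std(X,[],1);

% Built-in equivalent: "zscore" is the default method of normalize.
zBuiltin = normalize(X,1);

% Largest absolute difference between the two results.
maxDiff = max(abs(zManual - zBuiltin),[],"all")
```

Either form works on the output of extract for single-channel audio. The explicit form makes the normalization dimension visible, which helps when the feature array has a third (channel) dimension.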

Plot the normalized features over time.

idx = info(aFE);
duration = size(audioIn,1)/fs;

subplot(2,1,1)
t = linspace(0,duration,size(audioIn,1));
plot(t,audioIn)

subplot(2,1,2)
t = linspace(0,duration,size(features,1));
plot(t,features(:,idx.spectralCentroid), ...
     t,features(:,idx.spectralKurtosis), ...
     t,features(:,idx.pitch));
legend("Spectral Centroid","Spectral Kurtosis", "Pitch")
xlabel("Time (s)")

Figure: Two axes. The top axes show the audio signal. The bottom axes show the normalized spectral centroid, spectral kurtosis, and pitch over time in seconds.

Extract Features from Data Set

Create an audio datastore that points to audio samples included with Audio Toolbox™.

folder = fullfile(matlabroot,"toolbox","audio","samples");
ads = audioDatastore(folder);

Create an audioFeatureExtractor object to extract the mel spectrum, Bark spectrum, ERB spectrum, and linear spectrum from each audio file. Use the default analysis window and overlap length for the spectrum extraction.

aFE = audioFeatureExtractor(SampleRate=44.1e3, ...
    melSpectrum=true, ...
    barkSpectrum=true, ...
    erbSpectrum=true, ...
    linearSpectrum=true);

Call extract to extract the features from each audio file in the datastore. Specify SampleRateMismatchRule as "resample" to resample the audio files in the datastore if they do not match 44.1 kHz, the sample rate of the audioFeatureExtractor object. If you have Parallel Computing Toolbox™, specify UseParallel as true to read the files and extract the features in parallel.

specs = extract(aFE,ads,SampleRateMismatchRule="resample",UseParallel=true);

Starting parallel pool (parpool) using the 'Processes' profile ...
17-Dec-2024 09:28:59: Job Queued. Waiting for parallel pool job with ID 3 to start ...
Connected to parallel pool with 4 workers.

The specs variable is a numFiles-by-1 cell array, where numFiles is the number of files in the datastore. Each element of the cell array is a numHops-by-numFeatures-by-numChannels array. The number of hops and the number of channels depend on the length and channel count of the corresponding audio file, and the number of features is the total length of the feature vector requested from the audioFeatureExtractor object.

numFiles = numel(specs)

numFiles = 
39

[numHops1,numFeaturesFile1,numChannelsFile1] = size(specs{1})

numHops1 = 
1053

numFeaturesFile1 = 
620

numChannelsFile1 = 
1

[numHops2,numFeaturesFile2,numChannelsFile2] = size(specs{2})

numHops2 = 
1724

numFeaturesFile2 = 
620

numChannelsFile2 = 
4
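
The feature dimension concatenates all enabled features, so it can help to recover which columns belong to which spectrum. The following is a minimal sketch that uses info, as in the first example, to look up the column indices. The synthetic two-channel noise signal x is an assumption made to keep the sketch independent of the datastore; it is not part of the original example.

```matlab
% Same feature configuration as the datastore example.
aFE = audioFeatureExtractor(SampleRate=44.1e3, ...
    melSpectrum=true, ...
    barkSpectrum=true, ...
    erbSpectrum=true, ...
    linearSpectrum=true);

% Hypothetical input: one second of two-channel noise.
x = randn(44100,2);
feat = extract(aFE,x);

% info returns a struct that maps each enabled feature to its column indices.
idx = info(aFE);

% Slice out only the mel spectrum: numHops-by-numMelBands-by-numChannels.
melOnly = feat(:,idx.melSpectrum,:);

% The index ranges of the enabled features together cover every column.
totalColumns = numel(idx.melSpectrum) + numel(idx.barkSpectrum) + ...
    numel(idx.erbSpectrum) + numel(idx.linearSpectrum);
coversAllColumns = isequal(totalColumns,size(feat,2))
```

The same indexing applies to each cell of specs, for example specs{1}(:,idx.melSpectrum,:).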

Input Arguments

`aFE` — Input object
`audioFeatureExtractor` object

Input object, specified as an audioFeatureExtractor object.

`audioIn` — Input audio
column vector | matrix

Input audio, specified as a column vector or matrix of independent channels (columns).

Data Types: single | double

`ds` — Audio datastore
`audioDatastore` object

Audio datastore to extract features from, specified as an audioDatastore object.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: extract(aFE,ds,SampleRateMismatchRule="resample")

`UseParallel` — Read data and extract features in parallel
`false` (default) | `true`

Option to read data and extract features from the audioDatastore in parallel, specified as true or false. If you specify true, extract reads the data and extracts features using a pool of parallel workers. For more information on parallel pools, see parpool (Parallel Computing Toolbox).

This functionality requires Parallel Computing Toolbox™.

Data Types: logical
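
Because UseParallel depends on Parallel Computing Toolbox, a script that must also run on machines without it can decide the value at run time. The following is a sketch under the assumption that canUseParallelPool (a MATLAB function that reports whether a parallel pool can be used) is available in your release. The datastore folder is the one used in the example above.

```matlab
% Datastore over the sample files shipped with Audio Toolbox.
folder = fullfile(matlabroot,"toolbox","audio","samples");
ads = audioDatastore(folder);

aFE = audioFeatureExtractor(SampleRate=44.1e3,melSpectrum=true);

% True only if a parallel pool can actually be used on this machine.
useParallel = canUseParallelPool;

% Fall back to serial extraction when no pool is available.
feats = extract(aFE,ads, ...
    SampleRateMismatchRule="resample", ...
    UseParallel=useParallel);
```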

`SampleRateMismatchRule` — Behavior when sample rate does not match
`"error"` (default) | `"warn"` | `"resample"`

Behavior of the extract function when the sample rate of an audio file in the audioDatastore does not match the sample rate set on the audioFeatureExtractor object, specified as "error", "warn", or "resample".

"error" — Error immediately if there is a sample rate mismatch.
"warn" — Use the sample rate of the audioFeatureExtractor object and display a warning if the sample rate of any file does not match.
"resample" — If there is a mismatch, resample the audio data to match the sample rate of the audioFeatureExtractor object.

Data Types: char | string
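
Before choosing a rule, it can be useful to see which files would trigger a mismatch. The following is a minimal sketch that queries each file's sample rate with audioinfo and compares it to a target rate. The names targetFs, fileFs, and mismatched are illustrative, and the folder is the one used in the example above.

```matlab
folder = fullfile(matlabroot,"toolbox","audio","samples");
ads = audioDatastore(folder);

targetFs = 44.1e3;   % sample rate the extractor will be configured with

% Query the sample rate stored in each file's metadata.
numFiles = numel(ads.Files);
fileFs = zeros(numFiles,1);
for k = 1:numFiles
    fileInfo = audioinfo(ads.Files{k});
    fileFs(k) = fileInfo.SampleRate;
end

% Files that would error, warn, or be resampled, depending on the rule.
mismatched = ads.Files(fileFs ~= targetFs);
fprintf("%d of %d files differ from %g Hz\n", ...
    numel(mismatched),numFiles,targetFs)
```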

Output Arguments

`features` — Extracted audio features
vector | matrix | 3-D array | cell array

Extracted audio features, returned as an L-by-M-by-N array, where:

L –– Number of feature vectors (hops)
M –– Number of features extracted per analysis window
N –– Number of channels

If the input is an audioDatastore object, extract returns a cell array where each cell corresponds to an audio file and contains the extracted features from that file.

Data Types: single | double
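
The L-by-M-by-N layout can be checked directly on a small input. This is a sketch that assumes a synthetic two-channel noise signal and an arbitrary pair of enabled features; neither comes from the examples above. FeatureVectorLength is the read-only property shown in the object display earlier on this page.

```matlab
fs = 44.1e3;
audioIn = randn(fs,2);   % hypothetical input: 1 second, 2 channels

aFE = audioFeatureExtractor(SampleRate=fs, ...
    spectralCentroid=true, ...
    pitch=true);

features = extract(aFE,audioIn);

% L = hops, M = features per analysis window, N = channels.
[L,M,N] = size(features)

% M matches the extractor's feature vector length, and N matches the
% number of input columns.
matchesLength = (M == aFE.FeatureVectorLength)
matchesChannels = (N == size(audioIn,2))
```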

Extended Capabilities

GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.
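
A hedged sketch of GPU usage follows. It assumes that passing a gpuArray as audioIn is enough to run the extraction on the GPU, and it falls back to the CPU when canUseGPU reports that no supported GPU is available. The synthetic input signal is hypothetical.

```matlab
fs = 44.1e3;
audioIn = randn(fs,1,"single");   % hypothetical input: 1 second of noise

aFE = audioFeatureExtractor(SampleRate=fs,melSpectrum=true);

if canUseGPU
    % Move the signal to GPU memory, extract, then copy the result back.
    features = gather(extract(aFE,gpuArray(audioIn)));
else
    % No supported GPU available: extract on the CPU instead.
    features = extract(aFE,audioIn);
end

numFeatures = size(features,2)
```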

Version History

Introduced in R2019b

R2023b: Extract features from audio signals stored in `audioDatastore`

Pass an audioDatastore object to the extract function to extract features from all audio files in the datastore.
