Pretrained Models
Audio Toolbox™ provides MATLAB® and Simulink® support for pretrained audio deep learning networks.
Locate and classify sounds with YAMNet and estimate pitch with CREPE.
Extract VGGish or OpenL3 feature embeddings to input to machine learning
and deep learning systems. Use i-vector systems to produce compact
representations of audio signals for applications such as speaker
recognition, verification, identification, and diarization. Use detectspeechnn to perform voice activity detection (VAD).
Using pretrained deep learning networks requires Deep Learning Toolbox™. The Audio Toolbox pretrained networks are available in Deep Network Designer (Deep Learning Toolbox).
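As a rough illustration of the workflows listed above, the following minimal sketch calls several of the pretrained-network functions on one recording. It assumes that Audio Toolbox and Deep Learning Toolbox are installed, that the pretrained models have already been downloaded, and that the sample file named here is on the MATLAB path. The file name is an assumption; substitute any mono recording of your own. Function availability and default options can vary by release, so treat this as a starting point rather than a reference implementation.

```matlab
% Read an audio recording. The file name is an assumption; replace it
% with any audio file available on your path.
[audioIn, fs] = audioread("Counting-16-44p1-mono-15secs.wav");

% Identify the sounds present in the signal using YAMNet.
sounds = classifySound(audioIn, fs);

% Estimate the fundamental frequency (pitch) over time using CREPE.
f0 = pitchnn(audioIn, fs);

% Extract VGGish feature embeddings for use in a downstream model.
embeddings = vggishEmbeddings(audioIn, fs);

% Detect regions of speech (voice activity detection).
roi = detectspeechnn(audioIn, fs);
```

Each call takes the audio samples and the sample rate. The returned embeddings can then be passed to a machine learning or deep learning model of your choice.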
Functions
Blocks
Apps
Deep Network Designer | Design and visualize deep learning networks
Topics
- Audio Transfer Learning Using Experiment Manager
Configure an experiment that compares the performance of multiple pretrained networks applied to a speech command recognition task using transfer learning.
- Speaker Diarization Using Pretrained AI Models
Use the speakerEmbeddings function to extract compact speaker representations and perform speaker diarization. (Since R2024b)
- Classify Human Voice Using YAMNet on Android Device (Simulink)
This example shows how to use the Simulink® Support Package for Android® Devices and a pretrained YAMNet network to classify human voices.