Pretrained Networks

Transfer learning, sound classification, feature embeddings

Audio Toolbox™ provides the pretrained VGGish and YAMNet networks. To interact directly with the pretrained networks, use the vggish and yamnet functions in MATLAB® and the YAMNet block in Simulink®. The classifySound function in MATLAB and the Sound Classifier block in Simulink perform the preprocessing and postprocessing that YAMNet requires, so that you can locate sounds in a signal and classify each of them into one of 521 categories. The Sound Classifier block is equivalent to a cascade of the YAMNet Preprocess block and the YAMNet block. To explore the YAMNet ontology, use the yamnetGraph function. The vggishFeatures function performs the preprocessing and postprocessing that VGGish requires, so that you can extract feature embeddings to use as input to machine learning and deep learning systems.
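
The workflow described above can be sketched in MATLAB as follows. This is a minimal, illustrative sketch rather than a reference implementation. It assumes that Audio Toolbox and Deep Learning Toolbox are installed and that the pretrained YAMNet and VGGish models are available on the path. The synthetic noise signal, the 16 kHz sample rate, and the variable names are assumptions chosen for illustration. The low-level path assumes that yamnet returns a network object that predict accepts directly, which may differ between releases.

```matlab
% Illustrative input: 5 seconds of low-level white noise (an assumption;
% substitute your own audio signal and sample rate).
fs = 16e3;
audioIn = 0.1*randn(5*fs, 1);

% High-level path: classifySound performs the YAMNet preprocessing and
% postprocessing and returns detected sound labels with their time regions.
% The result can be empty if no class is detected in the signal.
[sounds, timestamps] = classifySound(audioIn, fs);

% Low-level path: preprocess the audio, then call the network directly.
net = yamnet;
features = yamnetPreprocess(audioIn, fs);  % mel spectrograms, one per analysis window
scores = predict(net, features);           % one row of class scores per window

% Explore the ontology: ygraph is a directed graph of the AudioSet ontology,
% and classes lists the sound classes that YAMNet supports.
[ygraph, classes] = yamnetGraph;

% Extract VGGish feature embeddings, one row per analysis window.
embeddings = vggishFeatures(audioIn, fs);
```

The classifySound path is the shorter route when you only need labels and time stamps. The yamnetPreprocess and predict path exposes the per-window scores, which is more useful as a starting point for transfer learning.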

This functionality requires Deep Learning Toolbox™.

Functions

vggishFeatures - Extract VGGish features
vggish - VGGish neural network
vggishPreprocess - Preprocess audio for VGGish feature extraction
classifySound - Classify sounds in audio signal
yamnet - YAMNet neural network
yamnetGraph - Graph of YAMNet AudioSet ontology
yamnetPreprocess - Preprocess audio for YAMNet classification
openl3 - OpenL3 neural network
openl3Preprocess - Preprocess audio for OpenL3 feature extraction
openl3Features - Extract OpenL3 features
crepe - CREPE neural network
crepePreprocess - Preprocess audio for CREPE deep learning network
crepePostprocess - Postprocess output of CREPE deep learning network
pitchnn - Estimate pitch with deep learning neural network
ivectorSystem - Create i-vector system
speakerRecognition - Pretrained speaker recognition system

Blocks

Sound Classifier - Classify sounds in audio signal
YAMNet - YAMNet sound classification network
YAMNet Preprocess - Preprocess audio for YAMNet classification

Featured Examples