Audio Processing

Extend deep learning workflows with audio and speech processing applications

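Apply deep learning to audio and speech processing applications by using Audio Toolbox™ together with Deep Learning Toolbox™. For signal processing applications, see Signal Processing. For applications in wireless communications, see Wireless Communications.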

Apps

Signal Labeler: Label signal attributes, regions, and points of interest, and extract features

Functions

audioDatastore: Datastore for collection of audio files
audioDataAugmenter: Augment audio data (since R2019b)
audioFeatureExtractor: Streamline audio feature extraction (since R2019b)
openl3Embeddings: Extract OpenL3 feature embeddings (since R2022a)
pitchnn: Estimate pitch with deep learning neural network (since R2021a)
vggishEmbeddings: Extract VGGish feature embeddings (since R2022a)
yamnet: (Not recommended) YAMNet neural network (since R2020b)
classifySound: Classify sounds in audio signal (since R2020b)
crepe: (Not recommended) CREPE neural network (since R2021a)
vggish: (Not recommended) VGGish neural network (since R2020b)
openl3: (Not recommended) OpenL3 neural network (since R2021a)
vadnet: (Not recommended) Voice activity detection (VAD) neural network (since R2023a)
detectspeechnn: Detect boundaries of speech in audio signal using AI (since R2023a)
separateSpeakers: Separate signal by speakers (since R2023b)

Blocks

VGGish: VGGish embeddings extraction network (since R2022a)
VGGish Embeddings: Extract VGGish embeddings (since R2022a)
YAMNet: YAMNet sound classification network (since R2021b)
Sound Classifier: Classify sounds in audio signal (since R2021b)
OpenL3: OpenL3 embeddings extraction network (since R2022b)
OpenL3 Embeddings: Extract OpenL3 embeddings (since R2022b)
CREPE: CREPE deep pitch estimation neural network (since R2023a)
Deep Pitch Estimator: Estimate pitch with CREPE deep learning neural network (since R2023a)

Topics