speechClient
Description
Use a speechClient object to interface with a wav2vec 2.0 pretrained speech-to-text model or third-party cloud-based speech services. Use the object with speech2text or text2speech.
Note
To use speechClient to interface with third-party speech services, you must download the extended Audio Toolbox™ functionality from File Exchange. The File Exchange submission includes a tutorial to get started with the third-party services.
Using wav2vec 2.0 requires Deep Learning Toolbox™ and installing the pretrained model.
Creation
Description
clientObj = speechClient(___,Name=Value) sets Properties using one or more name-value arguments.
Input Arguments
Output Arguments
Properties
Object Functions
Note
For the third-party speech services, you can configure server-specific options using the following functions. See the documentation for the specific service for option names and values.
setOptions | Set server options |
getOptions | Get server options |
clearOptions | Remove all server options |
Examples
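The following is a minimal sketch of transcribing a recording with the wav2vec 2.0 model, not an official example. It makes several assumptions that this page does not confirm: that the client is created with the name "wav2vec2.0", that speech2text accepts the client object followed by the audio and sample rate, and that the sample file Counting-16-44p1-mono-15secs.wav is on the MATLAB path. Check the speech2text reference page for your release, because the argument order may differ.

```matlab
% Read a speech recording (assumed sample file; substitute your own audio).
[audioIn,fs] = audioread("Counting-16-44p1-mono-15secs.wav");

% Create a client for the wav2vec 2.0 pretrained model.
% Assumption: "wav2vec2.0" is the model name accepted by speechClient.
% Requires Deep Learning Toolbox and the installed pretrained model.
clientObj = speechClient("wav2vec2.0");

% Transcribe the recording.
% Assumption: client-first argument order; verify against your release.
transcript = speech2text(clientObj,audioIn,fs);

% Display the result.
disp(transcript)
```

For the third-party services, the same pattern should apply once the extended functionality from File Exchange is installed, with server-specific settings configured through setOptions as described under Object Functions.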
References
[1] Baevski, Alexei, Henry Zhou, Abdelrahman Mohamed, and Michael Auli. “Wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations,” 2020. https://doi.org/10.48550/ARXIV.2006.11477.
[2] Kürzinger, Ludwig, Dominik Winkelbauer, Lujun Li, Tobias Watzel, and Gerhard Rigoll. “CTC-Segmentation of Large Corpora for German End-to-End Speech Recognition.” In Speech and Computer, edited by Alexey Karpov and Rodmonga Potapova, 12335:267–78. Cham: Springer International Publishing, 2020. https://doi.org/10.1007/978-3-030-60276-5_27.
Version History
Introduced in R2022b