How can speech be converted to text?
Here is the flowchart of my system: User speaks >> Speech to Text conversion >> Text is sent to chatGPT >> Process ends.
My question is regarding the "Speech to Text" block: Is the "Audio Toolbox" sufficient for this task, or is an external API like the Google Speech API also required?
Furthermore, does the "Audio Toolbox" support multiple languages, or is it limited to English only?
Govind KM 2023-6-2
According to the documentation, Audio Toolbox lets you interface with third-party speech-to-text APIs from MATLAB. This requires the extended Audio Toolbox functionality available from File Exchange and one of the following APIs: Google Speech, IBM Watson Speech, Microsoft Azure Speech, or Amazon Transcribe (Amazon Transcribe requires R2022b or later).
Starting in MATLAB R2022b, you can convert speech to text using a pretrained wav2vec 2.0 model. This route does not require access to an external API or the extended Audio Toolbox functionality from File Exchange, but it does require the Deep Learning Toolbox. You can also perform speech transcription interactively using the Signal Labeler app.
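As a rough illustration of the local wav2vec 2.0 route, here is a minimal sketch using the speechClient and speech2text functions from Audio Toolbox. The file name speech.wav is a placeholder, the argument order shown is the one I believe R2022b uses (later releases may expect a different calling form, so check the speech2text page for your release), and the pretrained model may need to be downloaded the first time you run it.

```matlab
% Minimal sketch: local speech-to-text with the pretrained wav2vec 2.0 model.
% Assumes R2022b or later with Audio Toolbox and Deep Learning Toolbox.

% "speech.wav" is a placeholder; replace it with your own recording.
[audioIn, fs] = audioread("speech.wav");

% Mix down to mono in case the file has more than one channel.
audioIn = mean(audioIn, 2);

% Create a client for the local wav2vec 2.0 model (no external API needed).
transcriber = speechClient("wav2vec2.0");

% Transcribe the signal. Argument order is assumed from the R2022b syntax.
transcript = speech2text(transcriber, audioIn, fs);

% The transcript can then be passed on to the next block of the system.
disp(transcript)
```

To use one of the third-party services instead, you would create the client with that service's name in place of "wav2vec2.0", after setting up the File Exchange add-on and your API credentials.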
You can refer to these documentation links for further information on using these tools: