text2speech

Synthesize speech from text

Since R2022b

    Description

    [speech,fs] = text2speech(clientObj,text) synthesizes a speech signal from the provided text. text2speech interfaces with third-party speech services (Google®, IBM®, Microsoft®, or Amazon®) to perform the synthesis.

    Note

    To use text2speech, you must download the extended Audio Toolbox™ functionality from File Exchange. The File Exchange submission includes a tutorial to get started with the third-party services.

    [speech,fs] = text2speech(___,HTTPTimeout=timeout) specifies the time in seconds to wait for the initial server connection to the third-party speech service.

    [speech,fs,rawOutput] = text2speech(___) also returns the unprocessed server output from the third-party speech service.

    Examples

    Create a speechClient object that interfaces with the IBM Watson Text to Speech service.

    synthesizer = speechClient("IBM");

    Call text2speech with a string to synthesize a speech signal.

    [speech,fs] = text2speech(synthesizer,"hello world");

    Listen to the synthesized speech.

    soundsc(speech,fs)
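
    The synthesized signal can also be saved for later use. The following is a minimal sketch, assuming the File Exchange add-on is installed and IBM credentials are already configured. The file name hello.wav is a hypothetical choice, and the peak scaling is a precaution rather than a documented requirement of text2speech.

```matlab
% Assumes the File Exchange add-on is installed and IBM credentials are configured.
synthesizer = speechClient("IBM");
[speech,fs] = text2speech(synthesizer,"hello world");

% Scale to a peak of 1 to avoid clipping on write; guard against an all-zero signal.
peak = max(abs(speech));
if peak > 0
    speech = speech/peak;
end

% "hello.wav" is a hypothetical output file name.
audiowrite("hello.wav",speech,fs)
```

    audiowrite clips double-precision samples outside the range [-1, 1], which is why the sketch scales the signal first.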

    Input Arguments

    clientObj: Client object, specified as an object returned by speechClient. The object is an interface to a third-party speech service.

    You cannot use text2speech with a speechClient object that interfaces with the wav2vec 2.0 pretrained model.

    To use the third-party speech services, you must download the extended Audio Toolbox functionality from File Exchange. The File Exchange submission includes a tutorial to get started with the third-party services.

    Example: speechClient("IBM")

    text: Text to synthesize into speech, specified as a string or character array.

    Example: "Hello world"

    Data Types: char | string

    HTTPTimeout: Time to wait for the initial server connection, in seconds, specified as a positive scalar. Specifying this name-value argument sets the TimeOut property of clientObj.
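
    As an illustration of the timeout argument, the sketch below requests a longer connection wait and reports any failure. It assumes the File Exchange add-on is installed and IBM credentials are configured. The 20-second value is arbitrary, and because the error identifiers raised on a timeout are not listed on this page, the catch block handles all errors generically.

```matlab
% Assumes the File Exchange add-on is installed and IBM credentials are configured.
synthesizer = speechClient("IBM");

try
    % Wait up to 20 seconds (an arbitrary value) for the initial server connection.
    [speech,fs] = text2speech(synthesizer,"hello world",HTTPTimeout=20);
    sound(speech,fs)
catch err
    % The specific error identifier is not documented here, so report the message as is.
    warning("text2speech failed: %s",err.message)
end
```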

    Output Arguments

    speech: Synthesized speech signal, returned as a column vector (single channel).

    Data Types: double

    fs: Sample rate of the speech signal in Hz, returned as a positive double. The sample rate depends on the third-party service and the server options set through clientObj. See the documentation for the specific speech service for more information.
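
    Because the sample rate varies by service, it is safer to derive timing information from fs than to assume a fixed rate. The sketch below computes the signal duration and plots the waveform against time. It assumes the File Exchange add-on is installed and IBM credentials are configured.

```matlab
% Assumes the File Exchange add-on is installed and IBM credentials are configured.
synthesizer = speechClient("IBM");
[speech,fs] = text2speech(synthesizer,"hello world");

% Duration in seconds is the number of samples divided by the sample rate.
durationSec = size(speech,1)/fs;
fprintf("Sample rate: %g Hz, duration: %.2f s\n",fs,durationSec)

% Build a time axis from fs and plot the waveform.
t = (0:size(speech,1)-1).'/fs;
plot(t,speech)
xlabel("Time (s)")
ylabel("Amplitude")
```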

    Data Types: double

    rawOutput: Unprocessed server output, returned as a matlab.net.http.ResponseMessage object containing the HTTP response from the third-party speech service. If the third-party speech service is Amazon, text2speech returns the server output as a structure.
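
    Since the type of the raw output depends on the service, code that inspects it may need to branch on the type. The following is a minimal sketch, assuming the File Exchange add-on is installed and IBM credentials are configured. The fields of the Amazon structure are not described on this page, so the sketch only lists them.

```matlab
% Assumes the File Exchange add-on is installed and IBM credentials are configured.
synthesizer = speechClient("IBM");
[speech,fs,rawOutput] = text2speech(synthesizer,"hello world");

if isa(rawOutput,"matlab.net.http.ResponseMessage")
    % HTTP-based services: show the status code of the response.
    disp(rawOutput.StatusCode)
elseif isstruct(rawOutput)
    % Amazon: the field names are not documented here, so list them.
    disp(fieldnames(rawOutput))
end
```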

    Version History

    Introduced in R2022b