Label Spoken Words in Audio Signals
This example shows how to label spoken words in Signal Labeler. The example uses the IBM® Watson Speech to Text API and Audio Toolbox™ software. Speech-to-Text Transcription (Audio Toolbox) provides instructions for:
- Downloading the Audio Toolbox speech2text extended functionality, available from MATLAB® Central.
- Setting up the IBM Watson Speech API, offered through IBM Cloud Services. You must create an IBM Cloud account and a Speech to Text service instance, then go to the service dashboard and copy your credentials (the API Key and URL values). For more details, see the Getting Started Tutorial in the IBM documentation.
Set Up Speech Client
To perform speech-to-text transcription, you must create a speechClient object. Make sure the speech2text P-code files and the JSON file that stores your IBM Cloud credentials are in the current folder or on the MATLAB path.
speechObjectIBM = speechClient("IBM",timestamps=true);
Load an audio data file containing the sentence "Oak is strong, and also gives shade" spoken by a male voice. The signal is sampled at 44.1 kHz.
[y,fs] = audioread("oak.m4a"); % To hear, type soundsc(y,fs)
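As a quick sanity check before labeling, the following sketch confirms the sample rate and duration of the recording and builds a time axis for plotting. It assumes that oak.m4a is in the current folder or on the MATLAB path. The variable names info and t are illustrative and are not part of the original example.

```matlab
% Read the recording and its metadata (assumes oak.m4a is on the path)
[y,fs] = audioread("oak.m4a");
info = audioinfo("oak.m4a");

% Report the sample rate, channel count, and duration
fprintf("Sample rate: %d Hz\n",info.SampleRate)
fprintf("Channels: %d\n",info.NumChannels)
fprintf("Duration: %.2f s\n",info.Duration)

% Build a time axis in seconds and plot the waveform
t = (0:size(y,1)-1)'/fs;
plot(t,y)
xlabel("Time (s)")
ylabel("Amplitude")
```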
Open Signal Labeler and define a label to attach to the signal. On the Labeler tab, click Add and select Add Label Definition. Specify the Label Name as Words, select a Label Type of ROI, and select a Data Type of string.
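For readers who prefer the command line, an equivalent label definition can be sketched with signalLabelDefinition. This is only an illustration of the same settings (a region-of-interest label named Words that holds string values). The variable name wordsDef and the description text are assumptions added here.

```matlab
% Command-line counterpart of the label definition created in the app.
% wordsDef and the Description text are illustrative choices.
wordsDef = signalLabelDefinition("Words", ...
    LabelType="roi", ...
    LabelDataType="string", ...
    Description="Spoken words located in the audio signal");

% Inspect the definition
disp(wordsDef)
```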
Import Speech Data
Import the signal into the app.
On the Labeler tab, click Import and select From Workspace in the Members list. In the dialog box, select the signal y.
Add time information. Select Time from the drop-down list and specify fs as the sample rate, which is measured in Hz.
Click Import and Close. The signal appears in the Labeled Signal Set Browser.
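The import step has a rough command-line analogue: wrapping the signal and the label definition in a labeledSignalSet object. The sketch below uses a placeholder signal of zeros so that it runs without the audio file. In practice you would pass the y and fs returned by audioread. The names wordsDef and lss are assumptions.

```matlab
% Placeholder data so the sketch runs on its own; replace with the
% audio samples y and sample rate fs returned by audioread.
fs = 44100;                 % sample rate stated in the example (Hz)
y = zeros(4*fs,1);          % stand-in for the recorded sentence

% Label definition matching the one created in the app
wordsDef = signalLabelDefinition("Words",LabelType="roi",LabelDataType="string");

% One-member labeled signal set with time information taken from fs
lss = labeledSignalSet({y},wordsDef,SampleRate=fs,MemberNames="y");

% Confirm the member count and member name
disp(lss.NumMembers)
disp(getMemberNames(lss))
```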
Locate and Identify Spoken Words
Locate and identify the words spoken in the input signal.
Select Words in the Label Definitions browser.
In the Automated Value gallery, select Speech to Text.
Click Auto-Label and select Auto-Label All Signals.
In the dialog box, select IBM from the Service Name list and select the check box next to Segment Words.
Signal Labeler locates and labels the spoken words. In the Labeled Signal Set Browser, select the check box next to y to plot the signal. Expand Words and select the check box next to each word to visualize the corresponding labeled region.
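To see what an ROI label looks like in data terms, the following sketch attaches two word regions by hand with setLabelValue instead of calling the speech service. The region limits are taken from the first two rows of the transcription result reported in this example. The signal is a placeholder, and the names wordsDef and lss are assumptions.

```matlab
% Placeholder signal; the real example labels the audio read from oak.m4a
fs = 44100;
y = zeros(4*fs,1);
wordsDef = signalLabelDefinition("Words",LabelType="roi",LabelDataType="string");
lss = labeledSignalSet({y},wordsDef,SampleRate=fs,MemberNames="y");

% Each ROI label pairs [start end] limits in seconds with a string value.
% labeledSignalSet is a handle object, so lss is updated in place.
setLabelValue(lss,1,"Words",[0.09 0.56],"oak");
setLabelValue(lss,1,"Words",[0.59 0.97],"is");

% Read the labels back as a table of limits and values
v = getLabelValues(lss,1,"Words")
```

The automated Speech to Text labeler produces the same kind of table, with one row per detected word.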
Export Labeled Signal
Export the labeled signal. On the Labeler tab, click Export and select To File from the Labeled Signal Set list. In the dialog box that appears, give the name Transcription.mat to the labeled signal set and add an optional short description. Click Export.
Go back to the MATLAB Command Window. Load the labeled signal set. The set has only one member. Get the names of the labels, and use the name to obtain and display the transcribed words.
load Transcription
ln = getLabelNames(transcribedAudio);
v = getLabelValues(transcribedAudio,1,ln)
v=7×2 table
     ROILimits       Value
    ____________    ________
    0.09    0.56    "oak"
    0.59    0.97    "is"
       1    1.78    "strong"
    1.94    2.19    "and"
    2.22    2.67    "also"
    2.67    3.22    "gives"
    3.25    3.91    "shade"
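Because the label values come back as an ordinary table, standard table operations apply. As a small illustration, the sketch below rebuilds the table from the values shown above (so that it runs without the exported file), computes the duration of each word, and reassembles the sentence. The Duration variable and the names iLongest and sentence are additions for this sketch only.

```matlab
% Rebuild the transcription table from the values reported above
ROILimits = [0.09 0.56; 0.59 0.97; 1.00 1.78; 1.94 2.19; ...
             2.22 2.67; 2.67 3.22; 3.25 3.91];
Value = ["oak";"is";"strong";"and";"also";"gives";"shade"];
v = table(ROILimits,Value);

% Duration of each region: end time minus start time, in seconds
v.Duration = diff(v.ROILimits,1,2);

% Find the longest word and reassemble the sentence in spoken order
[~,iLongest] = max(v.Duration);
fprintf("Longest word: %s (%.2f s)\n",v.Value(iLongest),v.Duration(iLongest))
sentence = join(v.Value," ")
```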
Change the label values from strings to categories. Use a signalMask object to plot the signal using a different color for each word.
v.Value = categorical(v.Value,v.Value);
msk = signalMask(v,SampleRate=fs);
s = getSignal(transcribedAudio,1);
plotsigroi(msk,s.y)
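One plausible reason for passing v.Value twice to categorical is category order: with a single argument, the categories are sorted alphabetically, whereas supplying the words as the value set keeps them in spoken order. The sketch below shows the difference on the transcribed words. The variable names are illustrative, and this approach relies on every word in the sentence being unique.

```matlab
% Words in the order they were spoken
words = ["oak";"is";"strong";"and";"also";"gives";"shade"];

% Default conversion: categories are sorted alphabetically
alphabetical = categories(categorical(words))

% Passing the words as the value set preserves the spoken order.
% The value set must not contain duplicates.
spokenOrder = categories(categorical(words,words))
```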
Create a logical vector of the same length as the audio signal. Set the vector to true over the signal segment where the speaker utters the word "is".
bsl = binmask(msk,height(s));
plot(s.Time,[s.y bsl(:,v.Value=="is")])
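The idea behind the binary mask can also be sketched by hand: compare a time axis against the region limits of the word. The code below is not how binmask is implemented, and its treatment of the region edges may differ by a sample. It uses a placeholder noise signal so that it runs without the exported label set. The limits for "is" come from the table above, and the names t, lims, isMask, and segment are assumptions.

```matlab
% Placeholder signal; replace with the audio samples from the example
fs = 44100;
N = 4*fs;
y = randn(N,1);

% Time axis in seconds and the region limits for the word "is"
t = (0:N-1)'/fs;
lims = [0.59 0.97];

% Logical vector that is true only inside the region
isMask = t >= lims(1) & t <= lims(2);

% Use the mask to extract the samples for that word
segment = y(isMask);
fprintf("Mask covers %d samples (%.2f s)\n",nnz(isMask),nnz(isMask)/fs)

% Plot the signal together with the mask
plot(t,[y isMask])
```

With the real audio, calling soundsc(segment,fs) plays back only the selected word.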
- Label Signal Attributes, Regions of Interest, and Points
- Label ECG Signals and Track Progress
- Examine Labeled Signal Set
- Automate Signal Labeling with Custom Functions
- Using Signal Labeler App
- Import Data into Signal Labeler
- Create or Import Signal Label Definitions
- Label Signals Interactively or Automatically
- Custom Labeling Functions
- Customize Labeling View
- Feature Extraction Using Signal Labeler
- Export Labeled Signal Sets and Signal Label Definitions
- Signal Labeler Usage Tips