This example uses a pretrained speaker recognition system, 'ivec-english-16kHz'. The 'ivec-english-16kHz' system is an instance of ivectorSystem trained on the LibriSpeech data set.
Download the pretrained speaker recognition system into your temporary directory, whose location is specified by the MATLAB® tempdir command. If you want to place the data files in a folder different from tempdir, change the directory name. Add the temporary directory to the search path. Create an i-vector system.
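The setup steps can be sketched as follows. This is a minimal sketch, not the original listing: zipFile is a hypothetical placeholder, because the source does not state where the archive is downloaded from, and the sketch assumes that the extracted files land directly in the chosen folder and that the Audio Toolbox speakerRecognition function returns the pretrained ivectorSystem once those files are on the path.

```matlab
% Hypothetical placeholder: path or URL of the pretrained model archive.
% The source does not give this location, so set it before running.
zipFile = "ivec-english-16kHz.zip";

% Folder that receives the model files. Change this to use another location.
downloadFolder = tempdir;

% Extract the archive and add the folder to the MATLAB search path.
unzip(zipFile,downloadFolder)
addpath(downloadFolder)

% Create the pretrained i-vector system (requires Audio Toolbox).
ivs = speakerRecognition;
```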
Read two speech signals, each of which contains the phrase "volume up" spoken out loud several times with different intonations. In one of the signals, the speaker is male. In the other signal, the speaker is female.
Read each signal and split it into two parts: one part to enroll the speaker, and the other part to test speaker verification and identification.
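One way to read and split the signals is sketched below. The file names are assumptions (sample recordings believed to ship with Audio Toolbox), and splitting at the midpoint is an illustrative choice; the source does not say where each signal is divided.

```matlab
% Assumed file names: mono 16 kHz sample recordings from Audio Toolbox.
[maleAudio,fs] = audioread("MaleVolumeUp-16-mono-6secs.ogg");
femaleAudio = audioread("FemaleVolumeUp-16-mono-11secs.ogg");

% Split each recording at its midpoint into two non-overlapping halves.
maleSplit = floor(numel(maleAudio)/2);
maleEnroll = maleAudio(1:maleSplit);        % used to enroll the speaker
maleTest = maleAudio(maleSplit+1:end);      % held out for testing

femaleSplit = floor(numel(femaleAudio)/2);
femaleEnroll = femaleAudio(1:femaleSplit);
femaleTest = femaleAudio(femaleSplit+1:end);
```

Keeping the halves non-overlapping ensures that no audio used for enrollment is reused for testing.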
Enroll the speakers into the speaker recognition system. Enrollment creates a template for each speaker that can be used for verification or identification.
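A minimal sketch of the enrollment call, assuming ivs is the i-vector system and maleEnroll and femaleEnroll are the enrollment parts of the two signals. The labels BF and RD are taken from the identification output in this example, but which label belongs to which speaker is an assumption made for illustration.

```matlab
% Assumes ivs, maleEnroll, and femaleEnroll already exist in the workspace.
% The label-to-speaker assignment below is an assumption.
enrollAudio = {maleEnroll,femaleEnroll};
enrollLabels = categorical(["BF","RD"]);

% Enroll one template per label in the i-vector system.
enroll(ivs,enrollAudio,enrollLabels)
```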
Extracting i-vectors ...done.
Enrolling i-vectors .....done.
Enrollment complete.
Call the identify function on the test data.
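The identification calls might look like the following sketch, assuming ivs already has both speakers enrolled and maleTest and femaleTest hold the parts of the signals reserved for testing (hypothetical variable names). Omitting the trailing semicolon displays each returned table.

```matlab
% Assumes ivs has enrolled speakers and the test signals are in the workspace.
% Each call returns a table of candidate labels with their scores.
candidates = identify(ivs,maleTest)
candidates = identify(ivs,femaleTest)
```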
candidates=2×2 table
    Label      Score
    _____    _________

     BF        0.99474
     RD      0.0017846

candidates=2×2 table
    Label      Score
    _____    __________

     RD         0.24113
     BF      3.2741e-05
Call the verify function with the test data to confirm that the system correctly accepts or rejects speakers.
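A hedged sketch of the verification step, assuming ivs has the labels BF and RD enrolled and maleTest is a held-out test signal (a hypothetical variable name). Which claim is the genuine one depends on how the labels were assigned at enrollment, so the sketch does not state expected results.

```matlab
% Assumes ivs has enrolled labels "BF" and "RD" and maleTest is a test signal.
% verify returns a logical: true to accept the claimed identity, false to reject.
tfClaimBF = verify(ivs,maleTest,"BF")
tfClaimRD = verify(ivs,maleTest,"RD")
```

A correctly working system accepts the claim that matches the true speaker and rejects the other.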
Call the info function to get information about how the model was trained.
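For reference, the call takes only the system object. This sketch assumes ivs is the pretrained i-vector system created earlier in the example.

```matlab
% Assumes ivs is the pretrained ivectorSystem.
% Prints the training, calibration, and evaluation summary to the Command Window.
info(ivs)
```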
Header
  - This system was trained using the LibriSpeech train and development sets.
    LibriSpeech is an approximately 1000-hour corpus of read English speech sampled at 16 kHz.
  - The detection error tradeoff was determined by enrolling one file from each speaker in the
    LibriSpeech test set, and then evaluating exhaustive pairs of the enrolled and remaining data.
  - The system was calibrated using the train-clean-100 and dev-clean data of LibriSpeech.

i-vector system input
  Input feature vector length: 60
  Input data type: double

trainExtractor
  Train signals: 286808
  UBMNumComponents: 2048
  UBMNumIterations: 10
  TVSRank: 512
  TVSNumIterations: 5

trainClassifier
  Train signals: 286807
  Train labels: 1 (91), 100043 (31) ... and 5652 more
  NumEigenvectors: 200
  PLDANumDimensions: 200
  PLDANumIterations: 5

calibrate
  Calibration signals: 31242
  Calibration labels: 103 (102), 1034 (96) ... and 289 more

detectionErrorTradeoff
  Evaluation signals: 5382
  Evaluation labels: 102255 (46), 1066 (24) ... and 175 more
Remove the temporary directory from the search path.
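A minimal cleanup sketch, assuming the model files were placed in tempdir. If you used a different folder, pass that folder instead.

```matlab
% Assumes the folder added to the path during setup was tempdir.
downloadFolder = tempdir;

% Remove the folder from the MATLAB search path.
rmpath(downloadFolder)
```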