I'm supposed to implement testing code for speech recognition, and I'm getting two errors: "Matrix dimensions must agree" at line 154 and "Index exceeds matrix dimensions" at line 102.

clc;
clear all;
close all;
THRESHOLD=0.7;
Fs = 10000;
fprintf('say a sentence immediately after hitting enter: ');
input('');
y= wavrecord(1 * 10000, 10000, 'double'); % Record and store the uttered speech
t=(0:(1*10000)-1)*1/(1*10000);
subplot(5,1,1);
plot(y);
r=fft(y);
d=abs(r);
subplot(5,1,2);
plot(d);
z=floor(Fs/100);
w=floor(Fs/5);%according to formula, 1600 sample needed for 8 khz
%----------
%calculation of mean and std
h=[];
for i=1:w
h=[h y(i)];
end
meanVal=mean(h);
sDev=std(h);
%----------
%identify voiced or not for each value
for i=1:length(y)
if(abs(y(i)-meanVal)/sDev > THRESHOLD)
voiced(i)=1;
else
voiced(i)=0;
end
end
% identify voiced or not for each frame
%discard insufficient samples of last frame
usefulSamples=length(y)-mod(length(y),z);
frameCount=usefulSamples/z;
voicedFrameCount=0;
for i=1:frameCount
cVoiced=0;
cUnVoiced=0;
for j=i*z-z+1:1:(i*z)
if(voiced(j)==1)
cVoiced=(cVoiced+1);
else
cUnVoiced=cUnVoiced+1;
end
end
%mark frame for voiced/unvoiced
if(cVoiced>cUnVoiced)
voicedFrameCount=voicedFrameCount+1;
voicedUnvoiced(i)=1;
else
voicedUnvoiced(i)=0;
end
end
k=[];
%-----
for i=1:frameCount
if(voicedUnvoiced(i)==1)
for j=i*z-z+1:1:(i*z)
k= [k y(j)];
end
end
end
%---display plot and play both sounds
subplot(5,1,3);
plot(k);
g=fft(k);
a=hamming(4000);% Hamming window to smooth the speech signal
b= [a ;zeros(6000,1)];
f = (1:10000);
mel(f) = 2595 * log(1 + f / 700); % Linear to Mel frequency scale conversion
tri = triang(100);
win1 = [tri ; zeros(9900,1)]; % Defining overlapping triangular windows for
win2 = [zeros(50,1) ; tri ; zeros(9850,1)]; % frequency domain analysis
win3 = [zeros(100,1) ; tri ; zeros(9800,1)];
win4 = [zeros(150,1) ; tri ; zeros(9750,1)];
win5 = [zeros(200,1) ; tri ; zeros(9700,1)];
win6 = [zeros(250,1) ; tri ; zeros(9650,1)];
win7 = [zeros(300,1) ; tri ; zeros(9600,1)];
win8 = [zeros(350,1) ; tri ; zeros(9550,1)];
win9 = [zeros(400,1) ; tri ; zeros(9500,1)];
win10 = [zeros(450,1) ; tri ; zeros(9450,1)];
win11 = [zeros(500,1) ; tri ; zeros(9400,1)];
win12 = [zeros(550,1) ; tri ; zeros(9350,1)];
win13 = [zeros(600,1) ; tri ; zeros(9300,1)];
win14 = [zeros(650,1) ; tri ; zeros(9250,1)];
win15 = [zeros(700,1) ; tri ; zeros(9200,1)];
win16 = [zeros(750,1) ; tri ; zeros(9150,1)];
win17 = [zeros(800,1) ; tri ; zeros(9100,1)];
win18 = [zeros(850,1) ; tri ; zeros(9050,1)];
win19 = [zeros(900,1) ; tri ; zeros(9000,1)];
win20 = [zeros(950,1) ; tri ; zeros(8950,1)];
ny = abs(g(floor(mel(f)))); % Mel warping
ny = ny / max(ny);
ny1 = ny * win1;
ny2 = ny * win2;
ny3 = ny * win3;
ny4 = ny * win4;
ny5 = ny * win5;
ny6 = ny * win6;
ny7 = ny * win7;
ny8 = ny * win8;
ny9 = ny * win9;
ny10 = ny * win10;
ny11 = ny * win11;
ny12 = ny *win12;
ny13 = ny * win13;
ny14 = ny * win14;
ny15 = ny * win15;
ny16 = ny * win16;
ny17 = ny * win17;
ny18 = ny * win18;
ny19 = ny * win19;
ny20 = ny * win20;
sy1 = sum(ny1 ^ 2); % Determine the energy of the signal within each window
sy2 = sum(ny2 ^ 2); % by summing square of the magnitude of the spectrum
sy3 = sum(ny3 ^ 2);
sy4 = sum(ny4 ^ 2);
sy5 = sum(ny5 ^ 2);
sy6 = sum(ny6 ^ 2);
sy7 = sum(ny7 ^ 2);
sy8 = sum(ny8 ^ 2);
sy9 = sum(ny9 ^ 2);
sy10 = sum(ny10 ^ 2);
sy11 = sum(ny11 ^ 2);
sy12 = sum(ny12 ^ 2);
sy13 = sum(ny13 ^ 2);
sy14 = sum(ny14 ^ 2);
sy15 = sum(ny15 ^ 2);
sy16 = sum(ny16 ^ 2);
sy17 = sum(ny17 ^ 2);
sy18 = sum(ny18 ^ 2);
sy19 = sum(ny19 ^ 2);
sy20 = sum(ny20 ^ 2);
sy = [sy1; sy2; sy3; sy4; sy5; sy6; sy7; sy8; sy9; sy10; sy11; sy12; sy13; sy14;
sy15; sy16; sy17; sy18; sy19; sy20];
by = log(sy);
dy = dct(by); % Determine DCT of Log of the spectrum energies
subplot(5,1,4);
plot(dy);
fid = fopen('sample.dat','r');
dx = fread(fid,20, 'real*8'); % Obtain the feature vector for the password
fclose(fid); % evaluated in the training phase
dx=dx';
MSE=(sum((dx - dy) ^ 2)) / 20; % Determine the Mean squared error
if MSE<1
fprintf('\n\nACCESS GRANTED\n\n');
Grant=wavread('Grant.wav'); % “Access Granted”
wavplay(Grant);
else
fprintf('\n\nACCESS DENIED\n\n');
Deny=wavread('Deny.wav'); % “Access Denied” is output in case of a failure
wavplay(Deny);
end;

Answers (1)

Walter Roberson 2012-2-21
Without calculating it through, I see no inherent reason to expect that
floor(2595 * log(1 + (1:10000) / 700))
will always be in the range 1:length(g). Since g = fft(k), and k is built up conditionally from only those frames that the code's unusual voiced/unvoiced windowing keeps, length(g) changes from one recording to the next and can be smaller than the largest index requested.
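
As a rough illustration of the point above (a sketch under stated assumptions, not the poster's actual fix): with the natural log used in the question, the index expression tops out at just over 7000, so any recording that leaves fewer voiced samples than that in k would trigger "Index exceeds matrix dimensions". One possible guard, which is an assumption here rather than something stated in the thread, is to zero-pad the FFT to a fixed length with fft(k, 10000). The second half looks at the other reported error: in the posted code dy is a 20x1 column while dx is transposed into a 1x20 row, and ^ is matrix power rather than element-wise power, so the sketch keeps both vectors as columns and uses .^ instead. The vectors k, dx and dy below are random stand-ins, not real speech data.

```matlab
% Sketch only: k, dx and dy are random stand-ins for the real signals.
f   = 1:10000;
idx = floor(2595 * log(1 + f / 700));    % same index expression as the question

% Hypothetical case: only 4000 voiced samples were kept in k.
k      = randn(1, 4000);
gShort = fft(k);
fprintf('largest index: %d, numel(g): %d\n', max(idx), numel(gShort));
% gShort(idx) would fail here because max(idx) > numel(gShort).

% Possible guard (an assumption): zero-pad the FFT to a fixed 10000 points.
gPadded = fft(k, 10000);
ny      = abs(gPadded(idx));
ny      = ny / max(ny);

% Shape mismatch behind "Matrix dimensions must agree":
dy = randn(20, 1);    % stand-in for dct(by), a 20x1 column
dx = randn(20, 1);    % stand-in for the fread result, also 20x1
% In the question dx is transposed to 1x20 before the subtraction.
mse = sum((dx(:) - dy(:)) .^ 2) / numel(dy);   % element-wise square, both columns
fprintf('MSE = %g\n', mse);
```

Whether zero-padding is appropriate depends on what the mel warping step is meant to achieve. The mel formula is usually written with log10 rather than the natural log, which would also shrink the index range, but that is a separate change from the two shown here.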
