Voiced / unvoiced extraction of a speech signal, short-time energy

Dear all, I have the following script. I'm trying to extract the voiced / unvoiced portions of a speech signal. I performed windowing and computed the short-time energy and zero crossings per frame, then looked for peaks in the frame energy (a frame with energy above the mean is taken to be voiced). However, I only get 5 peaks in the plot, although the energy vector E has more elements above the mean value, which is 0.7139. The speech file for the utterance (go / male) is attached as go.mat, with Fs=8000. I used a speech frame duration of 15/1000, 60 overlapping points, and started windowing at sample index 1. Also, is it possible to add a label to the original speech signal indicating the voiced parts based on the energy per frame? Thanks.
% This script computes the short-time energy of windowed frames of the speech signal.
% Windowing starts at a user-defined index.
close all
clear
clc
load go.mat
dt=1/Fs;
L=length(signal);
signal_duration=L/Fs; % duration of the whole speech signal in seconds
prompt = 'Enter the speech frame duration ';
frame_duration=input(prompt); % duration of short time frames in seconds (e.g. 15/1000 for 15 msec)
frame_length=ceil(frame_duration*Fs); % Number of samples of the window
nfft = 2^nextpow2(frame_length); % Number of DFT points
w = hamming(frame_length); % type of window
prompt = 'Enter the number of overlapping points ';
n_overlap=input(prompt); % number of samples shared by consecutive frames
prompt = ('Enter the sample index to begin the windowing ');
index=input(prompt); % starting index; used below as the first frame index k of the spectrum loop
Nbr_frames=floor((L-n_overlap)/(frame_length-n_overlap)); % Total number of over-lapped frames which will divide the whole signal
signal_framed=zeros(L,Nbr_frames);
Y=zeros(nfft,Nbr_frames);
t=zeros(1,Nbr_frames);
zcd_signal = dsp.ZeroCrossingDetector;
x=frame_length-n_overlap;
for k=1:Nbr_frames
signal_framed(:,k)=[zeros(1,(k-1)*x) w' zeros(1,L-frame_length-(k-1)*x)]'.*signal; % Framing
E(:,k)=sum(signal_framed(:,k).*signal_framed(:,k)); % energy per each over-lapped frame
numZeroCross(:,k) = zcd_signal(signal_framed(:,k)); % zero crossing per each over-lapped frame
t(:,k)=(k)'; % over-lapped frame index
end
% find the frame with the maximum energy (voiced frame )
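max_energy_frame=find(E==max(E)); % stored so the result can be inspected
% find the frame with the maximum zero crossing ( unvoiced frame )
max_zc_frame=find(numZeroCross==max(numZeroCross)); % stored so the result can be inspected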
for k=index:Nbr_frames
signal_framed(:,k)=[zeros(1,(k-1)*x) w' zeros(1,L-frame_length-(k-1)*x) ]'.*signal;
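Y(:,k)=abs(fft(signal_framed((k-1)*x+1:(k-1)*x+frame_length,k),nfft)); % FFT of the frame's own samples (nonzeros would also drop genuine zero-valued samples)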
P1(:,k) = Y(1:nfft/2+1,k);
P1(2:end-1,k) = 2*P1(2:end-1,k);
end
[pks,locs] = findpeaks(E,1:Nbr_frames,'MinPeakHeight',mean(E)); % Local maxima of E that are at least mean(E); frames above the mean that are not local maxima are not returned
figure(1)
subplot(311) ; plot(1000*(0:dt:signal_duration-dt),signal_framed,'r') ; grid ; xlabel('Time in msec ') ; ylabel('Amplitude')
subplot(312) ; plot(1:L,signal_framed,'k') ; grid ; xlabel('sample index ') ; ylabel('Amplitude')
subplot(313) ; plot(1:Nbr_frames,E,'k') ;grid ;
findpeaks(E,1:Nbr_frames,'MinPeakHeight',mean(E));
legend('Signal','peaks indicate voiced frames '); xlabel('frame index ') ; ylabel('energy') ; title('energy per frame ')
figure(2)
% plot of voiced frames based on frame energy criterion
for k=1:length(locs)
first=(locs(k)-1)*x+1; % first sample of frame locs(k); the hop between frames is x, not n_overlap
last=first+frame_length-1; % last sample of the frame
subplot(length(locs),1,k) ; plot(first:last,signal(first:last)); ...
grid ; xlabel('sample index ') ; ylabel('Amplitude')
end
%figure(2) ; spectrogram(signal,hamming(frame_length),n_overlap,nfft,Fs,'yaxis');
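
A possible explanation for the 5 peaks, offered as a hedged note and not a confirmed diagnosis: findpeaks returns only local maxima of E, so frames that exceed mean(E) but sit on the slope of a larger peak are not reported. A minimal sketch of an alternative, assuming the goal is simply "energy above the mean means voiced", is to threshold E directly and then map the frame decisions back onto sample indices so they can be drawn over the waveform. The synthetic signal below stands in for go.mat, the Hamming window is written out by hand so no toolbox is needed, and the names hop, voiced_frames, voiced_samples and labelled are illustrative, not part of the original script.

```matlab
% Sketch: mark voiced frames by thresholding short-time energy, then map
% the frame decisions back onto sample indices. Synthetic data stands in
% for go.mat; all names below are illustrative assumptions.
Fs = 8000;                              % sampling rate from the question
frame_length = 120;                     % 15 msec at 8 kHz
n_overlap = 60;                         % overlapping samples, as in the question
hop = frame_length - n_overlap;         % frame advance in samples

% Synthetic "speech": a quiet 1 kHz tone with a loud 150 Hz burst in the middle
L = 2400;
n = (0:L-1)';
signal = 0.01*sin(2*pi*1000*n/Fs);
burst = 801:1600;
signal(burst) = signal(burst) + sin(2*pi*150*n(burst)/Fs);

% Hamming window written out explicitly (column vector)
w = 0.54 - 0.46*cos(2*pi*(0:frame_length-1)'/(frame_length-1));

Nbr_frames = floor((L - n_overlap)/hop);
E = zeros(1, Nbr_frames);
for k = 1:Nbr_frames
    idx = (k-1)*hop + (1:frame_length);   % samples covered by frame k
    frame = signal(idx).*w;
    E(k) = sum(frame.^2);                 % short-time energy of frame k
end

% Every frame above the mean counts as voiced, not only local maxima
voiced_frames = E > mean(E);

% Spread the frame decisions over the samples each voiced frame covers
voiced_samples = false(L,1);
for k = find(voiced_frames)
    idx = (k-1)*hop + (1:frame_length);
    voiced_samples(idx) = true;
end

% Overlay the voiced samples on the waveform
t_ms = 1000*n/Fs;
labelled = signal;
labelled(~voiced_samples) = NaN;          % NaN breaks the line where unvoiced
figure
plot(t_ms, signal, 'k'); hold on
plot(t_ms, labelled, 'r'); hold off
grid on
xlabel('Time in msec'); ylabel('Amplitude')
legend('Signal', 'Voiced (energy > mean)')
```

With the real data, replacing the synthetic signal with the loaded signal (as a column vector) should give the same kind of overlay. The mean-energy threshold is the criterion stated in the question and may need tuning, for example by combining it with the zero-crossing count.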

Answers (0)
