I want to use an LSTM-based audio network with live audio
Hello Matlab team,
I am using this example with my own audio dataset: https://www.mathworks.com/matlabcentral/fileexchange/74611-fault-detection-using-deep-learning-classification#examples_tab The network is trained, but I now want to run it live on a PC. For example, I have a microphone and want to build an application that uses my trained model to predict the output.
Can you guide me or help me with that?
Regards,
Arslan Munaim
Answers (2)
jibrahim
2022-7-27
Hi Arslan,
There is a function in that repo (streamingClassifier) that should get the job done in conjunction with an audio device reader:
% Create a microphone object
adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from the microphone
    frame = adr();
    % Pass to network
    scores = streamingClassifier(frame,M,S);
    % Use the scores any way you want
end
5 Comments
jibrahim
2022-8-2
Hi Arslan,
Since you trained the network with a sample rate of 16 kHz, you will have to perform sample-rate conversion from 44.1 kHz to 16 kHz. The code below is one possible implementation: it feeds the network frames of length 512 sampled at 16 kHz, just like the original code:
% Create a microphone object
%adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
src = dsp.SampleRateConverter(InputSampleRate=44100,OutputSampleRate=16e3,...
Bandwidth=15800);
[~,D] = src.getRateChangeFactors;
% The frame size must be a multiple of 441 (the decimation factor of the
% sample rate converter)
L = floor(22000/D);
frameLength = L*D; % get as close to desired frame size
adr = audioDeviceReader(SampleRate=44100,SamplesPerFrame=frameLength);
buff = dsp.AsyncBuffer;
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from the microphone
    frame = adr();
    % Convert to 16 kHz
    frame = src(frame);
    % Save to buffer
    write(buff,frame)
    while buff.NumUnreadSamples >= 512
        frame = read(buff,512);
        % Pass to network
        scores = streamingClassifier(frame,M,S);
        % Use the scores any way you want
    end
end
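The frame-size arithmetic behind this loop can be checked by hand. The following is a minimal sketch that uses only base MATLAB and assumes the converter reduces 44100/16000 to its lowest terms (which is what a decimation factor of 441 implies); the variable names are illustrative and do not come from the repository:

```matlab
% Reduce the rate ratio 16000/44100 to lowest terms
inRate  = 44100;
outRate = 16000;
g = gcd(inRate,outRate);          % greatest common divisor of the two rates
interpFactor = outRate/g;         % overall interpolation factor
decimFactor  = inRate/g;          % overall decimation factor

% Largest multiple of the decimation factor not exceeding 22000 samples
frameLength = floor(22000/decimFactor)*decimFactor;

% Samples produced at 16 kHz for each microphone frame
outPerFrame = frameLength*interpFactor/decimFactor;

% Complete 512-sample network frames per microphone frame, and the remainder
nFull    = floor(outPerFrame/512);
leftover = mod(outPerFrame,512);

fprintf('L = %d, D = %d, frameLength = %d\n',interpFactor,decimFactor,frameLength);
fprintf('%d output samples: %d full frames, %d left over\n',outPerFrame,nFull,leftover);
```

Because each microphone frame leaves a remainder that is not a multiple of 512, the leftover samples have to be carried over to the next iteration, which is the job of the dsp.AsyncBuffer in the loop above.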
Note that you can also potentially feed the network longer frames. That should also work, and is probably more efficient, as the network will run faster if you give it a long input (as opposed to multiple short ones):
% Create a microphone object
%adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
src = dsp.SampleRateConverter(InputSampleRate=44100,OutputSampleRate=16e3,Bandwidth=15800);
[~,D] = src.getRateChangeFactors;
% The frame size must be a multiple of 441 (the decimation factor of the
% sample rate converter)
L = floor(22000/D);
frameLength = L*D;
adr = audioDeviceReader(SampleRate=44100,SamplesPerFrame=frameLength);
buff = dsp.AsyncBuffer;
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from the microphone
    frame = adr();
    % Convert to 16 kHz
    frame = src(frame);
    % Save to buffer
    write(buff,frame)
    N = buff.NumUnreadSamples;
    L = floor(N/512);
    if L>0
        frame = read(buff,512*L);
        % Pass to network
        scores = streamingClassifier(frame,M,S);
        % Use the scores any way you want
    end
end
If you can't change the frame size on the microphone, then you can handle that using another buffer, for example:
% Create a microphone object
%adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
src = dsp.SampleRateConverter(InputSampleRate=44100,OutputSampleRate=16e3,Bandwidth=15800);
[~,D] = src.getRateChangeFactors;
% The frame size must be a multiple of 441 (the decimation factor of the
% sample rate converter)
L = floor(22000/D);
frameLength = L*D;
adr = audioDeviceReader(SampleRate=44100,SamplesPerFrame=22000);
buffSRC = dsp.AsyncBuffer;
buff = dsp.AsyncBuffer;
% These statistic values should come from your training...
M = 0;
S = 1;
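while 1
    % Read a frame of data from the microphone
    frame = adr();
    write(buffSRC,frame);
    % Drain the input buffer in chunks the converter accepts. Each
    % microphone frame (22000 samples) is longer than frameLength, so a
    % single read per iteration would let the buffer fill up over time.
    while buffSRC.NumUnreadSamples >= frameLength
        frame = read(buffSRC,frameLength);
        % Convert to 16 kHz
        frame = src(frame);
        % Save to buffer
        write(buff,frame)
    end
    N = buff.NumUnreadSamples;
    L = floor(N/512);
    if L>0
        frame = read(buff,512*L);
        % Pass to network
        scores = streamingClassifier(frame,M,S);
        % Use the scores any way you want
    end
end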
jibrahim
2022-8-9
Hi Arslan,
audioDeviceReader supports multi-mic devices. Use the ChannelMappingSource and ChannelMapping properties to map between device input channels and the output data.
This network was trained on mono data. To adapt it to multi-channel data, you either have to retrain your network on multi-channel data, or combine your input channels into one channel (by a weighted sum, by selecting a particular channel, etc.) and proceed as above.
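As a rough illustration of the channel-combining options, here is a sketch using a random matrix as a stand-in for one multi-channel frame (rows are samples, columns are channels, which is how audioDeviceReader lays out its output). The channel count and the weights are made-up values for illustration only:

```matlab
% Hypothetical 4-channel frame of 512 samples (stand-in for adr() output)
numChannels = 4;
frame = randn(512,numChannels);

% Option 1: select a particular channel
monoSelect = frame(:,1);

% Option 2: equal-weight average of all channels
monoMean = mean(frame,2);

% Option 3: weighted sum (illustrative weights that sum to 1)
w = [0.4; 0.3; 0.2; 0.1];
monoWeighted = frame*w;

disp(size(monoMean));
```

Any of the three resulting column vectors can then take the place of the mono frame in the loops shown earlier. Which option works best depends on the microphone layout and on how closely the combined signal resembles the mono training data.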
23 Comments
jibrahim
2022-8-20
OK, this helps. You will need other hardware (one device, multiple mics) for the system to recognize it. You could also give the UDP idea a shot, see how viable that is.