Why are 8 STFT vectors used for the predictor input in the "Denoise Speech Using Deep Learning Networks" example?

In the MATLAB example on denoising speech with deep learning, I am having a hard time understanding why 8 STFT segments are used for the predictor input.
This is stated and underlined in the relevant section of the example.
Does anyone understand the reason?

Answers (1)

Sahil Jain, 2021-9-1
Hi Daniel. The example states: "The predictor input consists of 8 consecutive noisy STFT vectors, so that each STFT output estimate is computed based on the current noisy STFT and the 7 previous noisy STFT vectors". The likely reason is that the authors of this approach expect better performance when the network sees some temporal context, that is, the noisy STFT vectors of the current segment together with those of the previous 7 segments, rather than the current segment alone. I suggest reading the research articles listed in the references at the end of the example to understand the motivation in more detail. You can also train the network using only the current segment as input and compare its performance against the 8-segment version.
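To make the quoted sentence concrete, here is a minimal Python sketch (not the example's MATLAB code) of how each output frame can be paired with the current noisy frame and its 7 predecessors. The function name `stack_context`, the toy data, and the choice to pad the start by repeating the first frame are all assumptions made for illustration; the example may handle the first frames differently.

```python
def stack_context(frames, num_segments=8):
    """Pair each STFT frame with its (num_segments - 1) predecessors.

    frames: list of per-frame feature vectors (e.g. magnitude spectra),
            ordered oldest first.
    Returns one predictor per frame. Each predictor is a list of
    num_segments frames ordered oldest to newest, ending with the
    current frame.
    """
    if not frames:
        return []
    # Assumption: repeat the first frame at the start so that the earliest
    # frames also get a full window of num_segments frames.
    padded = [frames[0]] * (num_segments - 1) + list(frames)
    return [padded[t:t + num_segments] for t in range(len(frames))]


# Toy data: 10 frames with 3 frequency bins each (real STFT frames are longer).
frames = [[float(t)] * 3 for t in range(10)]

predictors = stack_context(frames, num_segments=8)  # 8-frame context
single = stack_context(frames, num_segments=1)      # baseline: current frame only

print(len(predictors), len(predictors[0]))  # 10 8
print([f[0] for f in predictors[9]])        # frames 2 through 9
```

Setting `num_segments=1` gives the single-segment baseline suggested above, so the same helper can feed both versions of the network when comparing them.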

Release

R2019a