Can anyone explain this code for mel-spaced filter banks? I have to use it in my speaker recognition project.

function m = melfb(p, n, fs)
% MELFB Determine matrix for a mel-spaced filterbank
%
% Inputs:  p   number of filters in filterbank
%          n   length of fft
%          fs  sample rate in Hz
%
% Outputs: m   a (sparse) matrix containing the filterbank amplitudes
%              size(m) = [p, 1+floor(n/2)]
f0 = 700 / fs;
fn2 = floor(n/2);
lr = log(1 + 0.5/f0) / (p+1);
% convert to fft bin numbers with 0 for DC term
bl = n * (f0 * (exp([0 1 p p+1] * lr) - 1));
b1 = floor(bl(1)) + 1;
b2 = ceil(bl(2));
b3 = floor(bl(3));
b4 = min(fn2, ceil(bl(4))) - 1;
pf = log(1 + (b1:b4)/n/f0) / lr;
fp = floor(pf);
pm = pf - fp;
r = [fp(b2:b4) 1+fp(1:b3)];
c = [b2:b4 1:b3] + 1;
v = 2 * [1-pm(b2:b4) pm(1:b3)];
m = sparse(r, c, v, p, 1+fn2);

Answers (1)

Hari 2025-6-11,8:24
Hi,
I understand that you’re trying to use the melfb function in your speaker recognition project, and you're seeking an explanation of how the code generates mel filter banks from FFT bins.
I assume that you're familiar with basic signal processing and the concept of mel-scale filtering, but need clarity on how the code translates mel-scale logic into filter bank computation using MATLAB.
To see how this code computes mel-spaced triangular filter banks, follow the explanation below:
Step 1: Set up mel frequency conversion parameters
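The mel scale approximates the frequency resolution of human hearing. A common definition is:
mel(f) = 2595 * log10(1 + f/700)
The code only needs even spacing on this scale, so it works with the proportional quantity log(1 + f/700) and the constant factor drops out. f0 = 700/fs expresses 700 Hz as a fraction of the sample rate, and lr = log(1 + 0.5/f0)/(p+1) is the step on the log scale between adjacent filter centres: the range from 0 Hz to the Nyquist frequency (0.5 in normalized units) is divided into p+1 equal steps.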
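As a rough illustration of what lr represents, the sketch below inverts the log mapping to list the filter centre frequencies in Hz. The values of fs and p and the names k, fc, and fNyq are assumptions chosen for the example; they are not part of the original function.

```matlab
% Illustrative values (assumptions, not taken from the question)
fs = 12500;                      % sample rate in Hz
p  = 20;                         % number of filters

f0 = 700 / fs;                   % 700 Hz as a fraction of the sample rate
lr = log(1 + 0.5/f0) / (p+1);    % log-scale step between adjacent centres

% Centre frequencies in Hz: invert log(1 + f/700) at steps 1..p
k  = 1:p;
fc = 700 * (exp(k * lr) - 1);

% Step p+1 lands on the Nyquist frequency fs/2 by construction
fNyq = 700 * (exp((p+1) * lr) - 1);

disp(fc);
disp(fNyq);
```

The centres are evenly spaced on the log axis, so in Hz the gaps between them widen as frequency increases.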
Step 2: Determine filter bank edges in FFT bins
The code computes boundaries for p filters across the spectrum:
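  • bl = n * (f0 * (exp([0 1 p p+1] * lr) - 1)) maps four points on the mel axis back to linear frequency, expressed as (fractional) FFT bin numbers: the lower edge of the first filter (bin 0), the centre of the first filter, the centre of the last filter, and the upper edge of the last filter (bin n/2, the Nyquist frequency).
  • b1 to b4 round these to whole bins: b1 is the first bin above the lower edge, b2 the first bin at or above the first centre, b3 the last bin at or below the last centre, and b4 the last bin included, just below the upper edge.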
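To make the edge computation concrete, here is a small sketch that reruns just these lines of melfb and prints the results. The values of fs, n, and p are assumptions picked for illustration only.

```matlab
% Illustrative values (assumptions, not taken from the question)
fs = 12500; n = 256; p = 20;

f0  = 700 / fs;
fn2 = floor(n/2);
lr  = log(1 + 0.5/f0) / (p+1);

% Fractional bin positions of: lower edge, first centre, last centre, upper edge
bl = n * (f0 * (exp([0 1 p p+1] * lr) - 1));

b1 = floor(bl(1)) + 1;            % first bin above the lower edge (bin 0 is DC)
b2 = ceil(bl(2));                 % first bin at or above the first filter centre
b3 = floor(bl(3));                % last bin at or below the last filter centre
b4 = min(fn2, ceil(bl(4))) - 1;   % last bin included, just below the upper edge

disp(bl);
disp([b1 b2 b3 b4]);
```

Bins b1 to b2-1 lie below the first centre and only feed the rising slope of filter 1; bins b3+1 to b4 lie above the last centre and only feed the falling slope of filter p.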
Step 3: Compute triangular weights for filters
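  • pf gives the position of each FFT bin from b1 to b4 on the mel axis, in units of the filter spacing, so an integer value k means the bin sits exactly on the centre of filter k.
  • fp = floor(pf) is the index of the nearest filter centre at or below each bin (0 for bins below the first centre). It is a filter index, not a bin number.
  • pm = pf - fp is the fractional distance from that centre to the next one, used for linear interpolation.
  • The vectors r, c, and v hold the row indices, column indices, and values of the sparse matrix. Each bin contributes 2*(1-pm) to filter fp (the falling slope) and 2*pm to filter fp+1 (the rising slope); the + 1 in c shifts from 0-based bin numbers to MATLAB's 1-based columns.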
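The interpolation is easiest to see for a single bin. The sketch below uses a made-up mel position pf = 3.25 (an assumption for illustration) and the hypothetical names wLower and wUpper for the two weights that melfb would store in v.

```matlab
pf = 3.25;               % hypothetical mel position of one FFT bin
fp = floor(pf);          % nearest filter centre at or below this bin
pm = pf - fp;            % fraction of the way to the next centre

wLower = 2 * (1 - pm);   % weight on filter fp   (falling slope)
wUpper = 2 * pm;         % weight on filter fp+1 (rising slope)

disp([fp, pm, wLower, wUpper]);
```

The two weights always add up to 2, so every bin that lies between two filter centres is shared between exactly those two filters.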
Step 4: Construct sparse filter bank matrix
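  • m = sparse(r, c, v, p, 1+fn2) builds a p × (1 + floor(n/2)) sparse matrix, with one column per FFT bin from DC to Nyquist.
  • Each row of m is one triangular filter (triangular on the mel axis, with a peak height of 2) centred at a mel-spaced frequency, used to convert an FFT spectrum into mel-filtered energy.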
This matrix m is used to transform the FFT magnitude spectrum into mel filterbank energies, essential in computing MFCCs for speaker recognition.
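A minimal usage sketch follows, assuming the melfb function from the question is saved as melfb.m on the MATLAB path. The test tone, the frame length, and the choice to apply the filters to the power spectrum (squared magnitude) are assumptions for illustration; in a real MFCC pipeline you would typically window each frame first and then take the log and a DCT of z.

```matlab
% Illustrative values (assumptions, not taken from the question)
fs = 12500; n = 256; p = 20;

t = (0:n-1)' / fs;
s = sin(2*pi*1000*t);          % one frame of a 1 kHz test tone (column vector)

f  = fft(s);                   % n-point FFT of the frame
m  = melfb(p, n, fs);          % assumes melfb.m from the question is on the path
n2 = 1 + floor(n/2);           % number of bins from DC to Nyquist

z = m * abs(f(1:n2)).^2;       % p-by-1 vector of mel filter bank energies

disp(size(m));
disp(size(z));
```

Most of the energy in z should fall in the few filters whose triangles cover 1 kHz, which is a quick sanity check that the matrix is being applied to the right half of the spectrum.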
Refer to the documentation of:
Hope this helps!
