How to best modify FFT bin amplitudes before IFFT (DFT, windowing)?
I wish to do the following:
Read a mono 44.1kHz audio file.
Chop this audio in short overlapping (windowed?) segments.
Do FFT on these segments.
Read the amplitudes of the frequency bins as accurately as possible.
Modify some of the amplitudes of some of these frequency bins (based on an algorithm I wrote).
With IFFT reconstruct the audio segments with these modified amplitudes of some of these frequency bins.
Stitch together these audio segments to get an audio file which has the modified amplitudes at certain frequencies at certain points in time, with minimal side effects.
Now I'm mostly just beginning with MATLAB and am looking for any relevant examples from which I can learn how to do the above.
Also, some things are not yet clear to me regarding windowing and FFT.
For windowing. Am I correct in thinking that for the above example I can best window and overlap the short segments in such a way that by simply adding the windowed overlapping segments I get the original audio again? So for instance if I use triangular windowing with 50% overlap on both sides, that I will get the original audio back once I stitch these segments together again? Are there other windows that will work in this way? (for instance Hann?) Or am I altogether thinking wrong on how to best use windowing for what I want to do?
For FFT. I understand that the first half of the resulting frequency bins are the bins with the relevant amplitudes (for FFT length of 512, bins 0 to 255 represent the relevant frequencies and contain their amplitudes, bin 256 contains the nyquist if I understood correctly). The second half of the bins (257 to 512), can I just ignore those when modifying the amplitude of the first half? For instance, if I have a 1kHz sine wave, do the FFT, modify the amplitude of the bin that contains the 1kHz tone by dividing the amplitude in half, then do an IFFT, will the end result be that 1kHz sine reduced in amplitude by 6dB, or am I missing something?
Many thanks for any help / pointers!
Accepted Answer
William Rose
2021-9-20
You say "For FFT. I understand that the first half of the resulting frequency bins are the bins with the relevant amplitudes (for FFT length of 512, bins 0 to 255 represent the relevant frequencies and contain their amplitudes, bin 256 contains the nyquist if I understood correctly)."
That is not correct. For the FFT of a 512-point segment, bin 0 is the scaled mean value of the signal. Its imaginary part will always be zero if the original signal is real. Bins 1-255 are the complex numbers representing half of the FFT. Let's call it the bottom half. We could also call it the positive frequency part of the FFT. Bin 256 contains the scaled amplitude of the component sinusoid at the Nyquist frequency (fs/2, where fs is the sampling rate). For a real signal, its imaginary part will always be 0, for any FFT with an even number of samples. Bins 257-511 are the other half ("top half", or negative frequency part) of the FFT. If the original signal is real, as audio signals are, then the top half values will be the complex conjugates of the values in bins 1-255, where bin 257=conj(bin 255), bin 258=conj(bin 254), ..., bin 511=conj(bin 1). Whatever you do to a bin in the bottom half, you must also do to the corresponding bin in the top half. Before you do the inverse FFT, be sure that the top half of the modified FFT is the complex conjugate of the flipped-around bottom half. If that is not true, then you will get complex numbers from the inverse FFT, and that indicates an error.
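The symmetry rule above can be checked with a small sketch. This is an illustration only, not code from the original answer: the segment length N = 4410 is an assumption chosen so that 1 kHz falls exactly on a bin at 44.1 kHz, and the variable names are made up for the example. Note that MATLAB indices are one-based, so zero-based bin k is stored at index k+1.

```matlab
% Sketch: halve the amplitude of a 1 kHz sine by scaling one FFT bin
% together with its conjugate-symmetric mirror bin.
fs = 44100;                 % sampling rate in Hz
N  = 4410;                  % segment length (assumed): bin spacing fs/N = 10 Hz
t  = (0:N-1)'/fs;
x  = sin(2*pi*1000*t);      % 1 kHz test tone

X = fft(x);
k = round(1000*N/fs);       % zero-based bin number of 1 kHz (here 100)

% Zero-based bin k lives at index k+1; its mirror, zero-based bin N-k,
% lives at index N-k+1. Scale both by the same real factor.
X(k+1)   = 0.5*X(k+1);
X(N-k+1) = 0.5*X(N-k+1);

y = ifft(X);
maxImag = max(abs(imag(y)));    % rounding-error level if symmetry was kept
y = real(y);

% Level change of the tone in dB, 20*log10(0.5), roughly -6.02 dB
gain_dB = 20*log10(max(abs(y))/max(abs(x)));
```

If only `X(k+1)` were scaled, the spectrum would no longer be conjugate symmetric and `ifft` would return a signal with a non-negligible imaginary part. When the tone does not fall exactly on a bin, its energy leaks into neighboring bins, so scaling a single bin pair will not give a clean 6 dB reduction.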
The other part of your question is: May I segment the signal, do FFTs, manipulate the FFTs, invert the manipulated FFTs, and paste the results back together, to get a signal whose frequencies have been "shaped", as if with a graphic equalizer? The answer is you may, but you will probably end up with glitches at the segment boundaries. Initially, the signal is smooth across the segment boundaries. If you do an FFT and inverse FFT of each segment, without mean or trend removal, and without any frequency adjustments, you can paste the inverse FFT segments together and get back the original signal exactly. But if you do mean or trend removal or other adjustment of particular frequencies, then the pasted-together signal will have glitches, or discontinuities, at the segment boundaries. This is true for both overlapping and non-overlapping segmentation.
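As a supplement to the point about boundaries, here is a minimal overlap-add sketch. It is not from the original answer. It assumes a periodic Hann window (computed by hand so no toolbox is needed), a 512-sample segment, 50% overlap, and random noise standing in for the audio. It only demonstrates that the windowed, overlapped segments add back up to the original when the bins are left unmodified. Once bins are modified, the overlapping windows cross-fade neighboring segments, which tends to soften boundary discontinuities but does not guarantee an artifact-free result.

```matlab
% Sketch: overlap-add with a periodic Hann window and 50% overlap.
Nseg = 512;                          % segment length (assumed, even)
hop  = Nseg/2;                       % 50% overlap
n    = (0:Nseg-1)';
w    = 0.5 - 0.5*cos(2*pi*n/Nseg);   % periodic Hann: w(n) + w(n+hop) = 1

x    = randn(20*hop, 1);             % stand-in for the audio samples
y    = zeros(size(x));
nSeg = floor((length(x) - Nseg)/hop) + 1;   % number of full segments

for m = 0:nSeg-1
    idx = m*hop + (1:Nseg)';         % sample indices of segment m
    X   = fft(w .* x(idx));          % spectrum of the windowed segment
    % Bin modifications would go here, keeping X(Nseg-k+1) equal to
    % conj(X(k+1)) so that the inverse FFT stays real.
    y(idx) = y(idx) + real(ifft(X)); % add the segment back in place
end

% The first and last half-segments are covered by only one window,
% so compare only the fully overlapped interior.
interior = (hop+1):(length(x)-hop);
maxErr   = max(abs(y(interior) - x(interior)));   % rounding-error level
```

A triangular window with 50% overlap can also be made to sum to a constant, provided its exact definition (length and endpoints) matches the hop size. The symmetric window definitions found in many libraries do not sum exactly to one, which is why the periodic form is written out here.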
Another way of understanding the issue is that the sampling of the signal in the frequency domain is different with segmented signals than with the original signal. You lose samples of the "in-between" frequencies, including the lowest frequencies. Example: Suppose the original signal is sampled at Fs=1000 Hz, for N=1000 samples. The frequency spacing is Fs/N = 1 Hz, so the frequencies of the FFT (up to the Nyquist frequency) are 0, 1, 2, ..., 498, 499, 500 Hz. Now I divide it into 10 segments of duration Nseg=100 points each. The spacing becomes Fs/Nseg = 10 Hz, so the frequencies of the FFT of each segment are 0, 10, 20, ..., 480, 490, 500 Hz.
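The frequency grids in this example can be computed directly, since the bin spacing of an N-point FFT is Fs/N. A short sketch using the same Fs, N and Nseg as in the example (the other variable names are made up for illustration):

```matlab
% Sketch: FFT frequency grids for the full signal and for one segment.
Fs   = 1000;                 % sampling rate in Hz
N    = 1000;                 % samples in the full signal
Nseg = 100;                  % samples per segment

fFull = Fs*(0:N/2)/N;        % frequencies from 0 to Nyquist, full signal
fSeg  = Fs*(0:Nseg/2)/Nseg;  % frequencies from 0 to Nyquist, one segment

dfFull = fFull(2) - fFull(1);   % bin spacing of the full-length FFT, in Hz
dfSeg  = fSeg(2)  - fSeg(1);    % bin spacing of a segment FFT, in Hz
```

Both grids end at the same Nyquist frequency, Fs/2 = 500 Hz. Only the spacing changes, which is why shorter segments give better time resolution but coarser frequency resolution.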
14 Comments
William Rose
2021-9-24
If fs=sampling rate in Hz, and N=number of samples in signal x(i), and y=fft(x), then y is a vector of complex numbers with N elements. The vector of frequencies corresponding to the elements of y is
f=fs*(0:N-1)/N;
About half the frequencies in vector f are higher than the Nyquist frequency (fs/2). Those are the "top half" frequencies of the fft. An alternate name for Nyquist frequency is "folding frequency", since the spectrum above fs/2 is the folded-over copy of the spectrum from 0 to fs/2.
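To make the folding concrete, here is a small sketch that relabels the "top half" of f as negative frequencies and checks the conjugate symmetry of y for a real signal. The values fs = 1000 and N = 8 and the names fSigned and symErr are assumptions chosen for illustration.

```matlab
% Sketch: the top half of the FFT frequency vector seen as negative frequencies.
fs = 1000;                   % sampling rate in Hz (assumed)
N  = 8;                      % number of samples (assumed, even)
x  = randn(N,1);             % any real signal
y  = fft(x);

f = fs*(0:N-1)/N;            % 0 125 250 375 500 625 750 875

% Frequencies above fs/2 are equivalent to f - fs (negative frequencies).
fSigned = f;
fSigned(f > fs/2) = f(f > fs/2) - fs;   % 0 125 250 375 500 -375 -250 -125

% For real x, zero-based bin k and zero-based bin N-k are complex conjugates.
% In one-based indexing this compares y(2:N) with y(N:-1:2).
symErr = max(abs(y(2:N) - conj(y(N:-1:2))));   % rounding-error level
```

The frequency 625 Hz and the frequency -375 Hz label the same bin, which is the mirror of the 375 Hz bin. This is the sense in which the spectrum above fs/2 is a folded-over copy of the spectrum below it.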