Hi Matthew Strunks,
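It is my understanding that you want to convolve your audio signal with an FIR filter. Since convolution in the time domain is equivalent to multiplication in the frequency domain, you can try converting both to the frequency domain first and then multiplying them. Note that for the product to match ordinary (linear) convolution, both the signal and the filter need to be zero-padded to at least (signal length + filter length - 1) samples before transforming; without the padding, the product corresponds to circular convolution.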
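To illustrate the idea, here is a minimal sketch in plain Python (my choice for the illustration, not something from your question). It compares direct time-domain convolution with the multiply-in-frequency-domain route. The function names, the tiny test signal and the two-tap averaging filter are made up for the example, and the DFT is a naive O(n^2) version that is only sensible for short inputs; in practice you would use an FFT routine.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform; fine for short demo signals."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        # The inverse transform carries the 1/n normalisation.
        out.append(acc / n if inverse else acc)
    return out


def convolve_direct(signal, taps):
    """Linear convolution computed directly in the time domain."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(taps):
            out[i + j] += s * h
    return out


def convolve_via_dft(signal, taps):
    """Linear convolution via multiplication in the frequency domain."""
    # Zero-pad both inputs to len(signal) + len(taps) - 1 so that the
    # circular convolution implied by the DFT equals linear convolution.
    n = len(signal) + len(taps) - 1
    padded_signal = list(signal) + [0.0] * (n - len(signal))
    padded_taps = list(taps) + [0.0] * (n - len(taps))
    spectrum = [a * b for a, b in zip(dft(padded_signal), dft(padded_taps))]
    # Inputs are real, so the imaginary part is only rounding noise.
    return [v.real for v in dft(spectrum, inverse=True)]


audio = [1.0, 2.0, 3.0, 4.0]   # hypothetical short "audio" signal
fir = [0.5, 0.5]               # hypothetical 2-tap averaging FIR filter

direct = convolve_direct(audio, fir)
spectral = convolve_via_dft(audio, fir)
print(direct)
print([round(v, 6) for v in spectral])
```

Both routes should agree up to floating-point rounding. If you drop the zero-padding, the tail of the result wraps around to the start, which is the usual sign of circular convolution.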
To convert the audio signal to the frequency domain, "stft" (short-time Fourier transform) can be used. Since stft applies an FFT to short windowed segments of the audio signal, the time-varying characteristics of the signal are preserved even when the signal is long.
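For a long signal, the same frequency-domain multiplication can be done block by block. The sketch below, again in plain Python, uses the overlap-add method with plain non-overlapping blocks. It is a simplified relative of a windowed STFT, not the stft function itself, and all names, the block size and the test data are assumptions made for the illustration. Each block is zero-padded, multiplied with the filter spectrum, transformed back, and the overlapping tails are summed.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform; fine for short demo blocks."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def filter_block(block, taps):
    """Filter one block by multiplying zero-padded spectra."""
    n = len(block) + len(taps) - 1
    padded_block = list(block) + [0.0] * (n - len(block))
    padded_taps = list(taps) + [0.0] * (n - len(taps))
    spectrum = [a * b for a, b in zip(dft(padded_block), dft(padded_taps))]
    return [v.real for v in dft(spectrum, inverse=True)]


def overlap_add(signal, taps, block_size):
    """Block-wise frequency-domain FIR filtering (overlap-add)."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for start in range(0, len(signal), block_size):
        block = signal[start:start + block_size]
        # Each filtered block is len(taps) - 1 samples longer than the
        # block itself; that tail overlaps the next block and is summed.
        for i, v in enumerate(filter_block(block, taps)):
            out[start + i] += v
    return out


def convolve_direct(signal, taps):
    """Reference result: linear convolution in the time domain."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(taps):
            out[i + j] += s * h
    return out


audio = [float(n % 7) for n in range(20)]  # hypothetical test signal
fir = [0.25, 0.5, 0.25]                    # hypothetical 3-tap FIR filter

reference = convolve_direct(audio, fir)
blockwise = overlap_add(audio, fir, block_size=8)
max_error = max(abs(a - b) for a, b in zip(reference, blockwise))
print(max_error)
```

With plain blocks like these, the block size affects only the amount of computation, not the result. Once a tapered analysis window is applied, as stft does, the window and overlap have to be chosen so that the windows sum back to a constant, otherwise the reconstructed output will differ from direct convolution.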
Feel free to experiment with the window size until the result matches what you expect.