Convolution filter details¶

Implementation Details¶

The convolution filter efficiently computes the convolution of two signals. The efficiency is achieved by employing the FFT and the circular convolution theorem. The algorithm is a variant of the overlap-add method. It works on a fixed block size \(B\) for arbitrarily long input signals. Thus, the convolution of a streaming input signal with a long FIR filter \(h[n]\) (where the length of \(h[n]\) may exceed the block size \(B\)) is computed with a fixed complexity \(O(B \log B)\).

More formally, the convolution filter computes \(y[n] = (x * h)[n]\) by partitioning the input \(x\) and filter \(h\) into blocks and applies the overlap-add method. Let \(x[n]\) be an input signal of arbitrary length. Often, \(x[n]\) is a streaming input with unknown length. Let \(h[n]\) be an FIR filter with \(M\) taps. The convolution filter works on a fixed block size \(B=2^b\).

First, the input and filter are windowed and shifted to the origin to give the \(k\)-th block input \(x_k[n] = x[n + kB] , n=\{0,1,\ldots,B-1\},\forall k\in\mathbb{Z}\) and \(j\)-th block filter \(h_j[n] = h[n + jB] ,n=\{0,1,\ldots,B-1\},j=\{0,1,\ldots,\lfloor M/B \rfloor\}\). The convolution \(y_{k,j}[n] = (x_k * h_j)[n]\) is efficiently computed with length \(2B\) FFTs as

\[ y_{k,j}[n] = \mathrm{IFFT}(\mathrm{FFT}(x_k[n])\cdot\mathrm{FFT}(h_j[n])) . \]

The overlap-add method sums the "overlap" from the previous block with the current block. To complete the \(k\)-th block, the contribution of all blocks of the filter are summed together to give

\[ y_{k}[n] = \sum_j y_{k-j,j}[n] . \]

The final convolution is then the sum of the shifted blocks

\[ y[n] = \sum_k y_{k}[n - kB] . \]

Note that \(y_k[n]\) is of length \(2B\) so its second half overlaps and adds into the first half of the \(y_{k+1}[n]\) block.

Maximum efficiency criterion¶

To avoid excess computation or maximize throughput, the convolution filter should be given input samples in multiples of the block size \(B\). Otherwise, the FFT of a block is computed twice as many times as would be necessary and hence throughput is reduced.