Discrete FIR Filter

Finite-impulse response filter

Libraries:
DSP HDL Toolbox / Filtering

Description

The Discrete FIR Filter block models finite-impulse response filter architectures optimized for HDL code generation. The block accepts scalar or frame-based input, supports multichannel input, and provides an option for programmable coefficients by using a parallel interface or a memory interface. The block provides a hardware-friendly interface with input and output control signals. To provide a cycle-accurate simulation of the generated HDL code, the block models architectural latency including pipeline registers and resource sharing.

The block provides three filter structures.

  • The direct form systolic architecture provides a fully parallel implementation that makes efficient use of Intel® and AMD® DSP blocks.

  • The direct form transposed architecture is a fully parallel implementation and is suitable for FPGA and ASIC applications.

  • The partly serial systolic architecture provides a configurable serial implementation that makes efficient use of FPGA DSP blocks.

For a filter implementation that matches multipliers, pipeline registers, and pre-adders to the DSP configuration of your FPGA vendor, specify your target device when you generate HDL code.

All single-channel filter structures remove multipliers for zero-valued coefficients, such as in half-band filters and Hilbert transforms. The block also provides an option to implement +/- 1 and power of 2 coefficients without a multiplier, and an option to implement all coefficients with CSD or factored-CSD logic. When you use scalar or multichannel input data, the filter shares multipliers for symmetric and antisymmetric coefficients. Frame-based filters do not implement symmetry optimization. Multichannel filters do not remove multipliers for zero-valued coefficients. Multichannel filters share resources between channels, even if the filter coefficients are different across the channels.

The latency between valid input data and the corresponding valid output data depends on the filter structure, serialization options, number of coefficients, and whether the coefficient values provide optimization opportunities. For details of structure and latency, see FIR Filter Architectures for FPGAs and ASICs.

Note

You can also generate HDL code for this hardware-optimized algorithm, without creating a Simulink® model, by using the DSP HDL IP Designer app. The app provides the same interface and configuration options as the Simulink block.

Ports

Input

Input data, specified as a scalar, column vector, or row vector of real or complex values. Use a column vector to increase throughput by processing samples in parallel.

You can use a row vector, [c1 c2 c3], to represent input samples for multiple channels on a single cycle, or you can provide scalar multichannel data with the channels interleaved: c1 data sample on cycle 1, c2 data sample on cycle 2, c3 data sample on cycle 3. The channels can have independent filter coefficients. (since R2023a)

Waveform of row vector and scalar multichannel data signals.
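
The two multichannel input formats carry the same samples in a different arrangement. As a minimal sketch in plain Python, independent of Simulink (the function name `interleave_channels` is hypothetical), the following converts a sequence of per-cycle row vectors into the equivalent interleaved scalar stream:

```python
def interleave_channels(frames):
    """Flatten per-cycle row vectors [c1, c2, ..., cK] into one scalar stream.

    Each inner list holds one sample per channel for a single cycle.
    The result lists the samples in the order c1, c2, ..., cK, c1, c2, ...
    """
    stream = []
    for frame in frames:
        stream.extend(frame)
    return stream


# Two cycles of 3-channel row-vector input.
frames = [[10, 20, 30], [11, 21, 31]]
print(interleave_channels(frames))  # [10, 20, 30, 11, 21, 31]
```

The interleaved form needs K cycles to deliver what the row vector delivers in one cycle, which is why the serialization settings differ between the two formats.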

In R2023a and R2023b, you can use multichannel row-vector input only if there are at least as many invalid cycles between inputs as there are channels. When the input is a multichannel vector, you must set Filter structure to Partly serial systolic and set Number of cycles to a value equal to or greater than the number of channels. This spacing allows the block to implement a partly serial architecture that shares resources between the channels.

Frame-based (column vector) input is not supported with multichannel coefficients. To obtain high throughput with multiple channels, you can use a For Each block to implement a separate high-throughput filter for each channel. This implementation cannot share resources between the channels.

The size of the row or column vector must be less than or equal to 64 elements. To implement a multichannel filter with more than 64 channels, you must use interleaved scalar input.

When the input data type is an integer type or a fixed-point type, the block uses fixed-point arithmetic for internal calculations and provides parameters on the Data Types tab to customize the data types. When the input data type is a floating-point type, the block uses that input floating-point type for internal calculations and the output data type.

The software supports double and single data types for simulation, but not for HDL code generation.

Data Types: fixed point | single | double | int8 | int16 | int32 | uint8 | uint16 | uint32
Complex Number Support: Yes

Control signal that indicates if the input data is valid. When valid is 1 (true), the block captures the values from the input data port. When valid is 0 (false), the block ignores the values from the input data port.

Data Types: Boolean

Filter coefficients, specified as a row vector of real or complex values. You can change the input coefficients at any time. When you use scalar input data, the size of the coefficient vector depends on the size and symmetry of the sample coefficients specified in the Coefficients prototype parameter. The prototype specifies a sample coefficient vector that is representative of the symmetry and zero-valued locations of the expected input coefficients. The block uses the prototype to optimize the filter by sharing multipliers for symmetric or antisymmetric coefficients, and by removing multipliers for zero-valued coefficients. Therefore, provide only the nonduplicate coefficients at the port. For example, if you set the Coefficients prototype parameter to a symmetric 14-tap filter, the block expects a vector of 7 values on the coeff input port. You must still provide zeros in the input coeff vector for the nonduplicate zero-valued coefficients.

When you use frame-based input data, the block does not optimize the filter for coefficient symmetry. The block still uses the Coefficients prototype to remove multipliers for zero-valued coefficients. At the coeff input port, specify a vector that is the same size as the prototype.
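
The sizing rule for the coeff port can be sketched as follows. This is an illustration only, not MathWorks code: the helper name `coeff_port_vector` is hypothetical, and the sketch assumes that the nonduplicate coefficients are the first half of the vector and that an odd-length filter keeps its center tap. The source states only the 14-tap to 7-value case.

```python
def coeff_port_vector(coeffs, shares_multipliers, frame_based=False):
    """Return the vector to drive on the coeff port (illustrative only).

    coeffs: full-length coefficient vector.
    shares_multipliers: True when the prototype is symmetric or antisymmetric.
    frame_based: True for column-vector input, where symmetry is not used.
    """
    if frame_based or not shares_multipliers:
        # Same size as the prototype.
        return list(coeffs)
    # Assumption: keep the first half, plus the center tap for odd lengths.
    return list(coeffs[: (len(coeffs) + 1) // 2])


taps = [1, 2, 3, 0, 5, 6, 7, 7, 6, 5, 0, 3, 2, 1]  # symmetric, 14 taps
print(coeff_port_vector(taps, shares_multipliers=True))
# [1, 2, 3, 0, 5, 6, 7]  (7 values, the zero is still supplied)
print(len(coeff_port_vector(taps, True, frame_based=True)))  # 14
```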

If the input data is a fixed-point type, the coeff values must also be of a fixed-point type. If the input data is a floating-point data type, the coeff values must be of the same data type.

The software supports double and single data types for simulation, but not for HDL code generation.

Dependencies

To enable this port, set Coefficients source to Input port (Parallel interface).

Data Types: single | double | int8 | int16 | int32 | uint8 | uint16 | uint32 | fixed point

Since R2023a

Filter coefficients, specified as a real or complex scalar value to write to internal memory. To load a single coefficient value to internal memory, specify a coeff value with a corresponding address on the caddr port and an enable signal on the cwren port. You can change the input coefficients at any time.

Waveform that shows writing a set of coefficients to the filter by using the memory interface

While you write new coefficients into memory, the block ignores any input data, but still returns dataOut with validOut until it clears the filter pipeline. The block resumes accepting input the cycle after cdone is set to 1 (true).

Waveform that shows the filter stops processing input data while receiving new coefficients on the memory interface

The coefficient memory has the same number of addresses as the size of the Coefficients prototype parameter. The prototype specifies a sample coefficient vector that is representative of the symmetry and zero-valued locations of the expected input coefficients. When you use scalar input data, the block uses the prototype to optimize the filter by sharing multipliers for symmetric or antisymmetric coefficients, and by removing multipliers for zero-valued coefficients. You must write the entire set of coefficients to memory, including symmetric and zero-valued coefficients. For example, if you set the Coefficients prototype parameter to a symmetric 14-tap filter, you must write 14 values to the memory interface.
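
A testbench stimulus for the memory interface could be built along these lines. The sketch below is plain Python rather than a Simulink API, the function name `memory_write_sequence` is hypothetical, and it assumes that addresses start at 0, that coefficients are written in ascending address order, and that cdone is asserted on the same cycle as the final write.

```python
def memory_write_sequence(coeffs):
    """Build one stimulus entry per cycle for the coeff/caddr/cwren/cdone ports.

    Assumptions: addresses start at 0 and increase by 1 per cycle, and
    cdone is high only on the cycle that writes the last coefficient.
    """
    last = len(coeffs) - 1
    return [
        {"coeff": value, "caddr": addr, "cwren": True, "cdone": addr == last}
        for addr, value in enumerate(coeffs)
    ]


# A symmetric 14-tap filter still needs all 14 values written.
taps = [1, 2, 3, 4, 5, 6, 7, 7, 6, 5, 4, 3, 2, 1]
for cycle in memory_write_sequence(taps)[-2:]:
    print(cycle)
# {'coeff': 2, 'caddr': 12, 'cwren': True, 'cdone': False}
# {'coeff': 1, 'caddr': 13, 'cwren': True, 'cdone': True}
```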

When you use frame-based input data, the block does not optimize the filter for coefficient symmetry. The block still uses the Coefficients prototype parameter to remove multipliers for zero-valued coefficients. The coefficient memory has the same number of locations as the size of the prototype.

If the input data is a fixed-point type, the coeff values must also be of a fixed-point type. If the input data is a floating-point data type, the coeff values must be of the same data type.

The software supports double and single data types for simulation, but not for HDL code generation.

Dependencies

To enable this port, set Coefficients source to Input port (Memory interface).

Data Types: single | double | int8 | int16 | int32 | uint8 | uint16 | uint32 | fixed point

Since R2023a

Specify the filter coefficient address as a scalar integer value represented as an unsigned fixed-point type with zero fractional bits. The block derives the size of this integer value, and the size of the internal memory, from the number of unique coefficients in the Coefficients prototype parameter value.

Dependencies

To enable this port, set Coefficients source to Input port (Memory interface).

Data Types: fixdt(0,N,0)

Since R2023a

Set this input to 1 (true) to write the value on the coeff port into the caddr location in internal memory.

Dependencies

To enable this port, set Coefficients source to Input port (Memory interface).

Data Types: Boolean

Since R2023a

Set this input to 1 (true) to indicate that the current port values write the final coefficient value to memory.

Dependencies

To enable this port, set Coefficients source to Input port (Memory interface).

Data Types: Boolean

Control signal that clears internal states. When reset is 1 (true), the block stops the current calculation and clears internal states. When reset is 0 (false) and the input valid is 1 (true), the block captures data for processing.

For more reset considerations, see the Reset Signal section on the Hardware Control Signals page.

Dependencies

To enable this port, on the Control Ports tab, select Enable reset input port.

Data Types: Boolean

Output

Filtered output data, returned as a scalar, column vector, or row vector of real or complex values. The dimensions of the output match the dimensions of the input. When the input data type is a floating-point type, the output data inherits the data type of the input data. When the input data type is an integer type or a fixed-point type, the Output parameter on the Data Types tab controls the output data type.

Data Types: fixed point | single | double
Complex Number Support: Yes

Control signal that indicates if the data from the output data port is valid. When valid is 1 (true), the block returns valid data from the output data port. When valid is 0 (false), the values from the output data port are not valid.

Data Types: Boolean

Control signal that indicates the block is ready for a new input data sample on the next cycle. When ready is 1 (true), you can specify the data and valid inputs for the next time step. When ready is 0 (false), the block ignores any input data in the next time step.

When using the partly serial architecture, the block processes one sample at a time. If your design waits for this block to return ready set to 0 (false) before setting the input valid to 0 (false), then one additional cycle of input data arrives at the port. The block stores this additional data while processing the current data, and does not set ready to 1 (true) until your model processes the additional input data.

Dependencies

To enable this port, set Filter structure to Partly serial systolic.

Data Types: Boolean

Parameters

Main

You can enter constant filter coefficients as a parameter, provide time-varying filter coefficients by using an input port, or provide time-varying coefficients by using a memory-style interface.

You cannot use programmable coefficients with multichannel data.

When you select Input port (Parallel interface), the coeff port appears on the block.

When you select Input port (Memory interface), a memory-style interface appears on the block. This interface includes the coeff, caddr, cwren, and cdone ports.

Selecting Input port (Parallel interface) or Input port (Memory interface) enables the Coefficients prototype parameter. Specify a prototype to enable the block to optimize the filter implementation according to the values of the coefficients.

When you use programmable coefficients with frame-based input, the block does not optimize the filter for coefficient symmetry. Also, the output after a change of coefficient values might not match the output in the scalar case exactly. This difference occurs because the subfilter calculations are performed at different times relative to the input coefficient values, compared with the scalar implementation.

Dependencies

Before R2023b: To use Input port (Parallel interface), set the Filter structure parameter to Direct form systolic or Direct form transposed.

Discrete FIR filter coefficients, specified as a row vector of real or complex values. You can specify multichannel coefficients with a K-by-L matrix of real or complex values, where K is the number of channels and L is the filter length. To enable symmetry optimization, the symmetry characteristics of all channels must align. For example, if one channel is even-symmetric, all channels must be even-symmetric.

You can also specify the coefficients as a workspace variable or as a call to a filter design function. When the input data type is a floating-point type, the block casts the coefficients to the same data type as the input. When the input data type is an integer type or a fixed-point type, you can set the data type of the coefficients on the Data Types tab.

Example: firpm(30,[0 0.1 0.2 0.5]*2,[1 1 0 0])

Dependencies

To enable this parameter, set Coefficients source to Property.

Prototype filter coefficients, specified as a vector of real or complex values. The prototype specifies a sample coefficient vector that is representative of the symmetry and zero-valued locations of the expected input coefficients. If all input coefficient vectors have the same symmetry and zero-valued coefficient locations, set Coefficients prototype to one of those vectors. The block uses the prototype to optimize the filter by sharing multipliers for symmetric or antisymmetric coefficients, and by removing multipliers for zero-valued coefficients.

When you use frame-based input data, the block does not optimize the filter for coefficient symmetry. The block still uses the Coefficients prototype parameter to remove multipliers for zero-valued coefficients.
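
To check what a candidate prototype conveys, you can inspect its symmetry and zero locations before entering it. The following is a minimal sketch in plain Python. The function name `describe_prototype` is hypothetical, and the exact-equality test is an assumption about how symmetry is judged.

```python
def describe_prototype(prototype):
    """Classify a prototype as symmetric, antisymmetric, or neither,
    and list the indices (0-based) of its zero-valued coefficients."""
    n = len(prototype)
    # Pair each coefficient with its mirror image about the center.
    pairs = [(prototype[i], prototype[n - 1 - i]) for i in range(n // 2)]
    # An odd-length antisymmetric filter must have a zero center tap.
    middle_is_zero = n % 2 == 0 or prototype[n // 2] == 0
    if pairs and all(a == b for a, b in pairs):
        symmetry = "symmetric"
    elif pairs and middle_is_zero and all(a == -b for a, b in pairs):
        symmetry = "antisymmetric"
    else:
        symmetry = "none"
    zero_locations = [i for i, c in enumerate(prototype) if c == 0]
    return symmetry, zero_locations


print(describe_prototype([0, -1, 0, 5, 8, 5, 0, -1, 0]))
# ('symmetric', [0, 2, 6, 8])
print(describe_prototype(list(range(1, 9))))
# ('none', [])
```

The second call shows that a ramp of distinct nonzero values has neither symmetry nor zeros, so it gives the block no optimization opportunities.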

Coefficient source: Input port (Parallel interface)

  • Input size: When you use scalar input data, coefficient optimizations affect the expected size of the vector on the coeff port. Provide only the nonduplicate coefficients at the port. For example, if you set the Coefficients prototype parameter to a symmetric 14-tap filter, the block shares one multiplier between each pair of duplicate coefficients, so the block expects a vector of 7 values on the coeff port. You must still provide zeros in the input coeff vector for the nonduplicate zero-valued coefficients.

    When you use frame-based input data, specify a coeff vector that is the same size as the prototype.

  • If no prototype: If your coefficients are unknown or not expected to share symmetry or zero-valued locations, you can set Coefficients prototype to [].

Coefficient source: Input port (Memory interface)

  • Input size: Write the same number of coefficient values as the size of the prototype.

  • If no prototype: Coefficients prototype cannot be empty. The block uses the prototype to determine the size of the coefficient memory. If your coefficients are unknown or not expected to share symmetry or zero-valued locations, set Coefficients prototype to a vector that has the same length as your expected coefficients and contains no symmetry or zero values, for example [1:1:NumCoeffs].

Dependencies

To enable this parameter, set Coefficients source to Input port (Parallel interface) or Input port (Memory interface).

Specify the HDL filter architecture as one of these structures:

  • Direct form systolic — This architecture provides a fully parallel filter implementation that makes efficient use of Intel and AMD DSP blocks. For architecture details, see Fully Parallel Systolic Architecture. When you specify multichannel coefficients with this architecture (with interleaved input samples), the block interleaves the channel coefficients over a single parallel filter.

  • Direct form transposed — This architecture is a fully parallel implementation that is suitable for FPGA and ASIC applications. For architecture details, see Fully Parallel Transposed Architecture. When you specify multichannel coefficients with this architecture (with interleaved input samples), the block interleaves the channel coefficients over a single parallel filter.

  • Partly serial systolic — This architecture provides a serial filter implementation and options for tradeoffs between throughput and resource utilization. The architecture makes efficient use of Intel and AMD DSP blocks. The block implements a serial L-coefficient filter with M multipliers and requires input samples that are at least N cycles apart, such that L = N×M. You can specify either M or N. For this implementation, the block provides the output ready port which indicates when the block is ready for new input data. For architecture details, see Partly Serial Systolic Architecture (1 < N < L) and Fully Serial Systolic Architecture (N ≥ L). You cannot use frame-based input with the partly serial architecture.

    When you specify multichannel coefficients with a serial architecture, you must specify the serialization factor as the number of cycles between valid input samples.

    For multichannel input that is scalar and interleaved over the channels, the block implements these serial architectures:

    • When N < L: Partly serial filter with L/N multipliers.

    • When N >= L: Fully serial filter.

    For multichannel input that is a 1-by-K vector, where K is the number of channels, the block implements these serial architectures:

    • When N = 1: Filter bank of fully parallel filters.

    • When 1 < N < K: Filter bank of partly serial filters. (since R2024a)

    • When N = K: Fully parallel filter with channel coefficients interleaved.

    • When K < N < L×K: Partly serial filter with L×K/N multipliers.

    • When N >= L×K: Fully serial filter.
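
The serial architecture rules listed above can be summarized as a small decision function. This is a sketch in plain Python for reasoning about configurations, not a MathWorks API. The function name `serial_architecture` is hypothetical, and the sketch ignores the symmetry optimization and any release-specific restrictions.

```python
def serial_architecture(num_cycles, filt_len, num_channels=None):
    """Name the architecture implied by N (num_cycles), L (filt_len), and K.

    Pass num_channels=None for scalar input interleaved over the channels,
    or K for 1-by-K row-vector input.
    """
    n, length = num_cycles, filt_len
    if num_channels is None:
        return "partly serial" if n < length else "fully serial"
    k = num_channels
    if n == 1:
        return "filter bank of fully parallel filters"
    if n < k:
        return "filter bank of partly serial filters"
    if n == k:
        return "fully parallel filter with interleaved channel coefficients"
    if n < length * k:
        return "partly serial"
    return "fully serial"


# L = 16 coefficients, K = 4 channels, row-vector input.
for n in (1, 2, 4, 8, 64):
    print(n, serial_architecture(n, 16, 4))
```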

If any filter is symmetric, the architecture shares multipliers for matching coefficients, so effectively L becomes L/2. To enable the symmetry optimization for multichannel filters, the symmetry characteristics of all channels must align.

All single-channel implementations remove multipliers for zero-valued coefficients. Multichannel filters do not optimize for zero-valued coefficients. When you use scalar or multichannel input data, the filter shares multipliers for symmetric and antisymmetric coefficients. Frame-based filters do not implement symmetry optimization. Multichannel filters share resources between channels, even if the filter coefficients are different across the channels.

You can specify the rule that the block uses to serialize the filter as either:

  • Minimum number of cycles between valid input samples — Specify a requirement for input data timing using the Number of cycles parameter.

  • Maximum number of multipliers — Specify a requirement for resource usage using the Number of multipliers parameter. This option is not supported when you have multichannel coefficients.

For a filter with L coefficients, the block implements a serial filter with not more than M multipliers and requires input samples that are at least N cycles apart, such that L = N×M. The block might remove multipliers when it applies coefficient optimizations, so the actual M or N value of the filter implementation might be lower than the specified value.
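
The relationship L = N×M can be explored numerically. The sketch below is plain Python with hypothetical function names. It assumes that a non-integer quotient rounds up, which the text does not state, and it ignores coefficient optimizations that can lower the actual values.

```python
import math


def multipliers_for_cycles(filt_len, num_cycles):
    """Estimate M from N, assuming M = ceil(L / N) (rounding is an assumption)."""
    if math.isinf(num_cycles) or num_cycles >= filt_len:
        return 1  # fully serial: one multiplier
    return -(-filt_len // num_cycles)  # integer ceiling division


def cycles_for_multipliers(filt_len, num_multipliers):
    """Estimate N from M, assuming N = ceil(L / M) (rounding is an assumption)."""
    return -(-filt_len // num_multipliers)


print(multipliers_for_cycles(32, 8))             # 4
print(multipliers_for_cycles(32, float("inf")))  # 1
print(cycles_for_multipliers(32, 4))             # 8
```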

If the filter is symmetric, the architecture shares multipliers for matching coefficients, so the effective value of L is L/2.

When you use complex input data and/or complex coefficients with a single-channel partly serial architecture, the block implements complex interleaving to share the multipliers over inactive input cycles. For complex input and complex coefficients, the block needs at least L×3 cycles to implement the filter with a single multiplier. For complex input with real coefficients or complex coefficients with real input, the block needs at least L×2 cycles to implement the filter with a single multiplier. (since R2023b)
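
The cycle counts for a single-multiplier implementation follow directly from the complexity of the data and coefficients. A minimal sketch in plain Python (the function name `min_cycles_single_multiplier` is hypothetical, and the symmetry optimization is not modeled):

```python
def min_cycles_single_multiplier(filt_len, complex_input, complex_coeffs):
    """Minimum cycles between inputs for a one-multiplier filter of length L."""
    if complex_input and complex_coeffs:
        return 3 * filt_len
    if complex_input or complex_coeffs:
        return 2 * filt_len
    return filt_len  # real data and real coefficients


print(min_cycles_single_multiplier(16, True, True))    # 48
print(min_cycles_single_multiplier(16, True, False))   # 32
print(min_cycles_single_multiplier(16, False, False))  # 16
```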

Dependencies

To enable this parameter, set the Filter structure parameter to Partly serial systolic.

Serialization requirement for input timing, specified as a positive integer. This parameter represents N, the minimum number of cycles between valid input samples. In this case, the block calculates M = L/N. To implement a fully serial architecture, set Number of cycles to a value greater than the filter length, L, or to Inf. To implement a fully serial architecture for a multichannel filter with 1-by-K vector input, set Number of cycles to a value greater than L×K, where K is the number of channels.

To implement a fully serial architecture for a single channel filter with complex input and complex coefficients, set Number of cycles greater than L×3. If you have complex input with real coefficients or complex coefficients with real input, set Number of cycles greater than L×2.

If the filter is symmetric, the architecture shares multipliers for matching coefficients, so the effective value of L is L/2. To enable the symmetry optimization for multichannel filters, the symmetry characteristics of all channels must align.

The block might remove multipliers when it applies coefficient optimizations, so the actual M and N values of the filter can be lower than the value you specified.

Dependencies

To enable this parameter, set Filter structure to Partly serial systolic and set Specify serialization factor as to Minimum number of cycles between valid input samples.

Serialization requirement for resource usage, specified as a positive integer. This parameter represents M, the maximum number of multipliers in the filter implementation. In this case, the block calculates N = L/M. If the input data is complex, the block allocates floor(M/2) multipliers for the real part of the filter and floor(M/2) multipliers for the imaginary part of the filter. To implement a fully serial architecture, set Number of multipliers to 1.

If the filter is symmetric, the architecture shares multipliers for matching coefficients, so the effective value of L is L/2.

When you use complex input data and/or complex coefficients with a single-channel partly serial architecture, the block implements complex interleaving to share the multipliers over inactive input cycles. For complex input and complex coefficients, the block needs at least L×3 cycles to implement the filter with a single multiplier. For complex input with real coefficients or complex coefficients with real input, the block needs at least L×2 cycles to implement the filter with a single multiplier.

The block might remove multipliers when it applies coefficient optimizations, so the actual M and N values of the filter might be lower than the specified value.

Dependencies

To enable this parameter, set the Filter structure to Partly serial systolic, and set Specify serialization factor as to Maximum number of multipliers.

You cannot use this parameter when you specify multichannel coefficients. Use the Number of cycles parameter instead.

Data Types

Rounding mode for type-casting the output to the data type specified by the Output parameter. When the input data type is floating point, the block ignores this parameter. For more details, see Rounding Modes.

Overflow handling for type-casting the output to the data type specified by the Output parameter. When the input data type is floating point, the block ignores this parameter. For more details, see Overflow Handling.

When the input is a fixed-point or integer type, the block casts the filter coefficients using the rule or data type in this parameter. The quantization rounds to the nearest representable value and saturates on overflow. When the input data type is a floating-point type, the block ignores this parameter and all internal arithmetic uses the same data type as the input.

The recommended setting for this parameter is Inherit: Same word length as input.

The block returns a warning or error if:

  • The coefficients data type does not have enough fractional length to represent the coefficients accurately.

  • The coefficients data type is unsigned and the coefficients include negative values.
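
To see how a coefficient data type affects the stored values, the round-to-nearest and saturate behavior can be approximated as below. This is a sketch in plain Python, not the block's implementation. The function name `quantize` is hypothetical, and rounding ties toward positive infinity is an assumption.

```python
import math


def quantize(value, word_length, fraction_length, signed=True):
    """Round to the nearest representable fixed-point value and saturate.

    Assumption: ties round toward positive infinity.
    """
    scale = 2 ** fraction_length
    if signed:
        lo, hi = -(2 ** (word_length - 1)), 2 ** (word_length - 1) - 1
    else:
        lo, hi = 0, 2 ** word_length - 1
    stored = math.floor(value * scale + 0.5)  # round to nearest integer
    stored = max(lo, min(hi, stored))         # saturate on overflow
    return stored / scale


print(quantize(0.3, 8, 7))                  # 0.296875
print(quantize(1.0, 8, 7))                  # 0.9921875 (saturated)
print(quantize(-0.25, 8, 7, signed=False))  # 0.0 (negative value lost)
```

The last call illustrates the second condition in the list above: an unsigned type cannot represent a negative coefficient.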

Dependencies

To enable this parameter, set Coefficients source to Property.

When the input is a fixed-point or integer type, the block casts the output of the filter using the rule or data type in this parameter. The quantization uses the settings of the Rounding mode and Overflow mode parameters. When the input data type is floating point, the block ignores this parameter and returns output in the same data type as the input.

The block increases the word length for full precision inside each filter tap and casts the final output to the specified type. The maximum final internal data type (WF) depends on the input data type (WI), the coefficient data type (WC), and the number of coefficients (L), and is given by

WF = WI + WC + ceil(log2(L)).

When you specify a fixed set of coefficients, the actual full-precision internal word length is usually smaller than WF, because the coefficient values limit the potential growth.

When you use programmable coefficients, the block cannot calculate the dynamic range, and the internal data type is always WF.
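
As a worked instance of the WF formula, the following plain-Python sketch computes the maximum internal word length. The function name `max_internal_word_length` is hypothetical.

```python
def max_internal_word_length(wi, wc, num_coeffs):
    """WF = WI + WC + ceil(log2(L)) for L >= 1.

    (L - 1).bit_length() equals ceil(log2(L)) for positive integers and
    avoids floating-point rounding.
    """
    return wi + wc + (num_coeffs - 1).bit_length()


# 16-bit input, 16-bit coefficients, 31 coefficients.
print(max_internal_word_length(16, 16, 31))  # 37
```

With programmable coefficients the block always uses this width. With fixed coefficients, the actual width is usually smaller.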

Control Ports

Select this check box to enable the reset input port. The reset signal implements a local synchronous reset of the data path registers.

For more reset considerations, see the Reset Signal section on the Hardware Control Signals page.

Select this check box to connect the generated HDL global reset signal to the data path registers. This parameter does not change the appearance of the block or modify simulation behavior in Simulink. When you clear this check box, the generated HDL global reset clears only the control path registers. The generated HDL global reset can be synchronous or asynchronous depending on the HDL Code Generation > Global Settings > Reset type parameter in the model Configuration Parameters.

For more reset considerations, see the Reset Signal section on the Hardware Control Signals page.

Implementation

Since R2023b

By default, the block implements coefficient multipliers using a hardware multiplier. Select CSD/Factored-CSD to replace coefficient multipliers with a CSD or factored-CSD implementation. A CSD or factored-CSD implementation uses shift and add operations rather than multipliers. When you select CSD, coefficients of +/- 1 and power of 2 are also implemented with shift logic.

The latency of the block does not change with multiplier implementation. Each multiplier has the same number of pipeline stages around it in either implementation.
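
For readers unfamiliar with canonical signed digit (CSD) form, the sketch below derives the signed digits of an integer coefficient using the textbook non-adjacent-form recoding. It is plain Python for illustration only. The function name `csd_digits` is hypothetical, and the block's actual CSD and factored-CSD logic generation is not described in this page.

```python
def csd_digits(value):
    """Return the CSD digits (-1, 0, or +1) of an integer, least significant first.

    Each nonzero digit corresponds to one shifted add or subtract, and
    no two adjacent digits are both nonzero.
    """
    digits = []
    while value != 0:
        if value & 1:
            # +1 when value mod 4 is 1, -1 when value mod 4 is 3.
            digit = 2 - (value & 3)
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits


print(csd_digits(7))   # [-1, 0, 0, 1], that is, 7 = 8 - 1
print(csd_digits(23))  # [-1, 0, 0, -1, 0, 1], that is, 23 = 32 - 8 - 1
```

A plain binary multiply by 7 needs three shifted terms, while the CSD form needs two, which is the source of the logic savings.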

Dependencies

To enable this parameter, set the Filter structure parameter to Direct form transposed. Using CSD multipliers with systolic architecture is not supported because it can prevent efficient use of FPGA DSP blocks.

CSD implementations are not supported for multichannel or programmable filters.

Since R2023b

By default, the block implements special-value coefficient multipliers using a hardware multiplier. Clear this check box to replace special-value coefficient multipliers with a shift implementation.

Dependencies

To enable this parameter, set Filter structure to Direct form transposed, and set Coefficient multiplication to Multiplier, or set Filter structure to Direct form systolic.

CSD implementations are not supported for multichannel or programmable filters.

Algorithms

The filter architectures for the Discrete FIR Filter block are shared with other filter blocks and described in detail on the FIR Filter Architectures for FPGAs and ASICs page.

This flow chart shows the Discrete FIR Filter block architecture for multichannel coefficients, that is, when you set the Coefficients parameter to a K-by-L matrix.

If the filter is symmetric, the architecture shares multipliers for matching coefficients, so the effective value of L is L/2. To enable the symmetry optimization, the symmetry characteristics of all channels must align. For example, if one channel is even-symmetric, all channels must be even-symmetric.

The sections below show the hardware resources and synthesized clock speed for the Discrete FIR Filter block configured with each filter architecture.

Version History

Introduced in R2017a
