CREPE Preprocess

Preprocess audio for CREPE deep pitch estimation

Since R2023a

Libraries:
Audio Toolbox / Deep Learning

Description

The CREPE Preprocess block generates frames from the input audio signal that you can feed to a CREPE pretrained network or to a network that accepts the same inputs as CREPE.

Examples

Estimate Pitch Using CREPE Blocks

This example shows how to use the CREPE blocks to combine preprocessing, network inference, and postprocessing and obtain pitch estimations from an audio signal. See Estimate Pitch Using Deep Pitch Estimator Block for an example that uses the Deep Pitch Estimator block to perform the same task.

Adjust the parameters of the blocks to speed up computation and see the pitch estimations in real time as the audio plays.

Set the Overlap percentage (%) of the CREPE Preprocess block to 50. A lower overlap percentage increases the spacing between consecutive frames, so the system generates and processes fewer frames per second of audio.
Set the Number of output frames of the CREPE Preprocess block to 5. This causes the CREPE Preprocess block to buffer audio frames and pass them to the CREPE block in batches. Passing batches to the CREPE block improves computational efficiency by allowing it to process multiple frames in parallel. However, it also increases latency because the system outputs pitch estimations in batches instead of one at a time.
Set the Model capacity of the CREPE block to Large. This model has fewer parameters than the full-size model, leading to faster computation at the cost of slightly lower accuracy.
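
The effect of these settings on timing can be estimated with a short calculation. The following Python sketch is illustrative only and is not part of the block: it assumes 1024-sample frames at 16 kHz (taken from the output size and default sample rate documented on this page), and it assumes that the hop between frames is the frame length scaled by the non-overlapping fraction and rounded to the nearest sample. The function names `hop_length` and `batch_interval_ms` are hypothetical, and the rounding the block actually uses may differ.

```python
FRAME_LENGTH = 1024       # samples per frame, from the 1024-by-1-by-1-by-N output size
NETWORK_RATE_HZ = 16_000  # sample rate assumed for the generated frames


def hop_length(overlap_percent):
    """Samples between the starts of consecutive frames (rounding is an assumption)."""
    if not 0 <= overlap_percent < 100:
        raise ValueError("overlap_percent must be in the range [0, 100)")
    return max(1, round(FRAME_LENGTH * (1 - overlap_percent / 100)))


def batch_interval_ms(overlap_percent, num_frames):
    """Audio time, in milliseconds, covered by the hops of one output batch."""
    return 1000 * num_frames * hop_length(overlap_percent) / NETWORK_RATE_HZ


# Compare the default settings with the settings used in this example.
for overlap, frames in [(85, 1), (50, 1), (50, 5)]:
    print(overlap, frames, hop_length(overlap), batch_interval_ms(overlap, frames))
```

Under these assumptions, a 50% overlap with 5 output frames yields one batch roughly every 160 ms of audio, compared with roughly 10 ms per frame at the default settings. This is the latency trade-off described above.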

Run the model to listen to a singing voice and view the estimated pitch in real time.

Ports

Input

Port_1 — Audio input
vector

Audio input, specified as a one-channel signal (vector). If Sample rate of input signal (Hz) is 16e3, there are no restrictions on the input frame length. If Sample rate of input signal (Hz) is different from 16e3, then the input frame length must be a multiple of the decimation factor of the resampling operation that the block performs. If the input frame length does not satisfy this condition, the block generates an error message with information on the decimation factor.

Data Types: single | double
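
To anticipate the frame-length condition for the audio input, you can estimate the decimation factor before running the model. The Python sketch below assumes that the block resamples to 16 kHz using the exact rational ratio between the two rates, reduced to lowest terms. This is an assumption, not documented behavior, so treat the factor reported in the block's error message as authoritative. The helper names are hypothetical.

```python
from math import gcd

TARGET_RATE_HZ = 16_000  # rate at which the block imposes no frame-length restriction


def resampling_factors(input_rate_hz):
    """Return (interpolation, decimation) for an exact rational rate change.

    Assumes the ratio TARGET_RATE_HZ / input_rate_hz is reduced to lowest terms.
    """
    common = gcd(input_rate_hz, TARGET_RATE_HZ)
    return TARGET_RATE_HZ // common, input_rate_hz // common


def is_valid_frame_length(frame_length, input_rate_hz):
    """Check whether the frame length is a multiple of the assumed decimation factor."""
    _, decimation = resampling_factors(input_rate_hz)
    return frame_length % decimation == 0


for rate in (16_000, 44_100, 48_000):
    print(rate, resampling_factors(rate))
```

With this assumption, a 44.1 kHz input gives a decimation factor of 441, so a frame length of 882 samples would satisfy the condition while 1024 would not. At 16 kHz the factor is 1, so any frame length passes.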

Output

Port_1 — Preprocessed audio frames
4-D array

Preprocessed audio frames for the CREPE neural network, returned as a 1024-by-1-by-1-by-N array, where N is the number of generated frames specified by Number of output frames.

Data Types: single
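
The layout of the output can be pictured as N overlapping 1024-sample slices of the audio. The following Python sketch shows only that slicing step, under the assumption that the hop between frames is the frame length scaled by the non-overlapping fraction and rounded to the nearest sample. It omits resampling and any other processing the block applies, and `make_frames` is a hypothetical helper rather than a toolbox function.

```python
FRAME_LENGTH = 1024  # first dimension of the documented output array


def make_frames(samples, overlap_percent, num_frames):
    """Slice num_frames overlapping frames of FRAME_LENGTH samples from a 16 kHz signal."""
    hop = max(1, round(FRAME_LENGTH * (1 - overlap_percent / 100)))  # assumed rounding
    needed = FRAME_LENGTH + (num_frames - 1) * hop
    if len(samples) < needed:
        raise ValueError(f"need at least {needed} samples, got {len(samples)}")
    return [list(samples[i * hop : i * hop + FRAME_LENGTH]) for i in range(num_frames)]


# A synthetic ramp stands in for real audio so the overlap is easy to inspect.
signal = [n / 16_000 for n in range(4096)]
frames = make_frames(signal, overlap_percent=50, num_frames=5)

# The block arranges the same data as a 1024-by-1-by-1-by-N array.
shape = (len(frames[0]), 1, 1, len(frames))
print(shape)
```

In this sketch, the second half of each frame is identical to the first half of the next one, which is what a 50% overlap means. The list is ordered frame by frame, whereas the block places the frame index in the fourth dimension.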

Parameters

Sample rate of input signal (Hz) — Sample rate of input signal in Hz
`16e3` (default) | positive scalar

Sample rate of the input signal in Hz, specified as a positive scalar.

Overlap percentage (%) — Overlap percentage between consecutive frames
`85` (default) | [0, 100)

Specify the overlap percentage between consecutive frames as a scalar in the range [0, 100).

Number of output frames — Number of generated frames
`1` (default) | positive integer

Number of generated frames in the output, specified as a positive integer.

Block Characteristics

Data Types	`double` | `single`
Direct Feedthrough	`no`
Multidimensional Signals	`no`
Variable-Size Signals	`no`
Zero-Crossing Detection	`no`

References

[1] Kim, Jong Wook, Justin Salamon, Peter Li, and Juan Pablo Bello. “Crepe: A Convolutional Representation for Pitch Estimation.” In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 161–65. Calgary, AB: IEEE, 2018. https://doi.org/10.1109/ICASSP.2018.8461329.

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using Simulink® Coder™.
