Pruning

Prune network filters using a first-order Taylor approximation to reduce the number of learnable parameters

Prune filters from convolution layers by using a first-order Taylor approximation. You can then generate C/C++ or CUDA® code from the pruned network.

For a detailed overview of the compression techniques available in Deep Learning Toolbox™ Model Compression Library, see Reduce Memory Footprint of Deep Neural Networks.

Figure: Simplified illustration of pruning. A fully connected network with layers of four, three, and four neurons is shown before and after pruning, which removes one neuron from the middle layer and two from the final layer.

Functions

taylorPrunableNetwork - Neural network suitable for compression using Taylor pruning (Since R2022a)
forward - Compute deep learning network output for training
predict - Compute deep learning network output for inference
updatePrunables - Remove filters from prunable layers based on importance scores (Since R2022a)
updateScore - Compute and accumulate Taylor-based importance scores for pruning (Since R2022a)
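The functions above are typically combined in a custom pruning loop: accumulate Taylor importance scores over mini-batches with forward and updateScore, then remove the lowest-scoring filters with updatePrunables. The sketch below illustrates that loop; the variables net, mbq, and the modelLoss helper are placeholders assumed for illustration, and exact argument orders should be checked against the function reference pages.

```matlab
% Sketch of a Taylor pruning loop. Assumes a trained dlnetwork `net` and a
% minibatchqueue `mbq` of labeled training data (both placeholders).
prunableNet = taylorPrunableNetwork(net);   % wrap network for Taylor pruning

for pruningIteration = 1:30
    % Accumulate Taylor-based importance scores over the training data.
    shuffle(mbq);
    while hasdata(mbq)
        [X,T] = next(mbq);
        [loss,pruningActivations,pruningGradients] = ...
            dlfeval(@modelLoss,prunableNet,X,T);
        prunableNet = updateScore(prunableNet,pruningActivations,pruningGradients);
    end
    % Remove the lowest-scoring filters from the prunable layers.
    prunableNet = updatePrunables(prunableNet,MaxToPrune=8);
    % A few epochs of fine-tuning usually follow each pruning step (omitted).
end

function [loss,pruningActivations,pruningGradients] = modelLoss(net,X,T)
    % `forward` on a taylorPrunableNetwork also returns pruning activations.
    [Y,~,pruningActivations] = forward(net,X);
    loss = crossentropy(Y,T);
    pruningGradients = dlgradient(loss,pruningActivations);
end
```

After pruning converges to the target size, convert the result back to a regular network with dlnetwork(prunableNet) before retraining and code generation.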

Topics

Featured Examples