compressNetworkUsingTaylorPruning

Compress neural network using Taylor pruning

Since R2026a

    Description

    Add-On Required: This feature requires the Deep Learning Toolbox Model Compression Library add-on.

    The compressNetworkUsingTaylorPruning function reduces the number of learnable parameters in a neural network by pruning the least important filters in convolutional layers.

    The compressNetworkUsingTaylorPruning function prunes a network iteratively by repeating these steps:

    1. Compute the importance score of each prunable filter.

    2. Prune the least important filters.

    3. Fine-tune the pruned network.

    Tip

    If you cannot train your network using the trainnet function, then create a custom pruning loop by using the taylorPrunableNetwork function instead.
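    For illustration, a custom pruning loop built on taylorPrunableNetwork might look like the following sketch. The modelLoss function and the minibatchqueue mbq are assumptions defined elsewhere, and the fine-tuning step is omitted for brevity:

```matlab
% Hedged sketch of a custom pruning loop; modelLoss and mbq are assumptions.
prunableNet = taylorPrunableNetwork(net);   % convert the dlnetwork for pruning

while prunableNet.NumPrunables > 0          % in practice, stop at your compression goal
    % Accumulate Taylor importance scores over the mini-batches.
    while hasdata(mbq)
        [X,T] = next(mbq);
        [~,pruningActivations,pruningGradients] = dlfeval(@modelLoss,prunableNet,X,T);
        prunableNet = updateScore(prunableNet,pruningActivations,pruningGradients);
    end
    % Remove the least important filters, then fine-tune (fine-tuning omitted here).
    prunableNet = updatePrunables(prunableNet);
    reset(mbq)
end

netPruned = dlnetwork(prunableNet);   % convert back to a dlnetwork
```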

    netPruned = compressNetworkUsingTaylorPruning(net,data,lossFcn,options) prunes a trained neural network net by using targets and predictors specified by data, and by using the training options options for fine-tuning.

    netPruned = compressNetworkUsingTaylorPruning(net,predictors,targets,lossFcn,options) prunes the network by using the predictors predictors and targets targets.

    netPruned = compressNetworkUsingTaylorPruning(___,Name=Value) specifies additional options using one or more name-value arguments. For example, compressNetworkUsingTaylorPruning(net,data,lossFcn,options,LearnablesReductionGoal=0.3) tries to remove 30% of learnable parameters.

    [netPruned,info] = compressNetworkUsingTaylorPruning(___) also returns information about the pruning process, such as the number of learnables in each pruning iteration and the training loss in each fine-tuning iteration. You can use this syntax with any of the input argument combinations in the previous syntaxes.

    Examples

    This example shows how to compress a trained neural network using Taylor pruning.

    Load Data

    Load a pretrained network and the data and training options used to train it. To learn how this network was trained, see Train Sequence Classification Network for Road Damage Detection.

    load("RoadDamageAnalysisNetwork.mat")
    loadAndPreprocessDataForRoadDamageDetectionExample

    Prune Network

    Compress the network using Taylor pruning. Use the same training predictors and targets, loss function, and training options used to train the network.

    [netPruned,info] = compressNetworkUsingTaylorPruning(netTrained,XTrain,TTrain,lossFcn,options)

    Compressed network has 79.6% fewer learnable parameters.
    Pruning compressed 5 layers: "conv1d_1","batchnorm_1","conv1d_2","batchnorm_2","fc_1"
    
    netPruned = 
      dlnetwork with properties:
    
             Layers: [14×1 nnet.cnn.layer.Layer]
        Connections: [13×2 table]
         Learnables: [12×3 table]
              State: [4×3 table]
         InputNames: {'sequenceinput'}
        OutputNames: {'softmax'}
        Initialized: 1
    
      View summary with summary.
    
    
    info = struct with fields:
           PruningHistory: [15×3 table]
        ValidationHistory: [530×5 table]
          TrainingHistory: [5160×7 table]
             PrunedLayers: [5×1 string]
               StopReason: "Maximum compression reached."
          ProgressMonitor: [1×1 deep.TrainingProgressMonitor]
    
    

    The pruning progress plot shows that, in this example, the function performs 14 pruning iterations. During each iteration, the software removes 5% of the learnable parameters until it reaches the maximum possible compression of 79.632%. At the beginning of each pruning iteration, the loss spikes and the accuracy drops, but both recover during fine-tuning.

    Input Arguments

    Neural network, specified as a dlnetwork object.

    The Taylor pruning algorithm prunes convolutional filters from convolution1dLayer and convolution2dLayer objects. Pruning convolutional filters can also reduce the number of learnable parameters in downstream layers, such as batch normalization and fully connected layers.

    The compressNetworkUsingTaylorPruning function can compress layers contained inside a networkLayer object. The software compresses every supported layer inside the network layer.

    To determine whether your network supports Taylor pruning, create a compression analysis report by opening the network in the Deep Network Designer app and clicking Analyze for Compression.

    Training data containing predictors and targets, specified as a datastore, minibatchqueue object, or table.

    If you have the data used to train the input network, then you can use that data for pruning. You can also use a representative sample of the training data.

    If you do not have the data used to train the input network, then specify the data in a form that the trainnet function supports.

    To prune a network for transfer learning, use data that is representative of the inference distribution.

    Training predictors, specified as a dlarray object, categorical array, numeric array, cell array of dlarray objects, or cell array of numeric arrays.

    If you have the predictors used to train the input network, then you can use those predictors for pruning. You can also use a representative sample of the training predictors.

    If you do not have the predictors used to train the input network, then specify the predictors in a form that the trainnet function supports.

    To prune a network for transfer learning, use predictors that are representative of the inference distribution.

    Training targets, specified as a dlarray object, categorical array, numeric array, cell array of dlarray objects, or cell array of numeric arrays.

    If you have the targets used to train the input network, then you can use those targets for pruning. You can also use a representative sample of the training targets.

    If you do not have the targets used to train the input network, then specify the targets in a form that the trainnet function supports.

    To prune a network for transfer learning, use targets that are representative of the inference distribution.

    Loss function to use for fine-tuning, specified as one of these values:

    • "crossentropy" — Cross-entropy loss for classification tasks, normalized by dividing by the number of non-channel elements of the network output.

    • "index-crossentropy" — Index cross-entropy loss for classification tasks, normalized by dividing by the number of elements in the targets. Use this option to save memory when there are many categorical classes.

    • "binary-crossentropy" — Binary cross-entropy loss for binary and multilabel classification tasks, normalized by dividing by the number of elements of the network output.

    • "mae", "mean-absolute-error", or "l1loss" — Mean absolute error for regression tasks, normalized by dividing by the number of elements of the network output.

    • "mse", "mean-squared-error", or "l2loss" — Mean squared error for regression tasks, normalized by dividing by the number of elements of the network output.

    • "huber" — Huber loss for regression tasks, normalized by dividing by the number of elements of the network output.

    • Function handle with the syntax loss = f(Y1,...,Yn,T1,...,Tm), where Y1,...,Yn are dlarray objects that correspond to the n network predictions and T1,...,Tm are dlarray objects that correspond to the m targets.

    • AcceleratedFunction object — To create an accelerated loss function object, use the dlaccelerate function specifying a function handle as input. The function handle must have the syntax described in the previous bullet.

    • deep.DifferentiableFunction object — Function object with custom backward function.

    For best results, use the same loss function used to train the original network.
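    As a sketch of the function-handle form, a custom weighted mean squared error for a single-output network could be written like this. The weight value is illustrative, not part of the documented interface:

```matlab
% Hedged sketch: custom loss as a function handle. Y is the network
% prediction and T is the target; both are dlarray objects.
weight = 2;                                          % illustrative weight
lossFcn = @(Y,T) mean(weight .* (Y - T).^2, "all");

% Pass the handle in place of a built-in loss name (sketch):
% netPruned = compressNetworkUsingTaylorPruning(net,data,lossFcn,options);
```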

    Fine-tuning options, specified as a TrainingOptionsSGDM, TrainingOptionsRMSProp, TrainingOptionsADAM, TrainingOptionsLBFGS, or TrainingOptionsLM object returned by the trainingOptions function.

    The compressNetworkUsingTaylorPruning function overrides these options:

    • ResetInputNormalization: overridden to false.

    • Plots: overridden to "none". To add a pruning progress plot, specify the Plots name-value argument.

    • Verbose: overridden to false. To add verbose output, specify the VerbosityLevel name-value argument.

    Name-Value Arguments

    Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

    Example: netPruned = compressNetworkUsingTaylorPruning(net,data,lossFcn,options,LearnablesReductionGoal=0.3) tries to remove 30% of learnable parameters.

    Target proportion of the network learnable parameters to remove, specified as a scalar between 0 and 1.

    If LearnablesReductionGoal is greater than the maximum possible reduction in learnables, then the function removes the maximum possible proportion of learnables. To determine the maximum reduction in learnables, open the input network in the Deep Network Designer app, and then select Analyze for Compression.

    If LearnablesReductionGoal is smaller than the maximum possible reduction in learnables, then the function removes at least the proportion of learnables specified by LearnablesReductionGoal. The function removes learnables in sets of entire convolutional filters, so the final reduction of learnables can be greater than LearnablesReductionGoal.

    Data Types: single | double

    Validation metric threshold at which to stop pruning, specified as [] or as a numeric scalar.

    Specify this option to stop pruning early if the quality of the network predictions deteriorates too much.

    The software stops pruning if the validation metric exceeds the validation metric threshold at the end of fine-tuning. If pruning stops early, the compressNetworkUsingTaylorPruning function returns the last pruned network that does not exceed the validation metric threshold.

    By default, the validation metric is the loss. To specify a different validation metric:

    • Specify the Metrics property of the training options options.

    • Specify the ObjectiveMetricName property of the training options options.

    You can specify ValidationThreshold only if the training options options contains validation data. The value of the ValidationFrequency property of the training options must be small enough that validation can occur before the end of fine-tuning.

    For an example showing how to use a validation threshold to stop pruning early, see Prune Neural Network with Accuracy Requirement.

    Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

    Minimum proportion of learnable parameters to remove in each pruning iteration, specified as a scalar between 0 and 1.

    During each pruning iteration, the function removes the least important parameters before fine-tuning the network and reevaluating the importance scores. Removing a large proportion of learnable parameters per iteration results in faster pruning. Removing a small proportion of learnable parameters per iteration results in higher-quality network predictions.

    For example, if LearnablesReductionIncrement is 0.05, then removing the maximum possible number of learnable parameters from the network takes up to 1/0.05 = 20 pruning iterations. Each pruning iteration includes fine-tuning, so in this example, the pruning process can include up to 20*options.MaxEpochs training epochs.

    Data Types: single | double
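    The iteration arithmetic above can be sketched as follows; the MaxEpochs value is illustrative:

```matlab
% Sketch: upper bounds implied by LearnablesReductionIncrement.
increment = 0.05;                        % fraction removed per pruning iteration
maxPruningIters = ceil(1/increment);     % up to 20 pruning iterations
maxEpochs = 30;                          % stands in for options.MaxEpochs
totalEpochs = maxPruningIters*maxEpochs; % up to 600 fine-tuning epochs
```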

    Number of mini-batch updates to use for calculating the importance scores, specified as "auto" or as a positive integer.

    Taylor pruning relies on data to estimate the importance score of each learnable parameter. Typically, a subset of the dataset is enough. To specify the size of that subset, specify the number of mini-batch updates. For example, if you have 200 data points (that is, the batch dimension of your data is 200) and the MiniBatchSize property of the training options options is 25, then to use half of your dataset, specify NumImportanceScoreIterations as 4.

    For faster pruning, use fewer mini-batch updates. For more accurate importance scores, use more mini-batch updates.

    If NumImportanceScoreIterations is "auto", then the software uses 100 mini-batch updates or the entire dataset, whichever is smaller.

    If NumImportanceScoreIterations is greater than the number of mini-batches in the dataset, then the software uses the entire dataset once.

    If NumImportanceScoreIterations is smaller than the number of mini-batches in the dataset, then for best results, set the Shuffle property of the training options options to "every-epoch".

    Data Types: string | char | single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
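    The worked example in the text can be sketched as follows; the observation count and mini-batch size come from the paragraph above:

```matlab
% Sketch: choose NumImportanceScoreIterations so that about half of the
% dataset contributes to the importance scores.
numObservations = 200;                                % batch dimension of the data
miniBatchSize = 25;                                   % options.MiniBatchSize
numIters = floor((numObservations/2)/miniBatchSize);  % = 4
% netPruned = compressNetworkUsingTaylorPruning(net,data,lossFcn,options, ...
%     NumImportanceScoreIterations=numIters);
```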

    Names of layers to compress, specified as a string array, cell array of character vectors, or a character vector containing a single layer name.

    By default, the software compresses all the layers in the network that support Taylor pruning.

    If you specify the names of layers to compress, the software compresses those layers as well as any upstream or downstream layers that they affect.

    To compress a specific nested layer inside a network layer, specify the name of the network layer and the name of the nested layer separated by a forward slash "/". For example, the path to a layer named "nestedLayerName" in a network layer named "networkLayerName" is "networkLayerName/nestedLayerName". If there are multiple levels of nested layers, then specify the path using the form "networkLayerName1/.../networkLayerNameN/nestedLayerName".

    Data Types: string | cell
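    For illustration, layer names (including one nested inside a networkLayer) might be specified as follows; the names here are hypothetical:

```matlab
% Sketch: hypothetical layer names; use names from your own network (net.Layers).
layerNames = ["conv_1","encoderLayer/conv_2"];  % second entry is nested in a networkLayer
% netPruned = compressNetworkUsingTaylorPruning(net,data,lossFcn,options, ...
%     LayerNames=layerNames);
```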

    Path for saving the checkpoint neural networks, specified as a string scalar or character vector.

    The software saves checkpoint neural networks at the end of retraining every PruningCheckpointFrequency pruning iterations. Saving checkpoint neural networks lets you experiment with different levels of compression.

    • If you do not specify a path (that is, you use the default ''), then the software does not save any checkpoint neural networks.

    • If you specify a path, then the software saves each checkpoint neural network to a separate MAT-file on the path.

      The folder you specify must already exist. If the path does not exist, then the software throws an error, so create the folder before specifying it.

    Data Types: char | string

    Frequency of saving checkpoint neural networks, specified as a positive integer.

    The software saves checkpoint neural networks at the end of retraining every PruningCheckpointFrequency pruning iterations.

    This option only has an effect when PruningCheckpointPath is nonempty.

    Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
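    Because the function errors if the checkpoint path does not exist, create the folder first. A sketch, with an illustrative folder name and checkpoint frequency:

```matlab
% Sketch: create the checkpoint folder before pruning (folder name is illustrative).
checkpointPath = fullfile(pwd,"pruningCheckpoints");
if ~isfolder(checkpointPath)
    mkdir(checkpointPath)
end
% netPruned = compressNetworkUsingTaylorPruning(net,data,lossFcn,options, ...
%     PruningCheckpointPath=checkpointPath,PruningCheckpointFrequency=2);
```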

    Verbosity level, specified as one of these values:

    • "summary" — Display a summary of the pruning algorithm.

    • "pruning" — Display information about the pruning iterations.

    • "training" — Display information about the pruning and fine-tuning iterations. To change the number of fine-tuning iterations to display, set the VerboseFrequency property of the training options options.

    • "off" — Do not display information.

    Plots to display during pruning, specified as one of these values:

    • "none" — Do not display plots during pruning.

    • "pruning-progress" — Plot the number of learnable parameters and the fine-tuning progress during pruning.

    To programmatically open the pruning progress plot after training, set the Visible property of the training progress monitor object info.ProgressMonitor to true. To close the plot, set the Visible property to false.

    To switch the y-axis scale to logarithmic, use the axes toolbar.

    Output Arguments

    Pruned network, returned as a dlnetwork object.

    Pruning information, returned as a structure with these fields:

    • PruningHistory — Information about the pruning iterations

    • ValidationHistory — Information about the validation iterations

    • TrainingHistory — Information about the training iterations

    • PrunedLayers — Names of the pruned layers

    • StopReason — Reason why pruning stopped

    • ProgressMonitor — Pruning progress monitor, returned as a deep.TrainingProgressMonitor object

      To programmatically open and close the pruning progress plot after training, set the Visible property of the training progress monitor object info.ProgressMonitor to true or false, respectively.

    Algorithms

    Pruning a neural network means removing the least important parameters to reduce the size of the network while preserving the quality of its predictions.

    Simplified illustration of pruning. On the left is a neural network with three layers, which consist of four, three, and four neurons, respectively. All neurons are connected to all other neurons. An arrow points from this network to a similar neural network on the right, in which some of the connections between the neurons have been removed. One neuron is cut from the middle layer and two neurons are cut from the final layer.

    You can measure the importance of a set of parameters by the change in loss after you remove the parameters from the network. If the loss changes significantly, then the parameters are important. If the loss does not change significantly, then the parameters are not important and can be pruned.

    Neural networks typically contain too many parameters for you to calculate the change in loss for all possible combinations of parameters. In that case, follow these steps to apply an iterative workflow instead.

    1. Use an approximation to find and remove the least important parameter or a specified number of the least important parameters. For example, if you approximate the parameters to be independent, then you can measure the change in loss after removing each parameter by itself.

    2. Fine-tune the new, smaller network by retraining it for several iterations.

    3. Repeat steps 1 and 2 until you reach your compression goal.

    To perform the approximation in step 1, calculate the Taylor expansion of the loss as a function of the individual network parameters. This method is called Taylor pruning.

    For some types of layers, including convolutional layers, removing a parameter is equivalent to setting it to zero. In this case, the change in loss resulting from pruning a parameter θ can be expressed as

    |Δloss(X,θ)| = |loss(X,θ=0) − loss(X,θ)|,

    where X is the training data of your network.

    Calculate the Taylor expansion of the loss as a function of the parameter θ to first order using

    loss(X,θ) ≈ loss(X,θ=0) + (∂loss/∂θ)θ.

    Then, you can express the change of loss as a function of the gradient of the loss with respect to the parameter θ using

    |Δloss(X,θ)| = |(∂loss/∂θ)θ|.
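    As a numeric check of the first-order approximation, the following sketch uses a toy quadratic loss (not part of the toolbox) to compare the exact change in loss from zeroing a parameter with its Taylor estimate:

```matlab
% Compare the exact change in loss from zeroing a parameter with its
% first-order Taylor estimate, using a toy quadratic loss.
theta  = 0.3;                      % parameter under consideration
lossAt = @(t) (t - 1).^2;          % toy loss as a function of the parameter
gradAt = @(t) 2*(t - 1);           % its analytic gradient

exactChange  = abs(lossAt(0) - lossAt(theta));   % |loss(theta=0) - loss(theta)| = 0.51
taylorChange = abs(gradAt(theta)*theta);         % |(dloss/dtheta)*theta| = 0.42
% The gap (0.09 = theta^2) is the second-order term the approximation drops.
```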

    References

    [1] Molchanov, Pavlo, Stephen Tyree, Tero Karras, Timo Aila, and Jan Kautz. "Pruning Convolutional Neural Networks for Resource Efficient Inference." arXiv, June 8, 2017. https://arxiv.org/abs/1611.06440.

    Version History

    Introduced in R2026a