Quantization and Pruning

Compress a deep neural network by performing quantization or pruning

Use Deep Learning Toolbox™ together with the Deep Learning Toolbox Model Quantization Library support package to reduce the memory footprint and computational requirements of a deep neural network by:

  • Quantizing the weights, biases, and activations of layers to reduced-precision scaled integer data types. You can then generate C/C++, CUDA®, or HDL code from the quantized network.

  • Pruning filters from convolution layers by using first-order Taylor approximation. You can then generate C/C++ or CUDA code from the pruned network.
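The quantization path above follows the dlquantizer workflow. A minimal sketch, assuming a trained network `net` and a calibration datastore `calData` that you supply:

```matlab
% Create a quantizer object for the trained network (net and calData
% are placeholders for your own network and representative data).
quantObj = dlquantizer(net, ExecutionEnvironment="GPU");

% Exercise the network with representative data to collect the
% dynamic ranges of weights, biases, and activations.
calResults = calibrate(quantObj, calData);

% Produce a quantized network that uses 8-bit scaled integer types.
qNet = quantize(quantObj);

% Inspect which layers were quantized and the data types used.
qDetails = quantizationDetails(qNet);
```

After quantizing, you can validate accuracy with `validate` or pass the quantized network to the code generation products named above.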

Functions

dlquantizer - Quantize a deep neural network to 8-bit scaled integer data types
dlquantizationOptions - Options for quantizing a trained deep neural network
calibrate - Simulate and collect ranges of a deep neural network
validate - Quantize and validate a deep neural network
quantize - Create quantized deep neural network
estimateNetworkMetrics - Estimate metrics for specific layers of a neural network
quantizationDetails - Display the details for a quantized network
taylorPrunableNetwork - Network that can be pruned by using first-order Taylor approximation
forward - Compute deep learning network output for training
predict - Compute deep learning network output for inference
updatePrunables - Remove filters from prunable layers based on importance scores
updateScore - Compute and accumulate Taylor-based importance scores for pruning
dlnetwork - Deep learning network for custom training loops
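The pruning functions listed above are used together inside a custom training loop. A schematic fragment, assuming a trained dlnetwork `net` and a mini-batch `X` with targets `T`; in practice the loss and gradient computation must run inside a function evaluated with dlfeval:

```matlab
% Convert the trained network into a prunable network.
prunableNet = taylorPrunableNetwork(net);

% Inside the model loss function (traced with dlfeval):
% forward returns the activations of the prunable filters in
% addition to the network output and state.
[Y, state, pruningActivations] = forward(prunableNet, X);
loss = crossentropy(Y, T);
pruningGradients = dlgradient(loss, pruningActivations);

% Accumulate Taylor-based importance scores from the activations
% and their gradients, then remove the least important filters.
prunableNet = updateScore(prunableNet, pruningActivations, pruningGradients);
prunableNet = updatePrunables(prunableNet);
```

Repeating this score-then-prune cycle over several iterations removes filters gradually, which typically preserves accuracy better than pruning in a single step.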

App

Deep Network Quantizer - Quantize a deep neural network to 8-bit scaled integer data types

Topics

Deep Learning Quantization

Quantization for GPU Targets

Quantization for FPGA Targets

Quantization for CPU Targets

Pruning