Main Content

Quantization

Quantize network parameters to reduced-precision data types; prepare deep learning networks for fixed-point code generation

Quantize the weights, biases, and activations of layers to reduced-precision scaled integer data types. You can then generate C/C++, CUDA®, or HDL code from this quantized network for deployment to a GPU, FPGA, or CPU.
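The quantize-then-deploy workflow can be sketched with the functions listed on this page. This is a minimal illustration, not a complete deployment recipe; `net` and `calData` are hypothetical placeholders for a trained network and a calibration datastore.

```matlab
% Hedged sketch of the dlquantizer workflow; net and calData are
% placeholder names for a trained network and calibration data.
quantObj = dlquantizer(net, ExecutionEnvironment="GPU"); % target GPU deployment
calResults = calibrate(quantObj, calData);               % collect dynamic ranges of weights and activations
qNet = quantize(quantObj);                               % produce the quantized network
qDetails = quantizationDetails(qNet);                    % inspect which layers were quantized
```

After this step, the quantized network can be passed to the code generation tools for the chosen target (GPU, FPGA, or CPU).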

For a detailed overview of the compression techniques available in the Deep Learning Toolbox™ Model Compression Library, see Reduce Memory Footprint of Deep Neural Networks.

[Figure: Simplified illustration of quantization — a three-layer neural network shown before and after quantization, with dotted connections indicating weights stored at reduced precision.]

Functions

dlquantizer — Quantize a deep neural network to 8-bit scaled integer data types
dlquantizationOptions — Options for quantizing a trained deep neural network
prepareNetwork — Prepare a deep neural network for quantization (Since R2024b)
calibrate — Simulate and collect ranges of a deep neural network
quantize — Quantize a deep neural network (Since R2022a)
validate — Quantize and validate a deep neural network
quantizationDetails — Display quantization details for a neural network (Since R2022a)
estimateNetworkMetrics — Estimate network metrics for specific layers of a neural network (Since R2022a)
equalizeLayers — Equalize layer parameters of a deep neural network (Since R2022b)
exportNetworkToSimulink — Generate a Simulink model containing deep learning layer blocks and subsystems that correspond to deep learning layer objects (Since R2024b)

App

Deep Network Quantizer — Quantize a deep neural network to 8-bit scaled integer data types

Topics

Understanding Quantization

Predeployment Workflow

Deployment

Considerations

Featured Examples