Quantization, Projection, and Pruning

Compress deep neural networks by performing quantization, projection, or pruning

Use Deep Learning Toolbox™ together with the Deep Learning Toolbox Model Compression Library support package to reduce the memory footprint and computational requirements of a deep neural network by:

  • Pruning filters from convolution layers by using first-order Taylor approximation. You can then generate C/C++ or CUDA® code from this pruned network.

  • Projecting layers by first performing principal component analysis (PCA) of the layer activations using a data set representative of the training data, and then applying linear projections to the layer learnable parameters. Forward passes of a projected deep neural network are typically faster when you deploy the network to embedded hardware using library-free C/C++ code generation.

  • Quantizing the weights, biases, and activations of layers to reduced-precision scaled integer data types. You can then generate C/C++, CUDA, or HDL code from this quantized network for GPU, FPGA, or CPU deployment.

  • Analyzing your network for compression using Deep Network Designer.
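The Taylor pruning workflow iterates scoring and filter removal inside a custom training loop. The following is a minimal sketch, assuming `net` is a trained dlnetwork, `mbq` is a minibatchqueue over representative data, and `modelLossPruning` is a user-supplied loss function; the exact input and output arguments follow the taylorPrunableNetwork reference examples, so check the signatures for your release:

```matlab
% Convert the trained network into a form that supports Taylor pruning.
prunableNet = taylorPrunableNetwork(net);

while prunableNet.NumPrunables > 0
    shuffle(mbq);
    while hasdata(mbq)
        [X, T] = next(mbq);
        % modelLossPruning (user-supplied) calls forward and dlgradient to
        % obtain the pruning activations and their gradients.
        [~, pruningActivations, pruningGradients] = ...
            dlfeval(@modelLossPruning, prunableNet, X, T);
        % Accumulate first-order Taylor importance scores for each filter.
        prunableNet = updateScore(prunableNet, ...
            pruningActivations, pruningGradients);
    end
    % Remove the filters with the lowest importance scores.
    prunableNet = updatePrunables(prunableNet, MaxToPrune=8);
end

% Convert back to a regular dlnetwork for fine-tuning or code generation.
prunedNet = dlnetwork(prunableNet);
```

After each pruning step you would typically fine-tune the network for a few epochs before removing more filters; the sketch omits that retraining loop for brevity.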

For a detailed overview of the compression techniques available in the Deep Learning Toolbox Model Compression Library, see Reduce Memory Footprint of Deep Neural Networks.
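Projection is a two-step workflow: analyze the neuron activations with PCA, then compress the learnable parameters. A minimal sketch, assuming `net` is a trained dlnetwork and `mbq` is a minibatchqueue of data representative of the training set (the reduction goal value here is illustrative):

```matlab
% Perform principal component analysis of the neuron activations.
npca = neuronPCA(net, mbq);

% Apply linear projections to the learnable parameters of supported
% layers, targeting (for example) a 60% reduction in learnables.
netProjected = compressNetworkUsingProjection(net, npca, ...
    LearnablesReductionGoal=0.6);

% Optionally replace ProjectedLayer objects with their underlying
% implementation before library-free code generation.
netProjected = unpackProjectedLayers(netProjected);
```

A higher reduction goal gives a smaller, faster network at the cost of more accuracy loss, so in practice you sweep this value and fine-tune the projected network afterward.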

Functions

taylorPrunableNetwork - Neural network suitable for compression using Taylor pruning (Since R2022a)
forward - Compute deep learning network output for training
predict - Compute deep learning network output for inference
updatePrunables - Remove filters from prunable layers based on importance scores (Since R2022a)
updateScore - Compute and accumulate Taylor-based importance scores for pruning (Since R2022a)
dlnetwork - Deep learning neural network
compressNetworkUsingProjection - Compress neural network using projection (Since R2022b)
neuronPCA - Principal component analysis of neuron activations (Since R2022b)
unpackProjectedLayers - Unpack projected layers of neural network (Since R2023b)
ProjectedLayer - Compressed neural network layer using projection (Since R2023b)
gruProjectedLayer - Gated recurrent unit (GRU) projected layer for recurrent neural network (RNN) (Since R2023b)
lstmProjectedLayer - Long short-term memory (LSTM) projected layer for recurrent neural network (RNN) (Since R2022b)
dlquantizer - Quantize a deep neural network to 8-bit scaled integer data types
dlquantizationOptions - Options for quantizing a trained deep neural network
prepareNetwork - Prepare deep neural network for quantization (Since R2024b)
calibrate - Simulate and collect ranges of a deep neural network
quantize - Quantize deep neural network (Since R2022a)
validate - Quantize and validate a deep neural network
quantizationDetails - Display quantization details for a neural network (Since R2022a)
estimateNetworkMetrics - Estimate network metrics for specific layers of a neural network (Since R2022a)
equalizeLayers - Equalize layer parameters of deep neural network (Since R2022b)
exportNetworkToSimulink - Generate Simulink model that contains deep learning layer blocks and subsystems that correspond to deep learning layer objects (Since R2024b)
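The quantization functions above are typically combined in a calibrate-then-quantize workflow. A minimal sketch, assuming `net` is a trained dlnetwork and `calData` is a datastore of representative calibration data:

```matlab
% Create a quantizer object for the target execution environment.
quantObj = dlquantizer(net, ExecutionEnvironment="GPU");

% Exercise the network on representative data to collect the dynamic
% ranges of weights, biases, and activations.
calibrate(quantObj, calData);

% Produce a quantized network that uses 8-bit scaled integer types.
qNet = quantize(quantObj);

% Inspect which layers were quantized and the data types used.
qDetails = quantizationDetails(qNet);
```

Before deployment, you can pass validation data to validate to compare the accuracy of the quantized network against the original.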

App

Deep Network Quantizer - Quantize a deep neural network to 8-bit scaled integer data types

Topics

Overview

Pruning

Projection and Knowledge Distillation

Quantization

Quantization for GPU Targets

Quantization for FPGA Targets

Quantization for CPU Targets

Featured Examples