Quantization, Projection, and Pruning

Compress a deep neural network by performing quantization, projection, or pruning

Use Deep Learning Toolbox™ together with the Deep Learning Toolbox Model Compression Library support package to reduce the memory footprint and computational requirements of a deep neural network by:

- Pruning filters from convolutional layers by using first-order Taylor approximation. You can then generate C/C++ or CUDA® code from this pruned network (see the pruning sketch below).
- Projecting layers by performing principal component analysis (PCA) on the layer activations using a data set representative of the training data, and then applying linear projections to the layer learnable parameters. Forward passes of a projected deep neural network are typically faster when you deploy the network to embedded hardware by using library-free C/C++ code generation (see the projection sketch below).
- Quantizing the weights, biases, and activations of layers to reduced-precision scaled integer data types. You can then generate C/C++, CUDA, or HDL code from this quantized network for GPU, FPGA, or CPU deployment (see the quantization sketch below).

You can also analyze networks for compression by using Deep Network Designer.

For a detailed overview of the compression techniques available in the Deep Learning Toolbox Model Compression Library, see Reduce Memory Footprint of Deep Neural Networks.
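Taylor pruning runs inside a custom training loop: you wrap the network, score filters from the gradients of the loss, and repeatedly remove the lowest-scoring filters. The sketch below is a minimal outline of that loop, assuming a trained dlnetwork named net and a minibatchqueue mbq that yields formatted dlarray inputs and one-hot targets; a complete workflow also interleaves fine-tuning, as in the pruning examples linked below.

```matlab
% Minimal Taylor pruning loop (assumptions: trained dlnetwork "net",
% minibatchqueue "mbq" yielding formatted inputs X and one-hot targets T).
prunableNet = taylorPrunableNetwork(net);   % wrap the network for pruning

while prunableNet.NumPrunables > 0 && hasdata(mbq)
    [X,T] = next(mbq);

    % Score filters using gradients of the loss with respect to the
    % pruning activations (first-order Taylor approximation).
    [~,pruningAct,pruningGrad] = dlfeval(@modelLoss,prunableNet,X,T);
    prunableNet = updateScore(prunableNet,pruningAct,pruningGrad);

    % Remove a few of the lowest-scoring filters per iteration.
    prunableNet = updatePrunables(prunableNet,MaxToPrune=8);
end

prunedNet = dlnetwork(prunableNet);         % convert back for fine-tuning

function [loss,pruningAct,pruningGrad] = modelLoss(pnet,X,T)
    [Y,~,pruningAct] = forward(pnet,X);
    loss = crossentropy(Y,T);
    pruningGrad = dlgradient(loss,pruningAct);
end
```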
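Projection, by contrast, is a single function call once representative data is available. A minimal sketch, assuming a trained dlnetwork net and a datastore ds of image-like inputs representative of the training data:

```matlab
% Minimal projection sketch (assumptions: trained dlnetwork "net" and
% a datastore "ds" of representative training inputs).
mbq = minibatchqueue(ds,MiniBatchSize=64,MiniBatchFormat="SSCB");

% PCA on the layer activations, then linear projection of the layer
% learnables. ExplainedVarianceGoal trades accuracy for compression.
netProjected = compressNetworkUsingProjection(net,mbq, ...
    ExplainedVarianceGoal=0.95);
```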
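Quantization follows a calibrate, validate, quantize pattern built around the dlquantizer object. A minimal sketch, assuming a trained network net and datastores calDS and valDS for calibration and validation:

```matlab
% Minimal int8 quantization sketch (assumptions: trained network "net",
% calibration and validation datastores "calDS" and "valDS").
quantObj = dlquantizer(net,ExecutionEnvironment="GPU");

% Exercise the network on calibration data to collect the dynamic
% ranges of the weights, biases, and activations.
calibrate(quantObj,calDS);

% Check the accuracy of the quantized network before deployment.
valResults = validate(quantObj,valDS);

% Create a quantized network object for simulation in MATLAB.
qNet = quantize(quantObj);
```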
Apps

| App | Description |
| --- | --- |
| Deep Network Quantizer | Quantize deep neural network to 8-bit scaled integer data types |
Topics
Overview
- Reduce Memory Footprint of Deep Neural Networks
Learn about neural network compression techniques, including pruning, projection, and quantization.
Pruning
- Analyze and Compress 1-D Convolutional Neural Network
Analyze a 1-D convolutional network for compression and compress it using Taylor pruning and projection. (Since R2024b)
- Parameter Pruning and Quantization of Image Classification Network
Use parameter pruning and quantization to reduce network size.
- Prune Image Classification Network Using Taylor Scores
Reduce the size of a deep neural network using Taylor pruning.
- Prune Filters in a Detection Network Using Taylor Scores
Reduce network size and increase inference speed by pruning convolutional filters in a you only look once (YOLO) v3 object detection network.
- Prune and Quantize Convolutional Neural Network for Speech Recognition
Compress a convolutional neural network (CNN) to prepare it for deployment on an embedded system.
Projection and Knowledge Distillation
- Compress Neural Network Using Projection
Compress a neural network using projection and principal component analysis (PCA).
- Evaluate Code Generation Inference Time of Compressed Deep Neural Network
Compare the inference time of a compressed deep neural network for battery state of charge estimation. (Since R2023b)
- Train Smaller Neural Network Using Knowledge Distillation
Reduce the memory footprint of a deep neural network using knowledge distillation. (Since R2023b) A sketch of a typical distillation loss follows this list.
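Knowledge distillation trains a smaller student network to match both the true labels and the temperature-softened outputs of a larger teacher. The loss below is one common formulation, written for formatted "CB" dlarray logits; the function name and weighting scheme are illustrative, not a toolbox API.

```matlab
% One common distillation loss (illustrative; not a toolbox API).
% studentLogits, teacherLogits: formatted "CB" dlarrays of class scores.
% T: one-hot targets. temperature > 1 softens the teacher distribution.
function loss = distillationLoss(studentLogits,teacherLogits,T, ...
        temperature,lambda)
    % Hard-label term: ordinary cross-entropy on the student predictions.
    hardLoss = crossentropy(softmax(studentLogits),T);

    % Soft-label term: match the temperature-softened teacher outputs.
    softStudent = softmax(studentLogits/temperature);
    softTeacher = softmax(teacherLogits/temperature);
    softLoss = crossentropy(softStudent,softTeacher);

    % Blend the terms; lambda in [0,1] weights the soft-label term.
    loss = lambda*softLoss + (1 - lambda)*hardLoss;
end
```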
Quantization
- Quantization of Deep Neural Networks
Learn about deep learning quantization tools and workflows.
- Data Types and Scaling for Quantization of Deep Neural Networks
Understand the effects of quantization and how to visualize the dynamic ranges of network convolution layers.
- Quantization Workflow System Requirements
See which products are required for the quantization of deep neural networks.
- Supported Layers for Quantization
Learn which deep neural network layers are supported for quantization.
- Prepare Data for Quantizing Networks
Learn about supported data formats for quantization workflows.
- Quantize Multiple-Input Network Using Image and Feature Data
Quantize a network with multiple inputs.
- Export Quantized Networks to Simulink and Generate Code
Export a quantized neural network to Simulink and generate code from the exported model.
Quantization for GPU Targets
- Generate INT8 Code for Deep Learning Networks (GPU Coder)
Quantize and generate code for a pretrained convolutional neural network. A minimal configuration sketch follows this list.
- Quantize Residual Network Trained for Image Classification and Generate CUDA Code
Quantize the learnable parameters in the convolution layers of a deep learning network that has residual connections and has been trained for image classification with CIFAR-10 data.
- Quantize Layers in Object Detectors and Generate CUDA Code
Generate CUDA® code for an SSD vehicle detector and a YOLO v2 vehicle detector that perform inference computations in 8-bit integers for the convolutional layers.
- Quantize Semantic Segmentation Network and Generate CUDA Code
Quantize a convolutional neural network trained for semantic segmentation and generate CUDA code.
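For cuDNN targets, int8 code generation consumes saved dlquantizer calibration results through the GPU Coder deep learning configuration. A minimal sketch, assuming calibration results saved to quantObj.mat and an entry-point function mynet_predict (both names are placeholders):

```matlab
% Minimal int8 CUDA code generation sketch (assumptions: calibrated
% dlquantizer results saved to "quantObj.mat" and an entry-point
% function "mynet_predict"; both names are placeholders).
cfg = coder.gpuConfig('mex');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
cfg.DeepLearningConfig.DataType = 'int8';
cfg.DeepLearningConfig.CalibrationResultFile = 'quantObj.mat';

% Generate a MEX function that runs convolution layers in int8.
codegen -config cfg mynet_predict -args {ones(224,224,3,'single')}
```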
Quantization for FPGA Targets
- Quantize Network for FPGA Deployment (Deep Learning HDL Toolbox)
Reduce the memory footprint of a deep neural network by quantizing the weights, biases, and activations of convolution layers to 8-bit scaled integer data types.
- Classify Images on FPGA Using Quantized Neural Network (Deep Learning HDL Toolbox)
Use Deep Learning HDL Toolbox™ to deploy a quantized deep convolutional neural network (CNN) to an FPGA. A minimal deployment sketch follows this list.
- Classify Images on FPGA by Using Quantized GoogLeNet Network (Deep Learning HDL Toolbox)
Use Deep Learning HDL Toolbox™ to deploy a quantized GoogLeNet network to classify an image.
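With Deep Learning HDL Toolbox, a quantized network is compiled and deployed through a dlhdl.Workflow object. A minimal sketch, assuming a quantized network qNet, a Xilinx ZCU102 board reachable over Ethernet, the shipped int8 bitstream, and an input image X (all assumptions):

```matlab
% Minimal FPGA deployment sketch (assumptions: quantized network "qNet",
% a Xilinx ZCU102 board reachable over Ethernet, the shipped int8
% bitstream, and an input image "X").
hTarget = dlhdl.Target('Xilinx',Interface='Ethernet');
hW = dlhdl.Workflow(Network=qNet,Bitstream='zcu102_int8',Target=hTarget);

compile(hW);                   % generate weight and instruction data
deploy(hW);                    % program the bitstream onto the FPGA
prediction = predict(hW,X);    % run inference on the board
```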
Quantization for CPU Targets
- Generate int8 Code for Deep Learning Networks (MATLAB Coder)
Quantize and generate code for a pretrained convolutional neural network. A minimal configuration sketch follows this list.
- Generate INT8 Code for Deep Learning Network on Raspberry Pi (MATLAB Coder)
Generate code for a deep learning network that performs inference computations in 8-bit integers.
- Compress Image Classification Network for Deployment to Resource-Constrained Embedded Devices
Reduce the memory footprint and computation requirements of an image classification network for deployment to resource-constrained embedded devices such as the Raspberry Pi®.
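For ARM CPU targets such as the Raspberry Pi, int8 code generation pairs saved dlquantizer calibration results with the ARM Compute Library configuration in MATLAB Coder. A minimal sketch; the file name, entry-point function, library version, and architecture values are all placeholders:

```matlab
% Minimal int8 ARM code generation sketch (assumptions: calibrated
% dlquantizer results saved to "quantObj.mat" and an entry-point
% function "mynet_predict"; version and architecture are placeholders).
cfg = coder.config('lib');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('arm-compute');
cfg.DeepLearningConfig.ArmComputeVersion = '20.02.1';
cfg.DeepLearningConfig.ArmArchitecture = 'armv7';
cfg.DeepLearningConfig.CalibrationResultFile = 'quantObj.mat';
cfg.Hardware = coder.hardware('Raspberry Pi');

% Generate a static library that runs convolution layers in int8.
codegen -config cfg mynet_predict -args {ones(224,224,3,'single')}
```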