Deep Learning Toolbox Model Compression Library
Optimize deep learning models with efficient compression techniques
2.6K Downloads
Updated 11 Dec 2024
Deep Learning Toolbox Model Compression Library enables compression of your deep learning models with pruning, projection, and quantization to reduce their memory footprint and computational requirements.
Pruning and projection are structural compression techniques that reduce the size of deep neural networks by removing the learnable parameters and filters that have the least impact on inference accuracy.
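As an illustration, here is a minimal sketch of the projection workflow using the neuronPCA and compressNetworkUsingProjection functions. The tiny network and random calibration data are placeholders; substitute your own trained model and representative dataset, and note that the name-value option shown is one of several ways to control the compression level.

    % Placeholder network standing in for a trained dlnetwork.
    layers = [
        imageInputLayer([28 28 1],Normalization="none")
        convolution2dLayer(3,16,Padding="same")
        reluLayer
        fullyConnectedLayer(10)
        softmaxLayer];
    net = dlnetwork(layers);

    % Projection analyzes neuron activations on calibration data.
    XCal = rand(28,28,1,128,"single");                 % placeholder data
    mbq = minibatchqueue(arrayDatastore(XCal,IterationDimension=4), ...
        MiniBatchFormat="SSCB");
    npca = neuronPCA(net,mbq);                         % activation statistics
    netProjected = compressNetworkUsingProjection(net,npca, ...
        ExplainedVarianceGoal=0.95);                   % keep 95% of activation variance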
The library supports quantization of supported layers to 8-bit integers (INT8) for deployment to CPUs, FPGAs, and NVIDIA GPUs. It enables you to collect layer-level data on the weights, activations, and intermediate computations; using this data, it quantizes your model and provides metrics to validate the accuracy of the quantized network against the single-precision baseline. The iterative workflow allows you to refine your quantization strategy.
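A minimal sketch of that calibrate-quantize-validate loop with the dlquantizer object follows, reusing the placeholder net from the sketch above. The random datastores stand in for your calibration and validation data; the data format, validation metric, and any required hardware or support packages depend on your task and execution environment.

    XCal = rand(28,28,1,128,"single");          % placeholder calibration data
    XVal = rand(28,28,1,32,"single");           % placeholder validation data
    calData = arrayDatastore(XCal,IterationDimension=4);
    valData = arrayDatastore(XVal,IterationDimension=4);

    quantObj = dlquantizer(net,ExecutionEnvironment="GPU");  % or "CPU","FPGA"
    calibrate(quantObj,calData);                % collect ranges of weights/activations
    valResults = validate(quantObj,valData);    % compare against single-precision baseline
    qNet = quantize(quantObj);                  % quantized network for simulation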
As of R2024b, you can export quantized networks to Simulink deep learning layer blocks for simulation and deployment to embedded systems.
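As a sketch, assuming qNet is the quantized network produced by the workflow above, the export is a single call to exportNetworkToSimulink:

    % Generate a Simulink model composed of deep learning layer blocks
    % from the quantized network (requires R2024b or later).
    exportNetworkToSimulink(qNet);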
For details, see the documentation: https://www.mathworks.com/help/deeplearning/quantization.html
The quantization workflow prerequisites are also described in the documentation.
If you have download or installation problems, please contact Technical Support - www.mathworks.com/contact_ts
Additional Resources
- Learn more about MATLAB and Simulink for tinyML
- Quantization Aware Training (QAT) with MobileNet-v2 (Example, GitHub Repo)
- Overview Video - https://www.youtube.com/watch?v=jufOpBeSvHM
MATLAB Release Compatibility
Created with R2020a
Compatible with R2020a to R2025a
Platform Compatibility
Windows, macOS (Apple silicon), macOS (Intel), Linux
Categories
Find more on Deep Learning Toolbox in Help Center and MATLAB Answers