Network Compression Applications

Compress a deep neural network by performing quantization, learnables compression, or pruning

Generate code for deep learning networks with reduced memory footprint and computational requirements.

Topics

Generate INT8 Code for Deep Learning Networks
Quantize and generate code for a pretrained convolutional neural network.

Featured Examples

Quantize Residual Network Trained for Image Classification and Generate CUDA Code

Quantize Residual Network Trained for Image Classification and Generate CUDA Code

Quantize learnable parameters in the convolution layers of a residual network and generate CUDA code.

Open Live Script

Quantize Layers in Object Detectors and Generate CUDA Code

Quantize Layers in Object Detectors and Generate CUDA Code

Generate CUDA^® code for SSD and YOLO v2 vehicle detectors that perform inference computations in 8-bit integers.

Open Live Script