Code Generation for Object Detection Using YOLO v4 Deep Learning

This example shows how to generate a CUDA® executable for a You Only Look Once v4 (YOLO v4) object detector. The generated code is plain CUDA code that does not contain dependencies on the NVIDIA cuDNN or TensorRT deep learning libraries. This example uses a lightweight version of the YOLO v4 network that has fewer network layers, a feature pyramid network as the neck, and two YOLO v4 detection heads. The network is trained on the COCO data set. For more information about the YOLO v4 object detection network, see Getting Started with YOLO v4 (Computer Vision Toolbox) and yolov4ObjectDetector (Computer Vision Toolbox).

Third-Party Prerequisites

  • CUDA®-enabled NVIDIA® GPU and compatible driver.

For non-MEX builds, such as static libraries, dynamic libraries, or executables, this example also requires the NVIDIA CUDA toolkit and environment variables for the compilers and libraries.

Verify GPU Environment

To verify that the compilers and libraries for this example are set up correctly, use the coder.checkGpuInstall function.

envCfg = coder.gpuEnvConfig('host');
envCfg.DeepLibTarget = 'none'; 
envCfg.DeepCodegen = 1;
envCfg.Quiet = 1;
coder.checkGpuInstall(envCfg);

Load Pretrained Network

This example uses a pretrained YOLO v4 object detection network trained on the COCO dataset. The object detector can detect and identify 80 different objects. To use this network, download and install Computer Vision Toolbox Model for YOLO v4 Object Detection from Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons.

Specify the name for the network and save the network to a MAT-file.

name = "tiny-yolov4-coco";
vehicleDetector =  yolov4ObjectDetector(name);
save('tinyyolov4coco.mat','vehicleDetector');
disp(vehicleDetector)
  yolov4ObjectDetector with properties:

             Network: [1×1 dlnetwork]
         AnchorBoxes: {2×1 cell}
          ClassNames: {80×1 cell}
           InputSize: [416 416 3]
    PredictedBoxType: 'axis-aligned'
           ModelName: 'tiny-yolov4-coco'
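Before generating code, you can optionally sanity-check the detector in MATLAB on a single image. This sketch uses peppers.png, an image shipped with MATLAB, purely for illustration; any RGB image works.

```matlab
% Optional sanity check: run the detector in MATLAB on one image.
I = imread('peppers.png');
I = imresize(I, [416 416]);   % match the network input size
[bboxes, scores, labels] = detect(vehicleDetector, I, Threshold=0.3);

% Annotate and display any detections.
detectedImg = insertObjectAnnotation(I, 'rectangle', bboxes, cellstr(labels));
figure, imshow(detectedImg)
```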

Download Test Traffic Video

To test the model, download the video file from the MathWorks® website. The file is approximately 40 MB in size.

if ~exist('./downtown_short.mp4', 'file')
	url = 'https://www.mathworks.com/supportfiles/gpucoder/media/downtown_short.mp4';
	websave('downtown_short.mp4', url);
end

The tinyyolov4cocoDetect Entry-Point Function

The tinyyolov4cocoDetect entry-point function runs the detector on the video file by using the deep learning network in the tinyyolov4coco.mat file. The function loads the network object from the tinyyolov4coco.mat file into a persistent variable yolov4Obj and reuses the persistent object during subsequent detection calls. Then it sets up the video file reader to read the input video and creates a video player to display the video and the output detections.

type('tinyyolov4cocoDetect.m')
function tinyyolov4cocoDetect()
%#codegen

%   Copyright 2022 The MathWorks, Inc.

persistent yolov4Obj;

if isempty(yolov4Obj)
    yolov4Obj = coder.loadDeepLearningNetwork('tinyyolov4coco.mat');
end

% Read the input video and create a video player
videoFile = 'downtown_short.mp4';

videoFreader = VideoReader(videoFile);
depVideoPlayer = vision.DeployableVideoPlayer();

cont = hasFrame(videoFreader);
while cont
    I = readFrame(videoFreader);
    in = imresize(I, [416,416]);
    % Call to detect method
    [bboxes, ~, labels] = detect(yolov4Obj, in, Threshold = 0.3);
    
    % Convert categorical labels to cell array of character vectors
    labels = cellstr(labels);
    
    % Annotate detections in the image.
    outImg = insertObjectAnnotation(in, 'rectangle', bboxes, labels);

    step(depVideoPlayer, outImg); % display video
    cont = hasFrame(videoFreader); 
%     pause(0.05); % adjust frame rate
end

Generate Executable

To generate the CUDA executable code for the entry-point function, create a GPU code configuration object. Use the coder.DeepLearningConfig function to create a deep learning configuration object and assign it to the DeepLearningConfig property of the GPU code configuration object. To generate plain CUDA code that has no dependencies on NVIDIA's deep learning libraries, specify TargetLibrary='none'. Set the configuration object property GenerateExampleMain to GenerateCodeAndCompile to automatically generate an example main and compile an executable. Then run the codegen command.

cfg = coder.gpuConfig('exe');
cfg.DeepLearningConfig = coder.DeepLearningConfig(TargetLibrary='none');
cfg.GenerateExampleMain = 'GenerateCodeAndCompile';

codegen -config cfg tinyyolov4cocoDetect -report
Code generation successful: View report

Execute Standalone Code

When you run the generated standalone executable, it displays the detection results frame-by-frame.
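For example, on a Linux host you can invoke the executable from the MATLAB working folder. The path below assumes the default codegen output layout; the downtown_short.mp4 and tinyyolov4coco.mat files must be reachable from the working directory.

```shell
# Run the generated standalone executable (default codegen output location).
./codegen/exe/tinyyolov4cocoDetect/tinyyolov4cocoDetect
```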

References

[1] Bochkovskiy, Alexey, Chien-Yao Wang, and Hong-Yuan Mark Liao. “YOLOv4: Optimal Speed and Accuracy of Object Detection.” arXiv, April 22, 2020. http://arxiv.org/abs/2004.10934.

[2] Redmon, Joseph, Santosh Divvala, Ross Girshick, and Ali Farhadi. “You Only Look Once: Unified, Real-Time Object Detection.” In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 779–88. Las Vegas, NV, USA: IEEE, 2016. https://doi.org/10.1109/CVPR.2016.91.

[3] Lin, Tsung-Yi, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. “Microsoft COCO: Common Objects in Context.” In Computer Vision – ECCV 2014, edited by David Fleet, Tomas Pajdla, Bernt Schiele, and Tinne Tuytelaars, 740–55. Lecture Notes in Computer Science. Cham: Springer International Publishing, 2014. https://doi.org/10.1007/978-3-319-10602-1_48.
