Get Started with Object Detection Using Deep Learning

Object detection using deep learning provides a fast and accurate means to predict the location of an object in an image. Deep learning is a powerful machine learning technique in which the object detector automatically learns image features required for detection tasks. Computer Vision Toolbox™ offers several techniques for object detection using deep learning, such as you only look once (YOLO) v2, YOLO v3, YOLO v4, YOLOX, RTMDet, and single shot detection (SSD).

Object detection enables you to localize and categorize objects within image data.

Applications that use object detection include:

  • Scene understanding

  • Multi-object tracking

  • Visual inspection

  • Self-driving vehicles

  • Surveillance

Computer Vision Toolbox and its support packages enable you to configure a pretrained object detector or design a custom object detection network, perform inference using a pretrained or trained network, and perform transfer learning on a custom data set.

You can also design a custom network layer-by-layer using the Deep Network Designer (Deep Learning Toolbox) app. For an example using the YOLO v2 object detection network, see Perform Transfer Learning Using Pretrained YOLO v2 Detector.

Detect Objects Using Pretrained Object Detection Network

Computer Vision Toolbox provides pretrained object detection models that you can use to perform out-of-the-box inference or transfer learning on a custom data set.

Configure Pretrained Model

To use a pretrained object detection model, you must first download and install its support package. You can download and install a pretrained model support package using the Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons.

This table lists the names of the object detector objects, the corresponding available pretrained models, and the names of the corresponding add-on support packages to download.

| Object Detection Model | Available Pretrained Models | Name of Support Package |
| --- | --- | --- |
| yolov2ObjectDetector | darknet19-coco, tiny-yolov2-coco | Computer Vision Toolbox Model for YOLO v2 Object Detection |
| yolov3ObjectDetector | darknet53-coco, tiny-yolov3-coco | Computer Vision Toolbox Model for YOLO v3 Object Detection |
| yolov4ObjectDetector | csp-darknet53-coco, tiny-yolov4-coco | Computer Vision Toolbox Model for YOLO v4 Object Detection |
| yoloxObjectDetector | nano-coco, tiny-coco, small-coco, medium-coco, large-coco | Automated Visual Inspection Library for Computer Vision Toolbox |
| rtmdetObjectDetector | tiny-network-coco, small-network-coco, medium-network-coco, large-network-coco | Computer Vision Toolbox Model for RTMDet Object Detection |
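
Before creating a detector object, it can help to confirm that the matching support package is installed. The following is a minimal sketch, assuming that the add-on name MATLAB reports matches the support package name in the table exactly. The variable names are illustrative.

```matlab
% Support package name copied from the table above. Assumption: this is
% exactly the name that MATLAB reports for the installed add-on.
pkgName = "Computer Vision Toolbox Model for YOLO v4 Object Detection";

% List the installed add-ons as a table and look for the package by name.
addons = matlab.addons.installedAddons;
isInstalled = any(strcmp(addons.Name,pkgName));

if isInstalled
    % Safe to load one of the pretrained models listed for this package.
    detector = yolov4ObjectDetector("tiny-yolov4-coco");
else
    disp("Install """ + pkgName + """ from the Add-On Explorer first.")
end
```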

Perform Inference Using Pretrained Model

Perform inference and detect objects in a test image using a pretrained detector model. For help selecting a pretrained object detection network for your application, see Choose an Object Detector. To return bounding boxes, confidence scores, and corresponding class labels, pass the pretrained detector object to the corresponding detect object function.

For example, to use the pretrained YOLO v4 tiny-yolov4-coco network listed in the Configure Pretrained Model section, load the model by creating a yolov4ObjectDetector object.

detector = yolov4ObjectDetector("tiny-yolov4-coco");

Detect objects in a test image, I, by using the detect object function of the yolov4ObjectDetector object.

I = imread("carsonroad.png");
[bboxes,scores,labels] = detect(detector,I);

Display the results overlaid on the input image by using the insertObjectAnnotation function.

detectedImg = insertObjectAnnotation(I,"Rectangle",bboxes,labels);
figure
imshow(detectedImg)

You can detect the objects in a test image, such as cars, using a pretrained network, such as the Tiny YOLO v4 COCO network.

To perform inference on a test image using a trained object detection network, use the same process but specify the trained network to the detect function as the detector argument.
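
As a sketch of the same process applied to several images at once, the detect function also accepts a datastore and returns the detections as a table. The folder name testImages, the threshold, and the mini-batch size are assumptions chosen for illustration. The detector variable can hold either a pretrained or a trained detector object.

```matlab
% Pretrained detector; substitute your own trained detector object here.
detector = yolov4ObjectDetector("tiny-yolov4-coco");

% "testImages" is a hypothetical folder of test images.
imds = imageDatastore("testImages");

% Run the detector on every image. Threshold discards detections whose
% confidence score is below the given value (0.5 is an illustrative choice).
results = detect(detector,imds,Threshold=0.5,MiniBatchSize=4);

% results is a table with one row per image and the variables
% Boxes, Scores, and Labels.
firstBoxes = results.Boxes{1};
firstLabels = results.Labels{1};

% Overlay the detections for the first image, if there are any.
I = readimage(imds,1);
if ~isempty(firstBoxes)
    I = insertObjectAnnotation(I,"Rectangle",firstBoxes,firstLabels);
end
figure
imshow(I)
```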

MathWorks GitHub Pretrained Networks

The MathWorks® GitHub repository provides implementations of the latest pretrained object detection deep learning networks to download and use to perform out-of-the-box inference. The pretrained object detection networks have already been trained on standard data sets, such as the COCO and Pascal VOC data sets. You can use these pretrained models directly to detect different objects in a test image.

For a list of all the latest MathWorks pretrained object detectors, see MATLAB Deep Learning (GitHub).

Train Object Detection Network and Perform Transfer Learning

To modify a network to detect additional classes, or to customize other network parameters, you can perform transfer learning. This section shows how to prepare your training data, configure the object detection network, and train the network to perform transfer learning.

Create Training Data

Use a labeling app to interactively label ground truth data in a video, image sequence, image collection, or custom data source. You can label object detection ground truth using rectangle labels, which define the position and size of the object in the image.

You can interactively label ground truth data in images using the Image Labeler app.
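
After labeling, you can convert the exported labels into datastores for training. This is a minimal sketch, assuming that gTruth is a groundTruth object exported from the Image Labeler app and saved in a hypothetical file named gTruth.mat.

```matlab
% Load the exported groundTruth object. The file name and the variable
% name are assumptions for this sketch.
data = load("gTruth.mat");
gTruth = data.gTruth;

% Create an image datastore and a box label datastore from the
% rectangle labels.
[imds,blds] = objectDetectorTrainingData(gTruth);

% Combine them so that each read returns {image, boxes, labels}.
trainingData = combine(imds,blds);

% Inspect one sample to verify that the boxes line up with the objects.
sample = read(trainingData);
annotated = insertShape(sample{1},"rectangle",sample{2});
figure
imshow(annotated)

% Rewind the datastore so that training starts from the first sample.
reset(trainingData)
```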

Augment and Preprocess Data

Use data augmentation to train the object detector on a limited data set. By altering the data set images in minor ways, such as translating, cropping, or transforming them, you can create additional distinct training samples, which makes the detector more robust. Use datastores to conveniently read and augment collections of data. Use imageDatastore and boxLabelDatastore to create datastores for images and labeled bounding box data, respectively.
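
The following sketch shows one way to build these datastores and apply a simple augmentation, a random horizontal flip. It assumes a hypothetical MAT-file, trainingTbl.mat, containing a table whose first variable holds image file names and whose remaining variables hold bounding boxes for each class. The helper function augmentSample is also illustrative and not a toolbox function.

```matlab
% Hypothetical table: the first variable holds image file names, and the
% remaining variables hold [x y width height] boxes for each class.
s = load("trainingTbl.mat");
trainingTbl = s.trainingTbl;

% Datastores for the images and the labeled bounding boxes.
imds = imageDatastore(trainingTbl{:,1});
blds = boxLabelDatastore(trainingTbl(:,2:end));

% Each read of the combined datastore returns {image, boxes, labels}.
trainingData = combine(imds,blds);

% Apply the augmentation every time a sample is read.
augmentedData = transform(trainingData,@augmentSample);

function data = augmentSample(data)
% Randomly flip the image horizontally and warp the boxes to match.
I = data{1};
bboxes = data{2};
labels = data{3};

tform = randomAffine2d(XReflection=true);
rout = affineOutputView(size(I),tform);
Iaug = imwarp(I,tform,OutputView=rout);
[bboxesAug,indices] = bboxwarp(bboxes,tform,rout,OverlapThreshold=0.25);

% Keep the original sample if the warp removed every box.
if ~isempty(indices)
    data = {Iaug,bboxesAug,labels(indices)};
end
end
```

In a script, the local function must appear at the end of the file in releases that do not allow local functions elsewhere.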

For more information about augmenting training data using datastores, see Datastores for Deep Learning (Deep Learning Toolbox) and Perform Additional Image Processing Operations Using Built-In Datastores (Deep Learning Toolbox).

Train Object Detector

To train the object detection network, use a training function that corresponds to your object detection model. For example, use the trainYOLOv4ObjectDetector function if you are using the yolov4ObjectDetector object to configure the detector.
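
For transfer learning, you typically create the detector with your own class names and anchor boxes before passing it to the training function. The sketch below assumes the same hypothetical trainingTbl.mat table layout (file names in the first variable, one box variable per class). The input size and the number of anchors are illustrative choices. For brevity, it estimates anchors from the unresized boxes, whereas in practice you would usually rescale the training data to the network input size first.

```matlab
% Hypothetical training table; see the assumptions in the text above.
s = load("trainingTbl.mat");
trainingTbl = s.trainingTbl;
blds = boxLabelDatastore(trainingTbl(:,2:end));

% Class names are taken from the box variables of the table.
classNames = trainingTbl.Properties.VariableNames(2:end);

% Illustrative network input size, [height width channels].
inputSize = [416 416 3];

% Estimate six anchor boxes from the labeled boxes.
numAnchors = 6;
anchors = estimateAnchorBoxes(blds,numAnchors);

% Sort the anchors by area and assign the larger ones to the first
% detection head and the smaller ones to the second.
area = anchors(:,1).*anchors(:,2);
[~,idx] = sort(area,"descend");
anchors = anchors(idx,:);
anchorBoxes = {anchors(1:3,:); anchors(4:6,:)};

% Configure the pretrained tiny YOLO v4 network for the new classes.
detector = yolov4ObjectDetector("tiny-yolov4-coco",classNames,anchorBoxes, ...
    InputSize=inputSize);
```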

Specify the network training options using the trainingOptions (Deep Learning Toolbox) function. You can determine suitable training option values using the Experiment Manager (Deep Learning Toolbox) app. For more information on using Experiment Manager for hyperparameter tuning, see Train Object Detectors in Experiment Manager.
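
As a minimal sketch of these two steps together, the code below sets a few common training options and then calls the YOLO v4 training function. It assumes that a configured yolov4ObjectDetector object named detector and a combined training datastore named trainingData already exist in the workspace. The option values are illustrative starting points, not recommendations.

```matlab
% Illustrative training options; tune these for your own data set.
options = trainingOptions("adam", ...
    InitialLearnRate=0.001, ...
    MiniBatchSize=4, ...
    MaxEpochs=30, ...
    Shuffle="every-epoch", ...
    VerboseFrequency=20, ...
    CheckpointPath=tempdir);

% Assumes detector (a yolov4ObjectDetector object) and trainingData
% (a datastore returning {image, boxes, labels}) are already defined.
[trainedDetector,info] = trainYOLOv4ObjectDetector(trainingData,detector,options);
```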

Evaluate and Fine-tune Object Detector Performance

To evaluate the detection results of a trained detector against the ground truth with a comprehensive set of metrics, use the evaluateObjectDetection function. The function returns the object detection metrics as an objectDetectionMetrics object. Use these objectDetectionMetrics object functions to evaluate metrics across all, or a selection of, classes and overlap thresholds.

| objectDetectionMetrics Object Function | Usage |
| --- | --- |
| averagePrecision | Compute average precision (AP) for all or selected classes and overlap (intersection-over-union) thresholds in your data set |
| precisionRecall | Compute precision, recall, and confidence scores for all classes in the data set, or for specified classes and overlap thresholds |
| confusionMatrix | Compute the confusion matrix and normalized confusion matrix at specified confidence score threshold or overlap threshold values |
| summarize | Compute the summary of the object detection metrics over the entire data set, or over each class |
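
The evaluation workflow can be sketched as follows. The code assumes that trainedDetector is a trained detector object and that testData is a held-out combined datastore that returns images, boxes, and labels. Both names are placeholders. The low detection threshold is an illustrative choice that keeps low-confidence detections so that the metrics cover the full range of scores.

```matlab
% Run the detector on the held-out test set. Assumes trainedDetector and
% testData already exist in the workspace.
detectionResults = detect(trainedDetector,testData,Threshold=0.01);

% Compare the detections against the ground truth.
metrics = evaluateObjectDetection(detectionResults,testData);

% Average precision for the classes and overlap thresholds in metrics.
ap = averagePrecision(metrics);

% Summaries over the entire data set and over each class.
[summaryDataset,summaryClass] = summarize(metrics);
disp(summaryDataset)
disp(summaryClass)
```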

For an example that shows how to use object detection metrics to evaluate and fine-tune an object detector, see the Multiclass Object Detection Using YOLO v2 Deep Learning example. This image shows a sample precision-recall (PR) plot, and the recall and precision plots as a function of confidence score, for selected classes in a data set.

This image shows the precision-recall plots for selected classes, at a single overlap threshold, which you can use to determine the optimal detection threshold.
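
The overlap threshold used by these metrics is an intersection-over-union (IoU) value between a detected box and a ground truth box. As a small worked sketch with two made-up boxes in [x y width height] format, the code below computes the IoU by hand and then calls bboxOverlapRatio for comparison. The 0.5 threshold is an illustrative value.

```matlab
% Two illustrative boxes in [x y width height] format.
boxA = [1 1 10 10];
boxB = [6 1 10 10];

% Width and height of the intersection rectangle (zero if no overlap).
interW = max(0, min(boxA(1)+boxA(3),boxB(1)+boxB(3)) - max(boxA(1),boxB(1)));
interH = max(0, min(boxA(2)+boxA(4),boxB(2)+boxB(4)) - max(boxA(2),boxB(2)));

interArea = interW*interH;                                  % 5*10 = 50
unionArea = prod(boxA(3:4)) + prod(boxB(3:4)) - interArea;  % 100+100-50 = 150
iouManual = interArea/unionArea;                            % 50/150 = 1/3

% Toolbox function for the same ratio, shown for comparison.
iouToolbox = bboxOverlapRatio(boxA,boxB);

% At an overlap threshold of 0.5, this pair does not count as a match.
isMatch = iouManual >= 0.5;
```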
