Choose an Object Detector

The Computer Vision Toolbox™ provides object detectors to use for detecting and classifying objects in an image or video. Train a detector using an object detector function, then use it with machine learning and deep learning to quickly and accurately predict the location of an object in an image.

When choosing a detector, consider whether you need these features:

Application and Performance

Single vs. multiple classes — Detecting multiple classes requires a variety of classifiers applied at multiple locations and scales in the image or video.
Runtime performance — Detectors differ in how long they take to detect objects in an image. A detector trained for a single class, or one trained to detect objects that are similar in pose and shape, runs faster than a deep learning detector trained on multiple objects. More generally, deep learning is slower because it requires more computation than machine learning or feature-based detection approaches.
Machine learning — Machine learning uses two types of techniques: supervised learning, which trains a model on known input and output data so that it can predict future outputs, and unsupervised learning, which finds hidden patterns or intrinsic structures in input data. For more details, see Machine Learning in MATLAB (Statistics and Machine Learning Toolbox).
Deep learning — Implement deep neural networks with algorithms, pretrained models, and apps. You can use convolutional neural networks to perform classification and regression on images. For more details, see Get Started with Object Detection Using Deep Learning.
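
Runtime differences between detectors are easiest to judge by measuring them on your own images. The following is a minimal sketch, assuming the pretrained `peopleDetectorACF` detector and the `visionteam1.jpg` sample image that ships with the toolbox. The measured time depends on your hardware and on the image size, so treat the result as a relative figure for comparing detectors rather than an absolute one.

```matlab
% Measure the typical time for one detection pass.
detector = peopleDetectorACF;       % pretrained ACF detector, runs on the CPU
I = imread("visionteam1.jpg");      % assumed sample image from the toolbox

% timeit calls the function several times and returns the median time in seconds.
secondsPerImage = timeit(@() detect(detector, I));
fprintf("ACF people detector: %.3f s per image\n", secondsPerImage);
```

To compare candidates, repeat the same measurement with each detector on the same image.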

Deployment

C/C++ code generation — SSD, YOLO, ACF, and System object-based detectors support MATLAB® Coder™ C and C++ code generation for a variety of hardware platforms, from desktop systems to embedded hardware. For more details, see MATLAB Coder.
GPU code generation — Deep learning-based detectors support generation of optimized CUDA® code with GPU Coder™ for embedded vision and autonomous systems. For more details, see GPU Coder.

To view and compare the object detector functionality, use the tables in the following sections:

Modern One-Stage Object Detectors

Detector	Multiple Classes Support	Deep Learning Support	Code Generation Support	GPU Support	Examples	Description
`rtmdetObjectDetector`	Yes	Yes	Yes	Yes	Detect Objects Using Pretrained RTMDet Object Detector	RTMDet is an anchor-free object detector that supports training at full resolution, single shot inference at full resolution, and training with tiled images. Requires GPU for optimal performance. Use this detector when you need to balance high performance and efficiency, particularly in scenarios that demand real-time processing with limited computational resources. Choose RTMDet over YOLO-based detectors when your data contains a diverse set of object classes to maintain high detection precision across various scales and orientations.
`yoloxObjectDetector` (Automated Visual Inspection Library for Computer Vision Toolbox)	Yes	Yes	Yes	Yes	Getting Started with YOLOX for Object Detection Detect Defects on Printed Circuit Boards Using YOLOX Network	YOLOX is an anchor-free object detector that supports training at full resolution, single shot inference at full resolution, and training with tiled images. Requires GPU for optimal performance. Use this detector when you need to have increased performance over YOLO v4 for small object detection, or to use arbitrary image size at both training and inference.
`yolov4ObjectDetector`	Yes	Yes	Yes	Yes	Object Detection Using YOLO v4 Deep Learning Object Detection in Large Satellite Imagery Using Deep Learning	YOLO v4 is a single stage object detector that is faster and more accurate than YOLO v3. The detector uses spatial pyramid pooling and path aggregation network for computing aggregated features and is capable of detecting small objects of different sizes. Requires GPU for optimal performance.
`yolov3ObjectDetector`	Yes	Yes	Yes	Yes	Object Detection Using YOLO v3 Deep Learning	YOLO v3 is a single stage network that uses multi-scale features to better handle detection of objects of various sizes. Consider using YOLO v4 for increased speed and accuracy. Requires GPU for optimal performance.
`yolov2ObjectDetector`	Yes	Yes	Yes	Yes	Object Detection Using YOLO v2 Deep Learning Multiclass Object Detection Using YOLO v2 Deep Learning	YOLO v2 uses a single stage network to perform object detection. Consider using SSD or YOLO v4 for better performance across various sizes. Requires GPU for optimal performance.
`ssdObjectDetector`	Yes	Yes	Yes	Yes	Object Detection Using SSD Deep Learning	Single shot detector (SSD) uses a single stage detection network to detect objects using multi-scale features. Requires GPU for optimal performance. Use this detector when you need to detect objects of various sizes and want better runtime performance than YOLO v2.
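
As an illustration of how the one-stage detectors in this table are used at inference time, here is a minimal sketch with a pretrained YOLO v4 network. It assumes that the Computer Vision Toolbox Model for YOLO v4 Object Detection support package is installed and that the `highway.png` sample image is on the MATLAB path. The network name and the threshold value are choices made for this example, not recommendations.

```matlab
% Assumption: the YOLO v4 support package is installed; it provides the
% pretrained "tiny-yolov4-coco" network used here.
detector = yolov4ObjectDetector("tiny-yolov4-coco");

I = imread("highway.png");          % assumed sample image from the toolbox

% detect returns one row per detection: an [x y width height] box,
% a confidence score, and a categorical class label.
[bboxes, scores, labels] = detect(detector, I, Threshold=0.5);
fprintf("%d objects detected\n", numel(scores));

% Draw the boxes only when something was found.
if ~isempty(bboxes)
    I = insertObjectAnnotation(I, "rectangle", bboxes, cellstr(labels));
end
imshow(I)
```

The other deep learning detectors in the table follow the same pattern: construct or load the detector object, then call `detect` on an image.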

Object Detectors for Rigid Object Detection

Detector	Multiple Classes Support	Deep Learning Support	Code Generation Support	GPU Support	Example	Description
`acfObjectDetector`	No	No	Yes	No	Train ACF-based Stop Sign Detector	A rigid object detector that is suited for single class object detection. Consider using a deep learning object detector if you need to detect multiple object classes or have objects that belong to the same class but are in different configurations or poses. Use this detector when the object you want to detect has similar pose and shape, and when runtime performance is critical. Better runtime performance than deep-learning-based detectors on CPU. ACF would not work well for detecting vehicles from various viewpoints, such as front, side, and rear.
`vision.CascadeObjectDetector`	No	No	Yes	No	Detect Faces in an Image Using the Frontal Face Classification Model	Viola-Jones object detector suitable for rigid object detection. Uses Haar, HOG, or LBP features. If training a new detector, consider starting with ACF for better performance. Use this detector when a pretrained detector is available for an object class you're interested in detecting, and there is little variation in the object's pose or shape.
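
The pretrained case described for `vision.CascadeObjectDetector` can be sketched as follows. This example assumes the default frontal face classification model and the `visionteam.jpg` sample image that ships with the toolbox. The label text "Face" is chosen for illustration.

```matlab
% Create a cascade detector; with no arguments it uses the
% pretrained frontal face classification model.
faceDetector = vision.CascadeObjectDetector();

I = imread("visionteam.jpg");       % assumed sample image from the toolbox

% Calling the System object runs detection and returns an
% M-by-4 matrix of [x y width height] bounding boxes.
bboxes = faceDetector(I);

% Draw the boxes only when something was found.
if ~isempty(bboxes)
    I = insertObjectAnnotation(I, "rectangle", bboxes, "Face");
end
figure
imshow(I)
```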

Object Detectors for Detecting Vehicles and People

Detector	Multiple Classes Support	Deep Learning Support	Code Generation Support	GPU Support	Example	Description
`vehicleDetectorACF` (Automated Driving Toolbox)	Pretrained	No	Yes	No	Track Multiple Vehicles Using a Camera (Automated Driving Toolbox)	Pretrained ACF detector
`vehicleDetectorFasterRCNN` (Automated Driving Toolbox)	Pretrained	Yes	No	Yes	Train a Deep Learning Vehicle Detector (Automated Driving Toolbox)	Pretrained Faster R-CNN detector
`vehicleDetectorYOLOv2` (Automated Driving Toolbox)	Pretrained	Yes	Yes	Yes	Detect Vehicles Using Monocular Camera and YOLO v2 (Automated Driving Toolbox)	Pretrained YOLO v2 detector
`peopleDetectorACF`	Pretrained	No	Yes	No	Tracking Pedestrians from a Moving Car	Use this pretrained detector to detect people in an upright position.
`vision.PeopleDetector`	Pretrained	No	Yes	No	Depth Estimation from Stereo Video	Use this pretrained cascade object detector to detect people in an upright position.
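
A pretrained detector from this table needs no training step before use. The sketch below assumes `peopleDetectorACF` with its default model and the `visionteam1.jpg` sample image from the toolbox. It labels each box with its detection score so that weak detections are easy to spot.

```matlab
% Load the pretrained ACF people detector (default model).
detector = peopleDetectorACF;

I = imread("visionteam1.jpg");      % assumed sample image from the toolbox

% detect returns [x y width height] boxes and one confidence score per box.
[bboxes, scores] = detect(detector, I);

% Annotate each detection with its score when any people were found.
if ~isempty(bboxes)
    I = insertObjectAnnotation(I, "rectangle", bboxes, scores);
end
figure
imshow(I)
title("Detected people and detection scores")
```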