vision.CascadeObjectDetector

Detect objects using the Viola-Jones algorithm

Description

The cascade object detector uses the Viola-Jones algorithm to detect people’s faces, noses, eyes, mouths, or upper bodies. You can also use the Image Labeler to train a custom classifier to use with this System object. For details on how the detector works, see Get Started with Cascade Object Detector.

To detect facial features or upper body in an image:

  1. Create the vision.CascadeObjectDetector object and set its properties.

  2. Call the object with arguments, as if it were a function.

To learn more about how System objects work, see What Are System Objects?

Creation

Description

detector = vision.CascadeObjectDetector creates a detector to detect objects using the Viola-Jones algorithm.

detector = vision.CascadeObjectDetector(model) creates a detector configured to detect objects defined by the input character vector, model.

detector = vision.CascadeObjectDetector(XMLFILE) creates a detector and configures it to use the custom classification model specified with the XMLFILE input.

detector = vision.CascadeObjectDetector(Name,Value) sets properties using one or more name-value pairs. Enclose each property name in quotes. For example, detector = vision.CascadeObjectDetector('ClassificationModel','UpperBody') configures the detector to use the upper-body model.
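
The creation syntaxes above can be sketched as follows. This is a minimal illustration, not a recommended configuration: the MinSize and ScaleFactor values are arbitrary, and the XML file name in the commented line is hypothetical.

```matlab
% Default detector, which uses the 'FrontalFaceCART' model.
faceDetector = vision.CascadeObjectDetector;

% Select a built-in model by name.
profileDetector = vision.CascadeObjectDetector('ProfileFace');

% Set several properties at construction with name-value pairs.
% The numeric values are illustrative only.
bodyDetector = vision.CascadeObjectDetector( ...
    'ClassificationModel','UpperBody', ...
    'MinSize',[60 60], ...
    'ScaleFactor',1.05);

% Custom model from an XML file (hypothetical file name):
% customDetector = vision.CascadeObjectDetector('myCustomModel.xml');
```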

Properties

Unless otherwise indicated, properties are nontunable, which means you cannot change their values after calling the object. Objects lock when you call them, and the release function unlocks them.

If a property is tunable, you can change its value at any time.

For more information on changing property values, see System Design in MATLAB Using System Objects.
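
As a rough sketch of the locking behavior described above: the first call locks the detector, tunable properties can still change, and release unlocks it so that nontunable properties can change. The example assumes the visionteam.jpg image used elsewhere on this page is on the MATLAB path. The threshold value is arbitrary.

```matlab
detector = vision.CascadeObjectDetector;
I = imread('visionteam.jpg');

bboxes = detector(I);            % the first call locks the object
wasLocked = isLocked(detector);  % true while the object is locked

% MergeThreshold is described as tunable, so it can change while locked.
detector.MergeThreshold = 6;     % illustrative value

% ClassificationModel is nontunable, so unlock the object first.
release(detector);
detector.ClassificationModel = 'UpperBody';
bboxBody = detector(I);          % the object locks again on this call
```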

ClassificationModel: Trained cascade classification model, specified as a character vector. This property controls the type of object to detect. By default, the detector is configured to detect faces.

You can set this character vector to an XML file containing a custom classification model, or to one of the valid model character vectors listed below. You can train a custom classification model using the trainCascadeObjectDetector function. The function can train the model using Haar-like features, histograms of oriented gradients (HOG), or local binary patterns (LBP). For details on how to use the function, see Get Started with Cascade Object Detector.

Classification Model | Image Size Used to Train Model | Model Description
--- | --- | ---
'FrontalFaceCART' (Default) | [20 20] | Detects faces that are upright and forward facing. This model is composed of weak classifiers, based on the classification and regression tree analysis (CART). These classifiers use Haar features to encode facial features. CART-based classifiers provide the ability to model higher-order dependencies between facial features. [1]
'FrontalFaceLBP' | [24 24] | Detects faces that are upright and forward facing. This model is composed of weak classifiers, based on a decision stump. These classifiers use local binary patterns (LBP) to encode facial features. LBP features can provide robustness against variation in illumination. [2]
'UpperBody' | [18 22] | Detects the upper-body region, which is defined as the head and shoulders area. This model uses Haar features to encode the details of the head and shoulder region. Because it uses more features around the head, this model is more robust against pose changes, such as head rotations and tilts. [3]
'EyePairBig', 'EyePairSmall' | [11 45], [5 22] | Detects a pair of eyes. The 'EyePairSmall' model is trained using a smaller image. This enables the model to detect smaller eyes than the 'EyePairBig' model can detect. [4]
'LeftEye', 'RightEye' | [12 18] | Detects the left and right eye separately. These models are composed of weak classifiers, based on a decision stump. These classifiers use Haar features to encode details. [4]
'LeftEyeCART', 'RightEyeCART' | [20 20] | Detects the left and right eye separately. The weak classifiers that make up these models are CART-trees. Compared to decision stumps, CART-tree-based classifiers are better able to model higher-order dependencies. [5]
'ProfileFace' | [20 20] | Detects upright face profiles. This model is composed of weak classifiers, based on a decision stump. These classifiers use Haar features to encode face details.
'Mouth' | [15 25] | Detects the mouth. This model is composed of weak classifiers, based on a decision stump, which use Haar features to encode mouth details. [4]
'Nose' | [15 18] | This model is composed of weak classifiers, based on a decision stump, which use Haar features to encode nose details. [4]

MinSize: Size of smallest detectable object, specified as a two-element vector [height width]. Set this property, in pixels, to the minimum size of a region containing an object. The value must be greater than or equal to the image size used to train the model. Use this property to reduce computation time when you know the minimum object size before processing the image. When you do not specify a value for this property, the detector sets it to the size of the image used to train the classification model.

For details on how the size of the detectable object relates to the ScaleFactor property, see the Algorithms section.

Tunable: Yes

MaxSize: Size of largest detectable object, specified as a two-element vector [height width]. Specify the size, in pixels, of the largest object to detect. Use this property to reduce computation time when you know the maximum object size before processing the image. When you do not specify a value for this property, the detector sets it to size(I).

For details on how the size of the detectable object relates to the ScaleFactor property, see the Algorithms section.
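
A minimal sketch of restricting the search range with these two size properties. The size values are illustrative guesses for the visionteam.jpg image used on this page, not tuned settings. They were chosen only to satisfy the stated constraint that the minimum size is at least the [20 20] training size of the default model.

```matlab
I = imread('visionteam.jpg');
detector = vision.CascadeObjectDetector;   % default 'FrontalFaceCART', trained at [20 20]

% Restrict the search to objects between 40-by-40 and 150-by-150 pixels.
% Both values are illustrative assumptions.
detector.MinSize = [40 40];
detector.MaxSize = [150 150];

bboxes = detector(I);   % M-by-4 matrix of [x y width height] rows
```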

ScaleFactor: Scaling for multiscale object detection, specified as a value greater than 1.0001. The scale factor incrementally scales the detection resolution between MinSize and MaxSize. You can set the scale factor to an ideal value using:

size(I)/(size(I)-0.5)

The detector scales the search region at increments between MinSize and MaxSize using the following relationship:

search region = round((Training Size)*(ScaleFactor^N))

N is the current increment, an integer greater than zero, and Training Size is the image size used to train the classification model.

Tunable: Yes
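
The search-region relationship above can be worked through numerically. The sketch below only evaluates the formula. The way it filters sizes against the minimum and maximum bounds is an assumption made for illustration, not a description of the detector's internal logic. The scale factor and the upper bound are arbitrary values.

```matlab
trainingSize = [20 20];   % training size of the 'FrontalFaceCART' model
scaleFactor  = 1.1;       % illustrative value
minSize      = [20 20];   % assumed lower bound
maxSize      = [60 60];   % hypothetical upper bound

searchSizes = zeros(0,2);
N = 1;                    % current increment, an integer greater than zero
while true
    region = round(trainingSize * scaleFactor^N);
    if any(region > maxSize)
        break             % stop once the region exceeds the upper bound
    end
    if all(region >= minSize)
        searchSizes(end+1,:) = region; %#ok<AGROW>
    end
    N = N + 1;
end

disp(searchSizes)         % one [height width] row per increment
```

A scale factor closer to 1 produces more increments between the two bounds, so the search is finer but takes longer.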

MergeThreshold: Detection threshold, specified as an integer. The threshold defines the criteria needed to declare a final detection in an area where there are multiple detections around an object. Groups of colocated detections that meet the threshold are merged to produce one bounding box around the target object. Increasing this threshold may help suppress false detections by requiring that the target object be detected multiple times during the multiscale detection phase. When you set this property to 0, all detections are returned without thresholding or merging.

Tunable: Yes
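
One way to see the effect of this threshold is to compare the raw detections with the merged ones. This is a sketch under the assumption that visionteam.jpg is on the MATLAB path. The threshold value 8 is arbitrary.

```matlab
I = imread('visionteam.jpg');
detector = vision.CascadeObjectDetector;

% A threshold of 0 returns every detection without merging.
detector.MergeThreshold = 0;
rawBoxes = detector(I);

% The property is tunable, so it can change between calls.
detector.MergeThreshold = 8;    % illustrative stricter value
mergedBoxes = detector(I);

fprintf('Raw detections: %d, merged detections: %d\n', ...
    size(rawBoxes,1), size(mergedBoxes,1));
```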

UseROI: Use region of interest, specified as false or true. Set this property to true to detect objects within a rectangular region of interest in the input image.

Usage

Description

bbox = detector(I) returns an M-by-4 matrix, bbox, that defines M bounding boxes containing the detected objects. The detector performs multiscale object detection on the input image, I.

bbox = detector(I,roi) detects objects within the rectangular search region specified by roi. Set the 'UseROI' property to true to use this syntax. I is a grayscale or truecolor (RGB) image.

detectionResults = detector(ds) detects objects within all the images returned by the read function of the input datastore.

Input Arguments

I: Input image, specified as grayscale or truecolor (RGB).

ds: Datastore, specified as a datastore object containing a collection of images. Each image must be grayscale or RGB. The detector processes only the first column of the data that the datastore returns. When the read function of the datastore returns a cell array or a table with multiple columns, the first column must contain the image data.

model: Classification model, specified as a character vector. The model input describes the type of object to detect. There are several valid model character vectors, such as 'FrontalFaceCART', 'UpperBody', and 'ProfileFace'. See the ClassificationModel property description for a full list of available models.

XMLFILE: Custom classification model, specified as an XML file. You can create the XML file using the trainCascadeObjectDetector function or OpenCV (Open Source Computer Vision) training functionality. If the XML file is not on the MATLAB® path, you must specify a full or relative path to it.

roi: Rectangular region of interest within image I, specified as a four-element vector, [x y width height].
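
A short sketch of the region-of-interest syntax. The region used here, the top half of the image, is a hypothetical choice for illustration.

```matlab
I = imread('visionteam.jpg');

% The roi input is accepted only when UseROI is true.
detector = vision.CascadeObjectDetector('UseROI',true);

% Hypothetical region covering the top half of the image,
% in [x y width height] format.
[h,w,~] = size(I);
roi = [1 1 w floor(h/2)];

bboxes = detector(I,roi);   % M-by-4 matrix of detections
```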

Output Arguments

bbox: Detections, returned as an M-by-4 matrix. Each row of the output matrix contains a four-element vector, [x y width height], that specifies, in pixels, the upper-left corner and size of a bounding box.

detectionResults: Detection results, returned as a 3-column table with variable names Boxes, Scores, and Labels. The Boxes column contains M-by-4 matrices of M bounding boxes for the objects found in the image. Each row contains a bounding box as a 4-element vector in the format [x,y,width,height]. The format specifies the upper-left corner location and size, in pixels, of the bounding box in the corresponding image.
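
The datastore syntax might be used as sketched below. Reusing the same image twice is only for illustration. The sketch assumes that the results table has one row per image and that the Boxes variable is a cell array with one M-by-4 matrix per image. Check both assumptions against the table returned in your release.

```matlab
% Build a small image datastore. The same shipped image is listed twice
% purely for illustration.
imgFile = which('visionteam.jpg');
imds = imageDatastore({imgFile, imgFile});

detector = vision.CascadeObjectDetector;
results = detector(imds);

% Inspect the table variables: Boxes, Scores, and Labels.
disp(results.Properties.VariableNames)

% Assumption: Boxes is a cell array, one M-by-4 matrix per image.
firstBoxes = results.Boxes{1};
```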

Object Functions

To use an object function, specify the System object™ as the first input argument. For example, to release system resources of a System object named obj, use this syntax:

release(obj)

step | Run System object algorithm
release | Release resources and allow changes to System object property values and input characteristics
reset | Reset internal states of System object

Examples

Create a face detector object.

faceDetector = vision.CascadeObjectDetector;

Read the input image.

I = imread('visionteam.jpg');

Detect faces.

bboxes = faceDetector(I);

Annotate detected faces.

IFaces = insertObjectAnnotation(I,'rectangle',bboxes,'Face');   
figure
imshow(IFaces)
title('Detected faces');

[Figure: input image annotated with the detected faces, titled "Detected faces"]

Create a body detector object and set properties.

bodyDetector = vision.CascadeObjectDetector('UpperBody'); 
bodyDetector.MinSize = [60 60];
bodyDetector.MergeThreshold = 10;

Read input image and detect upper body.

I2 = imread('visionteam.jpg');
bboxBody = bodyDetector(I2);

Annotate detected upper bodies.

IBody = insertObjectAnnotation(I2,'rectangle',bboxBody,'Upper Body');
figure
imshow(IBody)
title('Detected upper bodies');

[Figure: input image annotated with the detected upper bodies, titled "Detected upper bodies"]

Algorithms

References

[1] Lienhart, R., A. Kuranov, and V. Pisarevsky. "Empirical Analysis of Detection Cascades of Boosted Classifiers for Rapid Object Detection." Proceedings of the 25th DAGM Symposium on Pattern Recognition. Magdeburg, Germany, 2003.

[2] Ojala, T., M. Pietikäinen, and T. Mäenpää. "Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns." IEEE Transactions on Pattern Analysis and Machine Intelligence. Volume 24, Issue 7, 2002, pp. 971–987.

[3] Kruppa, H., M. Castrillon-Santana, and B. Schiele. "Fast and Robust Face Finding via Local Context." Proceedings of the Joint IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, 2003, pp. 157–164.

[4] Castrillón, M., O. Déniz, C. Guerra, and M. Hernández. "ENCARA2: Real-time detection of multiple faces at different resolutions in video streams." Journal of Visual Communication and Image Representation. Volume 18, Issue 2, 2007, pp. 130–140.

[5] Yu, S. "Eye Detection." Shiqi Yu’s Homepage. http://yushiqi.cn/research/eyedetection.

[6] Viola, P., and M. J. Jones. "Rapid Object Detection using a Boosted Cascade of Simple Features." Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Volume 1, 2001, pp. 511–518.

[7] Dalal, N., and B. Triggs. "Histograms of Oriented Gradients for Human Detection." IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Volume 1, 2005, pp. 886–893.

[8] Ojala, T., M. Pietikäinen, and T. Mäenpää. "Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns." IEEE Transactions on Pattern Analysis and Machine Intelligence. Volume 24, No. 7, July 2002, pp. 971–987.

Extended Capabilities

Version History

Introduced in R2012a