Local features and their descriptors, which are a compact vector representations of a local neighborhood, are the building blocks of many computer vision algorithms. Their applications include image registration, object detection and classification, tracking, and motion estimation. Using local features enables these algorithms to better handle scale changes, rotation, and occlusion. The Computer Vision Toolbox™ provides the FAST, Harris, ORB, and Shi & Tomasi methods for detecting corner features, and the SIFT, SURF, KAZE, and MSER methods for detecting blob features. The toolbox includes the SIFT, SURF, KAZE, FREAK, BRISK, ORB, and HOG descriptors. You can mix and match the detectors and the descriptors depending on the requirements of your application.
Local features refer to a pattern or distinct structure found in an image, such as a point, edge, or small image patch. They are usually associated with an image patch that differs from its immediate surroundings by texture, color, or intensity. What the feature actually represents does not matter, just that it is distinct from its surroundings. Examples of local features are blobs, corners, and edge pixels.
I = imread('circuit.tif'); corners = detectFASTFeatures(I,'MinContrast',0.1); J = insertMarker(I,corners,'circle'); imshow(J)
Local features let you find image correspondences regardless of occlusion, changes in viewing conditions, or the presence of clutter. In addition, the properties of local features make them suitable for image classification, such as in Image Classification with Bag of Visual Words.
Local features are used in two fundamental ways:
To localize anchor points for use in image stitching or 3-D reconstruction.
To represent image contents compactly for detection or classification, without requiring image segmentation.
|Image registration and stitching||Feature Based Panoramic Image Stitching|
|Object detection||Object Detection in a Cluttered Scene Using Point Feature Matching|
|Object recognition||Digit Classification Using HOG Features|
|Object tracking||Face Detection and Tracking Using the KLT Algorithm|
|Image category recognition||Image Category Classification Using Bag of Features|
|Finding geometry of a stereo system||Uncalibrated Stereo Image Rectification|
|3-D reconstruction||Structure From Motion From Two Views, Structure From Motion From Multiple Views|
|Image retrieval||Image Retrieval Using Customized Bag of Features|
Detectors that rely on gradient-based and intensity variation approaches detect good local features. These features include edges, blobs, and regions. Good local features exhibit the following properties:
When given two images of the same scene, most features that the detector finds in both images are the same. The features are robust to changes in viewing conditions and noise.
The neighborhood around the feature center varies enough to allow for a reliable comparison between the features.
The feature has a unique location assigned to it. Changes in viewing conditions do not affect its location.
Feature detection selects regions of an image that have unique content, such as corners or blobs. Use feature detection to find points of interest that you can use for further processing. These points do not necessarily correspond to physical structures, such as the corners of a table. The key to feature detection is to find features that remain locally invariant so that you can detect them even in the presence of rotation or scale change.
Feature extraction involves computing a descriptor, which is typically done on regions centered around detected features. Descriptors rely on image processing to transform a local pixel neighborhood into a compact vector representation. This new representation permits comparison between neighborhoods regardless of changes in scale or orientation. Descriptors, such as SIFT or SURF, rely on local gradient computations. Binary descriptors, such as BRISK, ORB or FREAK, rely on pairs of local intensity differences, which are then encoded into a binary vector.
Select the best feature detector and descriptor by considering the criteria of your application and the nature of your data. The first table helps you understand the general criteria to drive your selection. The next two tables provide details on the detectors and descriptors available in Computer Vision Toolbox.
Considerations for Selecting a Detector and Descriptor
Type of features in your image
Use a detector appropriate for your data. For example, if your image contains an image of bacteria cells, use the blob detector rather than the corner detector. If your image is an aerial view of a city, you can use the corner detector to find man-made structures.
Context in which you are using the features:
The HOG, SURF, and KAZE descriptors are suitable for classification tasks. In contrast, binary descriptors, such as ORB, BRISK and FREAK, are typically used for finding point correspondences between images, which are used for registration.
Type of distortion present in your image
Choose a detector and descriptor that addresses the distortion in your data. For example, if there is no scale change present, consider a corner detector that does not handle scale. If your data contains a higher level of distortion, such as scale and rotation, then use SIFT, SURF, ORB, or KAZE feature detector and descriptor. The SURF and the KAZE methods are computationally intensive.
Binary descriptors are generally faster but less accurate than gradient-based descriptors. For greater accuracy, use several detectors and descriptors at the same time.
Choose a Detection Function Based on Feature Type
|Detector||Feature Type||Function||Scale Independent|
|Minimum eigenvalue algorithm ||Corner||No|
|Corner detector ||Corner||No|
|MSER ||Region with uniform intensity||Yes|
Choose a Descriptor Method
|Descriptor||Binary||Function and Method||Invariance||Typical Use|
|Scale||Rotation||Finding Point Correspondences||Classification|
extractFeatures function provides different extraction
methods to best match the requirements of your application. When you do not specify
'Method' input for the
function, the function automatically selects the method based on the type of input
Binary descriptors are fast but less precise in terms of localization. They are
not suitable for classification tasks. The
function returns a
binaryFeatures object. This object
enables the Hamming-distance-based matching metric used in the
Registering two images is a simple way to understand local features. This example finds a geometric transformation between two images. It uses local features to find well-localized anchor points.
Display two images
The first image is the original image.
original = imread('cameraman.tif'); figure; imshow(original);
The second image is the original image rotated and scaled.
scale = 1.3; J = imresize(original,scale); theta = 31; distorted = imrotate(J,theta); figure imshow(distorted)
Detect matching features between the original and distorted image
Detecting the matching SURF features is the first step in determining the transform needed to correct the distorted image.
ptsOriginal = detectSURFFeatures(original); ptsDistorted = detectSURFFeatures(distorted);
Extract features and compare the detected blobs between the two images
The detection step found several roughly corresponding blob structures in both images. Compare the detected blob features. This process is facilitated by feature extraction, which determines a local patch descriptor.
[featuresOriginal,validPtsOriginal] = ... extractFeatures(original,ptsOriginal); [featuresDistorted,validPtsDistorted] = ... extractFeatures(distorted,ptsDistorted);
It is possible that not all of the original points were used to extract descriptors. Points might have been rejected if they were too close to the image border. Therefore, the valid points are returned in addition to the feature descriptors.
The patch size used to compute the descriptors is determined during the feature extraction step. The patch size corresponds to the scale at which the feature is detected. Regardless of the patch size, the two feature vectors,
featuresDistorted, are computed in such a way that they are of equal length. The descriptors enable you to compare detected features, regardless of their size and rotation.
Find candidate matches
Obtain candidate matches between the features by inputting the descriptors to the
matchFeatures function. Candidate matches imply that the results can contain some invalid matches. Two patches that match can indicate like features but might not be a correct match. A table corner can look like a chair corner, but the two features are obviously not a match.
indexPairs = matchFeatures(featuresOriginal,featuresDistorted);
Find point locations from both images
Each row of the returned
indexPairs contains two indices of candidate feature matches between the images. Use the indices to collect the actual point locations from both images.
matchedOriginal = validPtsOriginal(indexPairs(:,1)); matchedDistorted = validPtsDistorted(indexPairs(:,2));
Display the candidate matches
figure showMatchedFeatures(original,distorted,matchedOriginal,matchedDistorted) title('Candidate matched points (including outliers)')
Analyze the feature locations
If there are a sufficient number of valid matches, remove the false matches. An effective technique for this scenario is the RANSAC algorithm. The
estimateGeometricTransform2D function implements M-estimator sample consensus (MSAC), which is a variant of the RANSAC algorithm. MSAC finds a geometric transform and separates the inliers (correct matches) from the outliers (spurious matches).
[tform, inlierIdx] = estimateGeometricTransform2D( ... matchedDistorted, matchedOriginal,'similarity'); inlierDistorted = matchedDistorted(inlierIdx, :); inlierOriginal = matchedOriginal(inlierIdx, :);
Display the matching points
figure showMatchedFeatures(original,distorted,inlierOriginal,inlierDistorted) title('Matching points (inliers only)') legend('ptsOriginal','ptsDistorted')
Verify the computed geometric transform
Apply the computed geometric transform to the distorted image.
outputView = imref2d(size(original)); recovered = imwarp(distorted,tform,'OutputView',outputView);
Display the recovered image and the original image.
This example builds on the results of the "Use Local Features" example. Using more than one detector and descriptor pair enables you to combine and reinforce your results. Multiple pairs are also useful for when you cannot obtain enough good matches (inliers) using a single feature detector.
Load the original image.
original = imread('cameraman.tif'); figure; imshow(original); text(size(original,2),size(original,1)+15, ... 'Image courtesy of Massachusetts Institute of Technology', ... 'FontSize',7,'HorizontalAlignment','right');
Scale and rotate the original image to create the distorted image.
scale = 1.3; J = imresize(original, scale); theta = 31; distorted = imrotate(J,theta); figure imshow(distorted)
Detect the features in both images. Use the BRISK detectors first, followed by the SURF detectors.
ptsOriginalBRISK = detectBRISKFeatures(original,'MinContrast',0.01); ptsDistortedBRISK = detectBRISKFeatures(distorted,'MinContrast',0.01); ptsOriginalSURF = detectSURFFeatures(original); ptsDistortedSURF = detectSURFFeatures(distorted);
Extract descriptors from the original and distorted images. The BRISK features use the FREAK descriptor by default.
[featuresOriginalFREAK,validPtsOriginalBRISK] = ... extractFeatures(original,ptsOriginalBRISK); [featuresDistortedFREAK,validPtsDistortedBRISK] = ... extractFeatures(distorted,ptsDistortedBRISK); [featuresOriginalSURF,validPtsOriginalSURF] = ... extractFeatures(original,ptsOriginalSURF); [featuresDistortedSURF,validPtsDistortedSURF] = ... extractFeatures(distorted,ptsDistortedSURF);
Determine candidate matches by matching FREAK descriptors first, and then SURF descriptors. To obtain as many feature matches as possible, start with detector and matching thresholds that are lower than the default values. Once you get a working solution, you can gradually increase the thresholds to reduce the computational load required to extract and match features.
indexPairsBRISK = matchFeatures(featuresOriginalFREAK,... featuresDistortedFREAK,'MatchThreshold',40,'MaxRatio',0.8); indexPairsSURF = matchFeatures(featuresOriginalSURF,featuresDistortedSURF);
Obtain candidate matched points for BRISK and SURF.
matchedOriginalBRISK = validPtsOriginalBRISK(indexPairsBRISK(:,1)); matchedDistortedBRISK = validPtsDistortedBRISK(indexPairsBRISK(:,2)); matchedOriginalSURF = validPtsOriginalSURF(indexPairsSURF(:,1)); matchedDistortedSURF = validPtsDistortedSURF(indexPairsSURF(:,2));
Visualize the BRISK putative matches.
figure showMatchedFeatures(original,distorted,matchedOriginalBRISK,... matchedDistortedBRISK) title('Putative matches using BRISK & FREAK') legend('ptsOriginalBRISK','ptsDistortedBRISK')
Combine the candidate matched BRISK and SURF local features. Use the
Location property to combine the point locations from BRISK and SURF features.
matchedOriginalXY = ... [matchedOriginalSURF.Location; matchedOriginalBRISK.Location]; matchedDistortedXY = ... [matchedDistortedSURF.Location; matchedDistortedBRISK.Location];
Determine the inlier points and the geometric transform of the BRISK and SURF features.
[tformTotal,inlierIdx] = ... estimateGeometricTransform2D(matchedDistortedXY,... matchedOriginalXY,'similarity'); inlierDistortedXY = matchedDistortedXY(inlierIdx, :); inlierOriginalXY = matchedOriginalXY(inlierIdx, :);
Display the results. The result provides several more matches than the example that used a single feature detector.
figure showMatchedFeatures(original,distorted,inlierOriginalXY,inlierDistortedXY) title('Matching points using SURF and BRISK (inliers only)') legend('ptsOriginal','ptsDistorted')
Compare the original and recovered image.
outputView = imref2d(size(original)); recovered = imwarp(distorted,tformTotal,'OutputView',outputView); figure; imshowpair(original,recovered,'montage')
 Rosten, E., and T. Drummond. “Machine Learning for High-Speed Corner Detection.” 9th European Conference on Computer Vision. Vol. 1, 2006, pp. 430–443.
 Mikolajczyk, K., and C. Schmid. “A performance evaluation of local descriptors.” IEEE Transactions on Pattern Analysis and Machine Intelligence. Vol. 27, Issue 10, 2005, pp. 1615–1630.
 Harris, C., and M. J. Stephens. “A Combined Corner and Edge Detector.” Proceedings of the 4th Alvey Vision Conference. August 1988, pp. 147–152.
 Shi, J., and C. Tomasi. “Good Features to Track.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. June 1994, pp. 593–600.
 Tuytelaars, T., and K. Mikolajczyk. “Local Invariant Feature Detectors: A Survey.” Foundations and Trends in Computer Graphics and Vision. Vol. 3, Issue 3, 2007, pp. 177–280.
 Leutenegger, S., M. Chli, and R. Siegwart. “BRISK: Binary Robust Invariant Scalable Keypoints.” Proceedings of the IEEE International Conference. ICCV, 2011.
 Nister, D., and H. Stewenius. "Linear Time Maximally Stable Extremal Regions." 10th European Conference on Computer Vision. Marseille, France: 2008, No. 5303, pp. 183–196.
 Matas, J., O. Chum, M. Urba, and T. Pajdla. "Robust wide-baseline stereo from maximally stable extremal regions."Proceedings of British Machine Vision Conference. 2002, pp. 384–396.
 Obdrzalek D., S. Basovnik, L. Mach, and A. Mikulik. "Detecting Scene Elements Using Maximally Stable Colour Regions."Communications in Computer and Information Science. La Ferte-Bernard, France: 2009, Vol. 82 CCIS (2010 12 01), pp. 107–115.
 Mikolajczyk, K., T. Tuytelaars, C. Schmid, A. Zisserman, T. Kadir, and L. Van Gool. "A Comparison of Affine Region Detectors. "International Journal of Computer Vision. Vol. 65, No. 1–2, November 2005, pp. 43–72 .
 Bay, H., A. Ess, T. Tuytelaars, and L. Van Gool. “SURF: Speeded Up Robust Features.” Computer Vision and Image Understanding (CVIU). Vol. 110, No. 3, 2008, pp. 346–359.
 Alcantarilla, P.F., A. Bartoli, and A.J. Davison. "KAZE Features", ECCV 2012, Part VI, LNCS 7577 pp. 214, 2012
 Rublee, E., V. Rabaud, K. Konolige and G. Bradski. "ORB: An efficient alternative to SIFT or SURF." In Proceedings of the 2011 International Conference on Computer Vision, 2564–2571. Barcelona, Spain, 2011.
 Lowe, David G.. "Distinctive Image Features from Scale-Invariant Keypoints." Int. J. Comput. Vision 60 , no. 2 (2004): 91--110.