Image Retrieval with Bag of Visual Words
You can use Computer Vision Toolbox™ functions to build a search-by-image system, also known as a content-based image retrieval (CBIR) system. A CBIR system searches a collection of images and retrieves the ones that are visually similar to a query image. Such systems are used in many areas, such as web-based product search, surveillance, and visual place identification.
The retrieval system uses a bag of visual words, a collection of image descriptors, to represent your data set of images. Images are indexed to create a mapping of visual words: the index maps each visual word to its occurrences in the image set. Comparing the query image against the index yields the images most similar to the query image. By using the CBIR system workflow, you can also evaluate the accuracy of the system against a known set of image search results.
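To make the idea of a word-to-image index concrete, the following toy sketch builds a small occurrence table by hand and ranks images against a query. The word IDs, the vocabulary size, and the cosine-similarity scoring are illustrative assumptions only; the toolbox functions may weight and score words differently.

```matlab
% Toy data (hypothetical): visual-word IDs detected in each of three images.
imageWords = {[1 3 3 7], [2 3 5], [1 7 7 9]};
numWords = 10;                      % assumed vocabulary size

% Build a words-by-images table: entry (w,k) counts word w in image k.
occurrence = zeros(numWords, numel(imageWords));
for k = 1:numel(imageWords)
    occurrence(:,k) = accumarray(imageWords{k}(:), 1, [numWords 1]);
end

% Inverted lookup: which images contain visual word 7?
imagesWithWord7 = find(occurrence(7,:) > 0);    % returns [1 3]

% Rank images against a query histogram using cosine similarity.
queryHist = accumarray([1 7 7]', 1, [numWords 1]);
scores = (queryHist' * occurrence) ./ (norm(queryHist) * vecnorm(occurrence));
[~, ranked] = sort(scores, 'descend');          % returns [3 1 2]
```

Image 3 ranks first because it shares both query words, with word 7 occurring twice in each, while image 2 shares none.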
Retrieval System Workflow
1. Create an image set that represents image features for retrieval. Use imageDatastore to store the image data. Use a large number of images that represent various viewpoints of the object. A large and diverse set of images helps train the bag of visual words and increases the accuracy of the image search.
2. Choose the type of feature. The indexImages function creates the bag of visual words using speeded-up robust features (SURF). For other types of features, you can use a custom extractor, and then use bagOfFeatures to create the bag of visual words. See the Create Search Index Using Custom Bag of Features example.
   You can use the original imgSet or a different collection of images for the training set. To use a different collection, create the bag of visual words with the bagOfFeatures function before creating the image index. The advantage of using the same set of images is that the visual vocabulary is tailored to the search set. The disadvantage is that the retrieval system must relearn the visual vocabulary before it can be used on a drastically different set of images. With an independent set, the visual vocabulary is better able to handle the addition of new images to the search index.
3. Index the images. The indexImages function creates a search index that maps visual words to their occurrences in the image collection. When you create the bag of visual words using an independent or subset collection, include the bag as an input argument to indexImages. If you do not create an independent bag of visual words, then the function creates the bag based on the entire imgSet input collection. You can add images to and remove images from the image index using the addImages and removeImages methods.
4. Search the data set for similar images. Use the retrieveImages function to search the image set for images that are similar to the query image. Use the NumResults property to control the number of results. For example, to return the top 10 similar images, set NumResults to 10. Use the ROI property to search with a smaller region of the query image. A smaller region is useful for isolating a particular object in an image that you want to search for.
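The workflow steps can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the bookCovers sample folder and its queries subfolder are available under the toolbox's visiondata directory, and the region of interest is an arbitrary rectangle chosen only to show the syntax.

```matlab
% Step 1: collect the images to search. The folder is sample data assumed
% to ship with the toolbox; substitute your own image folder.
dataDir = fullfile(toolboxdir('vision'), 'visiondata', 'bookCovers');
imds = imageDatastore(dataDir);

% Steps 2 and 3: build the search index. With only the image collection as
% input, indexImages learns a SURF-based bag of visual words from the same
% images that it indexes.
imageIndex = indexImages(imds, 'Verbose', false);

% Alternative: learn the vocabulary first (possibly from an independent
% collection) and pass it to indexImages as the second input.
% bag = bagOfFeatures(imds, 'Verbose', false);
% imageIndex = indexImages(imds, bag, 'Verbose', false);

% Step 4: search with a query image, returning at most 10 matches.
queryImage = imread(fullfile(dataDir, 'queries', 'query3.jpg'));
[imageIDs, scores] = retrieveImages(queryImage, imageIndex, 'NumResults', 10);

% File name of the best match, looked up through the index.
bestMatch = imageIndex.ImageLocation{imageIDs(1)};

% Restrict the search to the central part of the query image.
% ROI format is [x y width height]; this rectangle is illustrative only.
[h, w, ~] = size(queryImage);
roi = [max(1, round(w/4)), max(1, round(h/4)), round(w/2), round(h/2)];
roiIDs = retrieveImages(queryImage, imageIndex, 'NumResults', 10, 'ROI', roi);
```

The returned IDs index into the image collection, ordered from most to least similar, and the scores indicate how closely each result matches the query.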
Evaluate Image Retrieval
Use the evaluateImageRetrieval function to evaluate image retrieval by using a query image with a known set of results. If the results are not what you expect, you can modify or augment the image features used by the bag of visual words. Examine the type of features used for retrieval. The appropriate type of feature depends on the type of images within the collection. For example, if you are searching an image collection made up of scenes, such as beaches, cities, or highways, use a global image feature. A global image feature, such as a color histogram, captures the key elements of the entire scene. To find specific objects within the image collection, use local image features extracted around object keypoints instead.
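As a brief sketch of the evaluation step, the following code scores one query against a known expected result. It assumes the bookCovers sample data is available and that query2.jpg corresponds to the image at index 2 of the collection; treat that pairing as an assumption and verify it against your own data.

```matlab
% Build the index from the sample collection (assumed to ship with the toolbox).
dataDir = fullfile(toolboxdir('vision'), 'visiondata', 'bookCovers');
imds = imageDatastore(dataDir);
imageIndex = indexImages(imds, 'Verbose', false);

% Query image and the index of the image it is expected to match.
queryImage = imread(fullfile(dataDir, 'queries', 'query2.jpg'));
expectedIDs = 2;    % assumption: query2.jpg shows the image stored at index 2

% Average precision is in the range [0 1]; higher values mean that the
% expected images rank nearer the top of the results.
[averagePrecision, actualIDs] = evaluateImageRetrieval(queryImage, imageIndex, expectedIDs);
fprintf('Average precision: %.2f\n', averagePrecision);
```

Averaging this value over several queries with known results gives a single summary figure for comparing feature types or vocabulary settings.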