Computer Vision

Extend deep learning workflows with computer vision applications

Apply deep learning to computer vision applications by using Deep Learning Toolbox™ together with Computer Vision Toolbox™.

Apps

Image Labeler	Label images for computer vision applications
Video Labeler	Label video for computer vision applications
Object Detector Analyzer	Interactively visualize and evaluate object detection results against ground truth (Since R2026a)

Functions

Vision-Language Models

`clipNetwork`	Create pretrained CLIP deep learning neural network for vision-language tasks (Since R2026a)
`moondream`	Create pretrained Moondream vision-language model (VLM) (Since R2026a)
`groundingDinoObjectDetector`	Detect and localize objects using Grounding DINO object detector (Since R2026a)

ViT (Vision Transformer)

`visionTransformer`	Pretrained vision transformer (ViT) neural network (Since R2023b)
`patchEmbeddingLayer`	Patch embedding layer (Since R2023b)

Semantic Segmentation

`bisenetv2`	Create BiSeNet v2 convolutional neural network for semantic segmentation (Since R2025a)
`semanticseg`	Semantic image segmentation using deep learning
`unet`	Create U-Net convolutional neural network for semantic segmentation (Since R2024a)
`unet3d`	Create 3-D U-Net convolutional neural network for semantic segmentation of volumetric images (Since R2024a)
`deeplabv3plus`	Create DeepLab v3+ convolutional neural network for semantic image segmentation (Since R2024a)

Object Detection

`rtmdetObjectDetector`	Detect objects using RTMDet object detector (Since R2024b)
`yolov4ObjectDetector`	Detect objects using YOLO v4 object detector (Since R2022a)
`yolov2ObjectDetector`	Detect objects using YOLO v2 object detector
`yolov3ObjectDetector`	Detect objects using YOLO v3 object detector
`hrnetObjectKeypointDetector`	Create object keypoint detector using HRNet deep learning network (Since R2023b)
`ssdObjectDetector`	Detect objects using SSD deep learning detector

Instance Segmentation and Pose Estimation

`solov2`	Segment objects using SOLOv2 instance segmentation network (Since R2023b)
`maskrcnn`	Detect objects using Mask R-CNN instance segmentation (Since R2021b)
`posemaskrcnn`	Predict object pose using Pose Mask R-CNN pose estimation (Since R2024a)

Object Tracking and Re-Identification

`reidentificationNetwork`	Re-identification deep learning network for re-identifying and tracking objects (Since R2024a)

Automated Visual Inspection

`yoloxObjectDetector`	Detect objects using YOLOX object detector (Since R2023b)
`efficientADAnomalyDetector`	Detect anomalies using EfficientAD network (Since R2024b)
`patchCoreAnomalyDetector`	Detect anomalies using PatchCore network (Since R2023a)
`fcddAnomalyDetector`	Detect anomalies using fully convolutional data description (FCDD) network for anomaly detection (Since R2022b)
`fastFlowAnomalyDetector`	Detect anomalies using FastFlow network (Since R2023a)

Text Detection and Recognition

`detectTextCRAFT`	Detect text in images by using the CRAFT deep learning model (Since R2022a)
`ocr`	Recognize text using optical character recognition

Topics

Object Detection and Instance Segmentation

Get Started with Object Detection Using Deep Learning (Computer Vision Toolbox)
Perform object detection using deep learning neural networks such as YOLOX, YOLO v4, RTMDet, and SSD.
Get Started with Instance Segmentation Using Deep Learning (Computer Vision Toolbox)
Segment objects using an instance segmentation model such as SOLOv2 or Mask R-CNN.
Choose an Object Detector (Computer Vision Toolbox)
Compare object detection deep learning models, such as YOLOX, YOLO v4, RTMDet, and SSD.
Augment Bounding Boxes for Object Detection (Computer Vision Toolbox)
This example shows how to perform common kinds of image and bounding box augmentation as part of object detection workflows.
Import Pretrained ONNX YOLO v2 Object Detector
This example shows how to import a pretrained ONNX™ (Open Neural Network Exchange) you only look once (YOLO) v2 [1] object detection network and use the network to detect objects.
Export YOLO v2 Object Detector to ONNX
This example shows how to export a YOLO v2 object detection network to ONNX™ (Open Neural Network Exchange) model format.
Deploy Object Detection Model as Microservice (MATLAB Compiler SDK)
This example shows how to create a microservice Docker® image from a MATLAB® object detection model.

Automated Visual Inspection

Getting Started with Anomaly Detection Using Deep Learning (Computer Vision Toolbox)
Anomaly detection using deep learning is an increasingly popular approach to automating visual inspection tasks.
Detect Image Anomalies Using Explainable FCDD Network (Computer Vision Toolbox)
Use an anomaly detector to distinguish between normal pills and pills with anomalous chips or contamination.
Localize Industrial Defects Using PatchCore Anomaly Detector (Computer Vision Toolbox)
Perform localization of anomalous defects in printed circuit boards (PCBs) using anomaly heat maps generated with the PatchCore anomaly detector.
Classify Defects on Wafer Maps Using Deep Learning (Computer Vision Toolbox)
Classify manufacturing defects on wafer maps using a simple convolutional neural network (CNN).
Detect Image Anomalies Using Pretrained ResNet-18 Feature Embeddings (Computer Vision Toolbox)
Train a similarity-based anomaly detector using one-class learning of feature embeddings extracted from a pretrained ResNet-18 convolutional neural network.

Semantic Segmentation

Get Started with Semantic Segmentation Using Deep Learning (Computer Vision Toolbox)
Segment objects by class using deep learning networks such as U-Net and DeepLab v3+.
Augment Pixel Labels for Semantic Segmentation (Computer Vision Toolbox)
This example shows how to perform common kinds of image and pixel label augmentation as part of semantic segmentation workflows.
Semantic Segmentation Using Dilated Convolutions
This example shows how to train a semantic segmentation network using dilated convolutions.
Semantic Segmentation of Multispectral Images Using Deep Learning (Computer Vision Toolbox)
This example shows how to perform semantic segmentation of a multispectral image with seven channels using U-Net.
Explore Semantic Segmentation Network Using Grad-CAM
This example shows how to explore the predictions of a pretrained semantic segmentation network using Grad-CAM.
Generate Adversarial Examples for Semantic Segmentation (Computer Vision Toolbox)
Generate adversarial examples for a semantic segmentation network using the basic iterative method (BIM).
Prune and Quantize Semantic Segmentation Network
Reduce the memory footprint of a semantic segmentation network and speed up inference by compressing the network using pruning and quantization.

Image and Video Classification

Train Vision Transformer Network for Image Classification
This example shows how to fine-tune a pretrained vision transformer (ViT) neural network to perform classification on a new collection of images.
Human Activity Recognition Using R(2+1)D Video Classification (Computer Vision Toolbox)
Train an R(2+1)D video classifier for activity recognition.
Gesture Recognition using Videos and Deep Learning (Computer Vision Toolbox)
Train a SlowFast convolutional neural network for gesture recognition using RGB data from videos.

Featured Examples

Automatically Label Ground Truth Using Vision-Language Model

Automatically label ground truth images for object detection using the Grounding DINO vision-language model (VLM).

(Computer Vision Toolbox)

Since R2026a

Detect Industrial Defects Using Zero-Shot AnomalyCLIP

Detect and localize industrial production defects in pill images using an AnomalyCLIP anomaly detection network.

(Computer Vision Toolbox)

Since R2026a

Identify Defects in Air Compressors Using Spectrogram Images

Detect and localize defects in acoustic recordings of air compressors using Mel spectrogram images and an EfficientAD anomaly detector.

(Computer Vision Toolbox)

Since R2025a

Detect Small Objects Using Tiled Training of YOLOX Network

Detect small objects in full-resolution images using tiled training of a you only look once version X (YOLOX) deep learning network.

(Computer Vision Toolbox)

Since R2024b

Automatically Label Ground Truth Using Segment Anything Model

Produce pixel labels for semantic segmentation using the Segment Anything Model (SAM) in the Image Labeler app. The SAM is an automatic segmentation technique that you can use to segment object regions to label with just a few clicks, or automatically segment the entire image and instantaneously create labels for selected regions. In this example, you interactively label pixels for semantic segmentation in two ways.

(Computer Vision Toolbox)

Since R2024b

Multiclass Object Detection Using YOLO v2 Deep Learning

Train a YOLO v2 multiclass object detector and evaluate object detector performance across selected classes and overlap thresholds.

(Computer Vision Toolbox)

Since R2024b

Detect Defects Using Tiled Training of EfficientAD Anomaly Detector

Detect and localize defects on anomalous chewing gum images by training an EfficientAD anomaly detection network on tiled normal images.

(Computer Vision Toolbox)

Since R2024b

Localize Industrial Defects Using PatchCore Anomaly Detector

Perform localization of anomalous defects in printed circuit boards (PCBs) using anomaly heat maps generated with the PatchCore anomaly detector.

(Computer Vision Toolbox)

Perform 6-DoF Pose Estimation for Bin Picking Using Deep Learning

Perform six degrees-of-freedom (6-DoF) pose estimation by estimating the 3-D position and orientation of machine parts in a bin using RGB-D images and a deep learning network.

(Computer Vision Toolbox)