
Get Started with Image Segmentation

Image segmentation is a process in image processing and computer vision that involves dividing an image into multiple segments or regions. The primary goal of image segmentation is to identify objects and boundaries in images. Image segmentation plays a key role in various applications, including scientific imaging, traffic control systems, machine vision, face and text recognition, and image editing and computer graphics. You can use image segmentation for various tasks, including:

  • Basic image processing, such as differentiating objects from the background or from each other.

  • First step in tasks that require precise delineation and pixel-level localization of objects in an image, such as object analysis and quantification.

  • Subsequent processing step that creates boundaries or masks at the pixel level after objects have been identified, such as in instance segmentation tasks.

Which image segmentation technique you choose often depends on your specific application and the characteristics of the images to be segmented.

This table lists the techniques for image segmentation available in Image Processing Toolbox™, Computer Vision Toolbox™, and Medical Imaging Toolbox™. Each entry provides basic background information, tips for getting started, and a representative visualization.

Technique | Get Started | Visualization of Sample Output

Segment Anything Model (SAM) – Segment images automatically and interactively.


  • Instantaneously segment objects or the entire image in the Image Segmenter app.

  • Segment an entire image, or a complex collection of many objects at once, with a distinct mask for each object, using imsegsam.

  • Interactively segment objects using points, masks, or ROIs that you provide by configuring a pretrained SAM as a segmentAnythingModel object.

To learn more, see Get Started with Segment Anything Model for Image Segmentation.

Use SAM in automatic segmentation mode to perform full image segmentation and segment distinct objects in the Image Segmenter app.
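As a sketch of the programmatic workflow, this example segments every object in a sample image with imsegsam. It is illustrative only: it assumes the SAM support package is installed, and the exact return type of imsegsam can vary by release.

```matlab
% Automatically segment every object in an image with SAM.
% Sketch: assumes the SAM support package is installed and that imsegsam
% returns connected-component style output usable with labelmatrix.
I = imread("peppers.png");                    % sample image shipped with MATLAB
masks = imsegsam(I);                          % one mask per detected object
imshow(labeloverlay(I, labelmatrix(masks)))   % view the masks as colored labels
```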

Classical image segmentation techniques – Apply semi-automated and automated image segmentation algorithms, as well as processing methods such as thresholding.

Use one of the Segmentation Techniques functions or interactive apps, such as the Image Segmenter app and Color Thresholder app, to:

  • Segment images or objects that have a very structured or textured pattern.

  • Process binary images.

  • Preprocess images for further steps, such as object analysis or deep learning tasks.

To learn more, see the Classical Image Segmentation Techniques section.

Create a binary mask using flood-fill, a classical segmentation technique.

Semantic segmentation techniques (Computer Vision Toolbox) – Train a deep learning neural network on a custom data set to segment images with complex scenes or domain-specific objects.


  • Interactively segment and label objects as ground truth using the Segment Anything Model (SAM) in the Image Labeler (Computer Vision Toolbox) app. For an example, see Automatically Label Ground Truth Using Segment Anything Model (Computer Vision Toolbox).

  • Use the labeled ground truth data to train a deep learning network for semantic segmentation, such as unet (Computer Vision Toolbox) or deeplabv3plus (Computer Vision Toolbox), to categorize every pixel in an image using class annotations.

  • After training a network, segment a test image using semanticseg (Computer Vision Toolbox).

To learn more, see Getting Started with Semantic Segmentation Using Deep Learning (Computer Vision Toolbox).

Train a deep learning network for semantic segmentation, and use the trained network to segment a test image.
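As a minimal sketch of the inference step, assuming `net` is a semantic segmentation network you have already trained on labeled ground truth, and "testImage.png" is a hypothetical test image (both are assumptions, not toolbox-supplied assets):

```matlab
% Sketch: apply a trained semantic segmentation network to a test image.
% `net` and "testImage.png" are assumptions for illustration.
I = imread("testImage.png");
C = semanticseg(I, net);      % categorical class label for every pixel
imshow(labeloverlay(I, C))    % color-code the predicted classes
```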

Instance segmentation techniques (Computer Vision Toolbox) – Train a deep learning instance segmentation model, or use a pretrained model to segment instances of objects in images.


  • Distinguish between individual objects of the same category in a test image using a pretrained Mask R-CNN or SOLOv2 network, configured by the maskrcnn (Computer Vision Toolbox) and solov2 (Computer Vision Toolbox) objects and their object functions, respectively.

  • Train a pretrained instance segmentation network, such as maskrcnn (Computer Vision Toolbox) or solov2 (Computer Vision Toolbox), on a custom data set with labeled ground truth images, and segment a test image.

To learn more, see Get Started with Instance Segmentation Using Deep Learning (Computer Vision Toolbox).

Perform instance segmentation of an image using a SOLOv2 pretrained network.
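A minimal sketch of the pretrained workflow, assuming the SOLOv2 support package is installed; the pretrained model name and image file are assumptions, so check the solov2 reference page for the supported networks:

```matlab
% Sketch: instance segmentation with a pretrained SOLOv2 model.
model = solov2("resnet50-coco");                    % assumed pretrained model name
I = imread("visionteam.jpg");                       % assumed sample image
[masks, labels, scores] = segmentObjects(model, I); % one mask per object instance
imshow(insertObjectMask(I, masks))                  % overlay a colored mask per instance
```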

Medical image segmentation (Medical Imaging Toolbox) – Perform medical image segmentation using deep learning, a labeling app, or image processing algorithms.

  • Interactively segment objects or regions in medical images by specifying visual prompts to the Medical Segment Anything Model (MedSAM) using the medicalSegmentAnythingModel (Medical Imaging Toolbox) object.

  • Label ground truth data in a 2-D medical image or 3-D volume by using the Medical Image Labeler (Medical Imaging Toolbox) app. You can use the labeled data to train semantic segmentation deep learning networks.

  • Segment radiology images using fully automated and interactive deep learning models in the Medical Open Network for AI (MONAI) Label platform of the Medical Image Labeler (Medical Imaging Toolbox) app.

For more information, see Analysis and Applications (Medical Imaging Toolbox).

Label ground truth data using the Medical Image Labeler app.

Classical Image Segmentation Techniques

Use the classical segmentation techniques available in Image Processing Toolbox when GPU hardware resources are limited and computing speed is critical. To get started with classical techniques using the Image Segmenter app for increased interactivity, see Getting Started with Image Segmenter. Use this table to select a technique based on the characteristics of your image and application.

Image Characteristics | Recommended Classical Techniques | Example
Clear and well-defined boundaries

Threshold methods such as multithresh, otsuthresh, adaptthresh, and graythresh can easily separate objects from the background based on intensity levels.

Example of an image that has been converted to a binary mask using the adaptive threshold technique.

For an example that uses this image, see Find Threshold and Segment Bright Rice Grains from Dark Background.
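For instance, a minimal global-thresholding sketch using the rice.png sample image that ships with Image Processing Toolbox:

```matlab
% Separate bright rice grains from a dark background by global thresholding.
I = imread("rice.png");
level = graythresh(I);       % Otsu threshold, normalized to [0, 1]
BW = imbinarize(I, level);   % binary mask of the bright grains
imshow(BW)
```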

Images with distinct color regions

To create a binary mask for objects whose colors contrast significantly with the background or with other objects, or for objects of uniform color, use the Color Thresholder app.

Example of an image that has been segmented using the Color Thresholder app.

For an example that uses the Color Thresholder app, see Segment Image and Create Mask Using Color Thresholder.
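The app can generate thresholding code for you; as a hand-written analog, this sketch keeps pixels in an assumed red hue range in HSV space (the threshold values are illustrative, not app output):

```matlab
% Programmatic analog of the Color Thresholder app: mask red-hued pixels.
RGB = imread("peppers.png");
HSV = rgb2hsv(RGB);
hue = HSV(:,:,1);
sat = HSV(:,:,2);
mask = (hue < 0.05 | hue > 0.95) & sat > 0.4;   % assumed red-hue thresholds
imshow(imoverlay(RGB, mask, "yellow"))          % highlight the masked pixels
```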

Blurry or low-contrast boundaries

Use graph-based methods, such as the grabcut function, when the image has blurry object boundaries and a non-uniform intensity or noise distribution, at the expense of higher computational load.

Use the watershed function to separate touching objects in a noisy image at the possible expense of over-segmenting the image.

Use the activecontour function when foreground objects are clearly defined despite blurry edges.

Example of an image that has been converted to a binary mask using GrabCut, a graph-based technique.

For an example that uses this image, see Segment Foreground from Background in Image Using Grabcut.
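A minimal grabcut sketch, assuming a rough rectangular region of interest around the object (the ROI coordinates are arbitrary and would normally come from a drawn ROI):

```matlab
% Graph-based foreground extraction with grabcut.
% grabcut needs a superpixel label matrix and a rough ROI around the object.
RGB = imread("peppers.png");
L = superpixels(RGB, 500);                  % oversegment the image first
roi = false(size(L));
roi(50:300, 100:400) = true;                % assumed rough region of interest
BW = grabcut(RGB, L, roi);                  % refine the ROI into an object mask
imshow(imoverlay(RGB, bwperim(BW), "red"))  % outline the extracted object
```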

Overlapping or touching objects

The watershed transform, implemented by the watershed function, can help separate touching objects by identifying the watershed lines between them based on the gradient.

Example of a binary image that has been processed using the watershed transform.

For an example that uses this image, see Compute Watershed Transform and Display Resulting Label Matrix.
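The standard recipe pairs watershed with a distance transform. This sketch separates two synthetic touching disks:

```matlab
% Separate touching objects with the distance-transform watershed recipe.
[x, y] = meshgrid(1:160, 1:100);
bw = hypot(x-55, y-50) < 35 | hypot(x-105, y-50) < 35;  % two overlapping disks
D = -bwdist(~bw);                  % negative distance to the background
D(~bw) = Inf;                      % keep background out of the catchment basins
L = watershed(D);
L(~bw) = 0;                        % background and ridge lines get label 0
imshow(label2rgb(L, "jet", "w"))   % each disk now has its own label
```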

Objects with similar intensity to the background

You can perform clustering using the imsegkmeans, imsegkmeans3, and imsegisodata functions to segment an image into clusters based on color or intensity and spatial proximity. These methods can separate objects from the background even when their intensities are similar but subtly distinct. Use the k-means and ISODATA clustering functions to effectively partition an image or volume in which distinct colors or intensities represent different segments.

Example of an image segmentation using ISODATA clustering.

For an example that uses this image, see Segment 2-D Hyperspectral Image Using ISODATA Clustering.
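For example, a minimal k-means sketch; the number of clusters is an assumption you tune per image:

```matlab
% Cluster pixels by color with k-means.
I = imread("peppers.png");
[L, centers] = imsegkmeans(I, 4);   % 4 clusters, chosen by assumption
imshow(labeloverlay(I, L))          % view the cluster assignments
```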

Objects with texture or internal variation

Region-based methods, such as the watershed or grayconnected functions, can help segment objects with internal variation, because they group pixels or subregions into larger regions based on predefined criteria, such as texture or color.

Example of an image segmentation using a region-based flood-fill technique.

For an example that uses this image, see Segment Image Using Flood-Fill Technique.
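A minimal flood-fill sketch with grayconnected; the seed location and intensity tolerance are assumptions:

```matlab
% Flood-fill segmentation growing from a seed pixel with grayconnected.
I = imread("cameraman.tif");           % sample image shipped with the toolbox
BW = grayconnected(I, 100, 100, 20);   % seed at row 100, column 100; tolerance 20
imshow(imoverlay(I, BW, "cyan"))       % highlight the connected region
```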

Large-scale images

For large images containing a lot of detail, you can significantly reduce the computational load by using the superpixels or superpixels3 function to group pixels into superpixel regions.

Example of a large-scale image segmentation using superpixels and k-means clustering.

For an example that uses this image, see Plot Land Classification with Color Features and Superpixels.
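A minimal superpixels sketch; the requested number of superpixels is an assumption:

```matlab
% Reduce an image to superpixel regions before further processing.
A = imread("peppers.png");
[L, N] = superpixels(A, 500);                   % request about 500 superpixels
imshow(imoverlay(A, boundarymask(L), "cyan"))   % draw the superpixel boundaries
```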

Visualize Segmentation Results

Visualize segmentation results to verify the accuracy of the segmentation (how well the segmented regions match the objects of interest), interpret your segmentation results, and identify potential postprocessing steps such as mask refinement. This table describes the options for visualizing segmentation results in Image Processing Toolbox and Computer Vision Toolbox, and the corresponding functions and representative visualization.

Visualization Task | Function | Visualization Example
Display a binary mask or image. | imshow

Display a binary mask using the imshow function.

Overlay a binary mask on an image. | imoverlay

Display a single binary mask overlaid on an image using the imoverlay function.

Display boundaries of segmented regions over an image. | visboundaries

Display the boundaries of a segmented object over the original image using the visboundaries function.

Overlay a label matrix on an image, coloring each segmented region differently based on its label. | labeloverlay

Display each segmented region in a different color based on its label in the label matrix.

Combine two images, or an image and a mask, into a single visual output. | imfuse

Create a visual composite of an image and an object mask.

Combine two images into a single, cohesive composite image, optionally blending only the region of the foreground image specified by a mask. | imblend

Blend the region of the foreground image specified by a mask onto a background image using imblend.

Display a mask or a stack of masks overlaid on an image, and optionally specify a unique color for each mask. | insertObjectMask (Computer Vision Toolbox)

Display a stack of masks overlaid on an image using insertObjectMask.

MATLAB® supports a wide range of colormaps and enables you to create custom visualization functions. You can tailor a visualization to the specific requirements of your application, such as highlighting particular features or ensuring that visualizations are accessible to viewers with color vision deficiencies.
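As a combined sketch, three of the options above applied to one thresholded mask:

```matlab
% Three common ways to inspect a segmentation result.
I = imread("rice.png");
BW = imbinarize(I);                            % example mask from thresholding
figure, imshow(imoverlay(I, BW, "green"))      % mask as a flat color overlay
figure, imshow(I), visboundaries(BW)           % region boundaries on the image
figure, imshow(labeloverlay(I, bwlabel(BW)))   % each region in its own color
```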

Evaluate Segmentation Results

Evaluate segmentation results by using quantitative metrics to measure how well a segmentation technique performs, and ensure that it meets the specific requirements of your application. This table describes the options for evaluating segmentation results using the functions in Image Processing Toolbox and Computer Vision Toolbox.

Goal | Function | Evaluation Approach

Evaluate the overlap between the segmentation and a ground truth mask.

jaccard

Compute the Jaccard similarity coefficient, a measure of the intersection over union for the segmented result and the ground truth.

Evaluate the overlap between the segmentation and a ground truth mask, assessing the accuracy of the model in capturing the shape and size of segmented objects.

dice

Compute the Dice similarity coefficient, which is twice the area of overlap divided by the total number of pixels in both the ground truth and the segmented image.

Evaluate how accurately the boundaries of the segmented regions match the actual boundaries.

bfscore

Compute the Boundary F1 Score (BF score) between the segmented boundary and the ground truth boundary.

Evaluate the overlap between predicted and ground truth segmentations in semantic segmentation tasks that have imbalanced data sets with varying class frequencies.

generalizedDice (Computer Vision Toolbox)

Compute the weighted average of the Dice similarity coefficient across different classes, accounting for class imbalance.

Evaluate semantic segmentation results against the ground truth, classifying each pixel into one of several categories.

evaluateSemanticSegmentation (Computer Vision Toolbox)

Compare semantic segmentation results against the ground truth data by computing standard metrics, including pixel accuracy, mean IoU, and weighted IoU.

Evaluate instance segmentation results against the ground truth.

evaluateInstanceSegmentation (Computer Vision Toolbox)

Compare instance segmentation results against the ground truth data by computing standard metrics, including the confusion matrix, average precision, and precision and recall.
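As a sketch of mask-level evaluation, assuming BW is your predicted mask and gt is the corresponding ground truth mask (both assumed to be logical images of the same size):

```matlab
% Compare a predicted mask against ground truth with overlap metrics.
% BW and gt are assumed logical masks of the same size.
similarity = jaccard(BW, gt);    % intersection over union
overlap    = dice(BW, gt);       % twice the overlap over the total pixel count
boundaryF1 = bfscore(BW, gt);    % boundary match quality, in [0, 1]
fprintf("Jaccard %.3f, Dice %.3f, BF score %.3f\n", ...
    similarity, overlap, boundaryF1)
```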
