
segmentObjectsFromEmbeddings

Segment objects in lidar point cloud using Segment Anything Model (SAM) feature embeddings

Since R2024b

    Description

    masks = segmentObjectsFromEmbeddings(lidarSAM,embeddings,ptCloud,ForegroundPoints=pointPrompt) segments objects from the input point cloud ptCloud using the SAM feature embeddings embeddings and the foreground point coordinates pointPrompt as a visual prompt.

    masks = segmentObjectsFromEmbeddings(lidarSAM,embeddings,ptCloud,BoundingBox=boxPrompt) segments objects from the input point cloud using the bounding box coordinates boxPrompt as a visual prompt.

    [masks,scores,maskLogits] = segmentObjectsFromEmbeddings(___) returns the prediction confidence scores scores corresponding to each predicted object mask and the mask prediction logits maskLogits, using any combination of input arguments from previous syntaxes.

    [___] = segmentObjectsFromEmbeddings(___,Name=Value) specifies options using one or more name-value arguments in addition to any combination of arguments from previous syntaxes. For example, ReturnMultiMask=true specifies to return multiple masks for a segmented object.
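
    For example, the following sketch combines the point-prompt syntax with the three-output syntax. It assumes that lidarSAM, embeddings, and ptCloud already exist (for instance, from the extractEmbeddings workflow shown in the example below) and that pointPrompt is a P-by-3 matrix of xyz-coordinates on the target object.

    % Point-prompt segmentation returning the mask, its score, and its logits
    [masks,scores,maskLogits] = segmentObjectsFromEmbeddings(lidarSAM, ...
        embeddings,ptCloud,ForegroundPoints=pointPrompt);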

    Note

    This functionality requires Deep Learning Toolbox™ and the Image Processing Toolbox™ Model for Segment Anything Model support package. You can download and install the Image Processing Toolbox Model for Segment Anything Model from Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons.

    Examples


    Specify a full file path for a LAS file that contains aerial lidar data. Then, read the point cloud data from the file using the readPointCloud function of the lasFileReader object.

    filename = fullfile(matlabroot,"toolbox","lidar", ...
             "lidardata","las","aerialLidarData2.las");
    lasReader = lasFileReader(filename);
    ptCloud = readPointCloud(lasReader);

    Remove the ground plane from the point cloud to get better segmentation results. Display the resulting point cloud.

    [~,nonGroundPtCloud] = segmentGroundSMRF(ptCloud);
    figure
    pcshow(nonGroundPtCloud)

    Create a Segment Anything Model object for aerial lidar point cloud segmentation.

    samLidar = segmentAnythingAerialLidar;

    Extract the feature embeddings from the point cloud.

    embeddings = extractEmbeddings(samLidar,nonGroundPtCloud);

    Specify a bounding box that contains an object to segment, and display it in green.

    boxPrompt = [429321 3680081 79.89 14 7 3 0 0 0];
    showShape("cuboid",boxPrompt,Color="green")

    Segment the object using the segmentObjectsFromEmbeddings function, which runs the SAM decoder on the feature embeddings.

    mask = segmentObjectsFromEmbeddings(samLidar, ...
        embeddings,nonGroundPtCloud,BoundingBox=boxPrompt);

    Visualize the segmentation mask overlaid on the point cloud.

    figure
    pcshow(nonGroundPtCloud.Location,single(mask))
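
    As an optional follow-up (a sketch, not part of the original example), you can extract the segmented points into their own point cloud by using the logical mask as an index with the select object function of pointCloud.

    % Keep only the points that belong to the segmented object
    segmentedPtCloud = select(nonGroundPtCloud,find(mask));
    figure
    pcshow(segmentedPtCloud)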

    Input Arguments


    Segment Anything Model for aerial lidar data, specified as a segmentAnythingAerialLidar object.

    Lidar point cloud embeddings, specified as a 64-by-64-by-256 array. You can generate embeddings for a point cloud using the extractEmbeddings object function.

    Unorganized point cloud, specified as a pointCloud object.

    Points of the object to be segmented, or foreground points, specified as a P-by-3 matrix in which each row specifies the xyz-coordinates of a point. P is the number of points.

    Note

    Use at least one of these options as the visual prompts for segmentation, in addition to optional name-value arguments:

    • Foreground point coordinates, specified by pointPrompt.

    • Object bounding box coordinates, specified by boxPrompt.

    Cuboid bounding box that contains the object to be segmented, specified as a 1-by-9 vector of the form [xctr yctr zctr xlen ylen zlen xrot yrot zrot].

    • xctr, yctr, and zctr specify the center of the cuboid.

    • xlen, ylen, and zlen specify the length of the cuboid along the x-, y-, and z-axes, respectively, before rotation has been applied.

    • xrot, yrot, and zrot specify the rotation angles for the cuboid along the x-, y-, and z-axes, respectively. These angles are clockwise-positive when looking in the forward direction of their corresponding axes. Units are in degrees.
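
    For example, you can assemble the prompt from its center, dimensions, and rotation angles, as in this sketch that reuses the values from the example above:

    ctr = [429321 3680081 79.89];   % [xctr yctr zctr]
    dims = [14 7 3];                % [xlen ylen zlen]
    rot = [0 0 0];                  % [xrot yrot zrot], in degrees
    boxPrompt = [ctr dims rot];     % 1-by-9 cuboid prompt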

    Note

    Use at least one of these options as the visual prompts for segmentation, in addition to optional name-value arguments:

    • Foreground point coordinates, specified by pointPrompt.

    • Object bounding box coordinates, specified by boxPrompt.

    Name-Value Arguments

    Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

    Example: segmentObjectsFromEmbeddings(lidarSAM,embeddings,ptCloud,ForegroundPoints=pointPrompt,ReturnMultiMask=true) specifies to return multiple masks for a segmented object.

    Background points that are not part of the object to be segmented, specified as a Q-by-3 matrix in which each row specifies the xyz-coordinates of a point. Q is the number of points. Use this argument as an additional visual prompt alongside foreground points or bounding boxes.

    Mask prediction logits, specified as the maskLogits output from a previous call to the segmentObjectsFromEmbeddings function. Specify the MaskLogits argument to refine an existing mask.
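
    A minimal refinement sketch, assuming lidarSAM, embeddings, ptCloud, and a point prompt pointPrompt are already defined:

    % First pass: get an initial mask and its prediction logits
    [mask,~,maskLogits] = segmentObjectsFromEmbeddings(lidarSAM, ...
        embeddings,ptCloud,ForegroundPoints=pointPrompt);
    % Second pass: pass the logits back in to refine the mask for the same prompt
    refinedMask = segmentObjectsFromEmbeddings(lidarSAM, ...
        embeddings,ptCloud,ForegroundPoints=pointPrompt,MaskLogits=maskLogits);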

    Multiple segmentation masks, specified as a numeric or logical 0 (false) or 1 (true). Specify ReturnMultiMask as true to return three masks in place of the default single mask.

    Use this argument to return three masks when you use ambiguous visual prompts, such as single points. You can choose one or a combination of the resulting masks to capture different subregions of the object.
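
    For example, this sketch (again assuming lidarSAM, embeddings, ptCloud, and pointPrompt are already defined) requests three candidate masks and keeps the highest-scoring one:

    [masks,scores] = segmentObjectsFromEmbeddings(lidarSAM, ...
        embeddings,ptCloud,ForegroundPoints=pointPrompt,ReturnMultiMask=true);
    [~,idx] = max(scores);          % index of the most confident candidate
    bestMask = masks(:,:,idx);      % M-by-1 logical mask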

    Hardware resource on which to run the network, specified as "auto", "gpu", or "cpu".

    • "auto" — Use a GPU, if available. Otherwise, use the CPU.

    • "gpu" — Use the GPU. To use a GPU, you must have Parallel Computing Toolbox™ and a CUDA® enabled NVIDIA® GPU. If a suitable GPU is not available, the function returns an error. For information about the supported compute capabilities, see GPU Computing Requirements (Parallel Computing Toolbox).

    • "cpu" — Use the CPU.

    Output Arguments


    Object masks, returned as one of these options:

    • M-by-1 logical vector — ReturnMultiMask is false. M is the number of points in the input point cloud.

    • M-by-1-by-3 logical array — ReturnMultiMask is true. M is the number of points in the input point cloud.

    Prediction confidence scores for the segmentation, returned as one of these options:

    • Numeric scalar — ReturnMultiMask is false.

    • 3-by-1 numeric vector — ReturnMultiMask is true.

    Mask prediction logits, returned as one of these options:

    • 256-by-256 numeric matrix — ReturnMultiMask is false.

    • 256-by-256-by-3 numeric array — ReturnMultiMask is true.

    To refine the output mask, you can specify this value to the MaskLogits name-value argument on subsequent segmentObjectsFromEmbeddings function calls.

    Version History

    Introduced in R2024b