
segmentObjectsFromEmbeddings

Segment objects in lidar point cloud using Segment Anything Model (SAM) feature embeddings

Since R2024b

    Description

    masks = segmentObjectsFromEmbeddings(lidarSAM,embeddings,ptCloud,ForegroundPoints=pointPrompt) segments objects from the input point cloud ptCloud using the SAM feature embeddings embeddings and the foreground point coordinates pointPrompt as a visual prompt.

    masks = segmentObjectsFromEmbeddings(lidarSAM,embeddings,ptCloud,BoundingBox=boxPrompt) segments objects from the input point cloud using the bounding box coordinates boxPrompt as a visual prompt.

    [masks,scores,maskLogits] = segmentObjectsFromEmbeddings(___) returns the prediction confidence scores scores corresponding to each predicted object mask and the mask prediction logits maskLogits, using any combination of input arguments from previous syntaxes.

    [___] = segmentObjectsFromEmbeddings(___,Name=Value) specifies options using one or more name-value arguments in addition to any combination of arguments from previous syntaxes. For example, ReturnMultiMask=true specifies to return multiple masks for a segmented object.
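
    For example, the following sketch combines the point-prompt syntax with the three-output syntax. It assumes that lidarSAM, embeddings, and ptCloud already exist (for instance, from the extractEmbeddings workflow shown in the example below) and that pointPrompt is a P-by-3 matrix of xyz-coordinates on the target object.

    % Point-prompt segmentation returning the mask, its score, and its logits
    [masks,scores,maskLogits] = segmentObjectsFromEmbeddings(lidarSAM, ...
        embeddings,ptCloud,ForegroundPoints=pointPrompt);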

    Note

    This functionality requires Deep Learning Toolbox™ and the Image Processing Toolbox™ Model for Segment Anything Model support package. You can download and install the Image Processing Toolbox Model for Segment Anything Model from Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons.

    Examples


    Specify a full file path for a LAS file that contains aerial lidar data. Then, read the point cloud data from the file using the readPointCloud function of the lasFileReader object.

    filename = fullfile(matlabroot,"toolbox","lidar", ...
             "lidardata","las","aerialLidarData2.las");
    lasReader = lasFileReader(filename);
    ptCloud = readPointCloud(lasReader);

    Remove the ground plane from the point cloud to get better segmentation results. Display the resulting point cloud.

    [~,nonGroundPtCloud] = segmentGroundSMRF(ptCloud);
    figure
    pcshow(nonGroundPtCloud)

    Create a Segment Anything Model object for aerial lidar point cloud segmentation.

    samLidar = segmentAnythingAerialLidar;

    Extract the feature embeddings from the point cloud.

    embeddings = extractEmbeddings(samLidar,nonGroundPtCloud);

    Specify a bounding box that contains an object to segment, and display it in green.

    boxPrompt = [429321 3680081 79.89 14 7 3 0 0 0];
    showShape("cuboid",boxPrompt,Color="green")

    Segment the object using the segmentObjectsFromEmbeddings function, which runs the SAM decoder on the feature embeddings.

    mask = segmentObjectsFromEmbeddings(samLidar, ...
        embeddings,nonGroundPtCloud,BoundingBox=boxPrompt);

    Visualize the segmentation mask overlaid on the point cloud.

    figure
    pcshow(nonGroundPtCloud.Location,single(mask))
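
    As an optional follow-up (a sketch, not part of the original example), you can extract the segmented points into their own point cloud by using the logical mask as an index with the select object function of pointCloud.

    % Keep only the points that belong to the segmented object
    segmentedPtCloud = select(nonGroundPtCloud,find(mask));
    figure
    pcshow(segmentedPtCloud)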

    Input Arguments


    Segment Anything Model for aerial lidar data, specified as a segmentAnythingAerialLidar object.

    Lidar point cloud embeddings, specified as a 64-by-64-by-256 array. You can generate embeddings for a point cloud using the extractEmbeddings object function.

    Unorganized point cloud, specified as a pointCloud object.

    Points of the object to be segmented, or foreground points, specified as a P-by-3 matrix in which each row specifies the xyz-coordinates of a point. P is the number of points.

    Note

    Use at least one of these options as the visual prompts for segmentation, in addition to optional name-value arguments:

    • Foreground point coordinates, specified by pointPrompt.

    • Object bounding box coordinates, specified by boxPrompt.

    Cuboid bounding box that contains the object to be segmented, specified as a 1-by-9 vector of the form [xctr yctr zctr xlen ylen zlen xrot yrot zrot].

    • xctr, yctr, and zctr specify the center of the cuboid.

    • xlen, ylen, and zlen specify the length of the cuboid along the x-, y-, and z-axes, respectively, before rotation has been applied.

    • xrot, yrot, and zrot specify the rotation angles for the cuboid along the x-, y-, and z-axes, respectively. These angles are clockwise-positive when looking in the forward direction of their corresponding axes. Units are in degrees.
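
    For example, you can assemble the prompt from its center, dimensions, and rotation angles, as in this sketch that reuses the values from the example above:

    ctr = [429321 3680081 79.89];   % [xctr yctr zctr]
    dims = [14 7 3];                % [xlen ylen zlen]
    rot = [0 0 0];                  % [xrot yrot zrot], in degrees
    boxPrompt = [ctr dims rot];     % 1-by-9 cuboid prompt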

    Note

    Use at least one of these options as the visual prompts for segmentation, in addition to optional name-value arguments:

    • Foreground point coordinates, specified by pointPrompt.

    • Object bounding box coordinates, specified by boxPrompt.

    Name-Value Arguments

    Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

    Example: segmentObjectsFromEmbeddings(lidarSAM,embeddings,ptCloud,ForegroundPoints=pointPrompt,ReturnMultiMask=true) specifies to return multiple masks for a segmented object.

    Background points that are not part of the object to be segmented, specified as a Q-by-3 matrix in which each row specifies the xyz-coordinates of a point. Q is the number of points. Use this argument as an additional visual prompt alongside foreground points or bounding boxes.

    Mask prediction logits, specified as the maskLogits output from a previous call to the segmentObjectsFromEmbeddings function. Specify the MaskLogits argument to refine an existing mask.
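
    A minimal refinement sketch, assuming lidarSAM, embeddings, ptCloud, and a point prompt pointPrompt are already defined:

    % First pass: get an initial mask and its prediction logits
    [mask,~,maskLogits] = segmentObjectsFromEmbeddings(lidarSAM, ...
        embeddings,ptCloud,ForegroundPoints=pointPrompt);
    % Second pass: pass the logits back in to refine the mask for the same prompt
    refinedMask = segmentObjectsFromEmbeddings(lidarSAM, ...
        embeddings,ptCloud,ForegroundPoints=pointPrompt,MaskLogits=maskLogits);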

    Multiple segmentation masks, specified as a numeric or logical 0 (false) or 1 (true). Specify ReturnMultiMask as true to return three masks in place of the default single mask.

    Use this argument to return three masks when you use ambiguous visual prompts, such as single points. You can choose one or a combination of the resulting masks to capture different subregions of the object.
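
    For example, this sketch (again assuming lidarSAM, embeddings, ptCloud, and pointPrompt are already defined) requests three candidate masks and keeps the highest-scoring one:

    [masks,scores] = segmentObjectsFromEmbeddings(lidarSAM, ...
        embeddings,ptCloud,ForegroundPoints=pointPrompt,ReturnMultiMask=true);
    [~,idx] = max(scores);          % index of the most confident candidate
    bestMask = masks(:,:,idx);      % M-by-1 logical mask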

    Hardware resource on which to run the network, specified as "auto", "gpu", or "cpu".

    • "auto" — Use a GPU, if available. Otherwise, use the CPU.

    • "gpu" — Use the GPU. To use a GPU, you must have Parallel Computing Toolbox™ and a CUDA® enabled NVIDIA® GPU. If a suitable GPU is not available, the function returns an error. For information about the supported compute capabilities, see GPU Computing Requirements (Parallel Computing Toolbox).

    • "cpu" — Use the CPU.

    Output Arguments


    Object masks, returned as one of these options:

    • M-by-1 logical vector — ReturnMultiMask is false. M is the number of points in the input point cloud.

    • M-by-1-by-3 logical array — ReturnMultiMask is true. M is the number of points in the input point cloud.

    Prediction confidence scores for the segmentation, returned as one of these options:

    • Numeric scalar — ReturnMultiMask is false.

    • 3-by-1 numeric vector — ReturnMultiMask is true.

    Mask prediction logits, returned as one of these options:

    • 256-by-256 numeric matrix — ReturnMultiMask is false.

    • 256-by-256-by-3 numeric array — ReturnMultiMask is true.

    To refine the output mask, you can specify this value to the MaskLogits name-value argument on subsequent segmentObjectsFromEmbeddings function calls.

    Version History

    Introduced in R2024b