
rgbdvslam

Feature-based visual simultaneous localization and mapping (vSLAM) with RGB-D camera

Since R2024a

    Description

    Use the rgbdvslam object to perform visual simultaneous localization and mapping (vSLAM) with RGB-D camera data. RGB-D vSLAM combines depth information from sensors, such as RGB-D cameras or depth sensors, with RGB images to simultaneously estimate the camera pose and create a map of the environment. To learn more about visual SLAM, see Implement Visual SLAM in MATLAB.

    The rgbdvslam object extracts Oriented FAST and Rotated BRIEF (ORB) features from incrementally read images, and then tracks those features to estimate camera poses, identify key frames, and reconstruct a 3-D environment. The vSLAM algorithm also searches for loop closures using the bag-of-features algorithm, and then optimizes the camera poses using pose graph optimization.

    Note

    To use the rgbdvslam object, you must have a Navigation Toolbox™ license.

    Creation

    Description

    vslam = rgbdvslam(intrinsics) creates an RGB-D visual SLAM object, vslam, by using the specified camera intrinsic parameters.

The rgbdvslam object assumes that the color and depth images are preregistered, with one-to-one correspondence between their pixels.

    The object represents 3-D map points and camera poses in world coordinates, and assumes the camera pose of the first key frame is an identity rigidtform3d transform.

    Note

The rgbdvslam object runs on multiple threads internally, which can delay the processing of an image frame added by using the addFrame function. Additionally, because the object runs on multiple threads, the frame it is currently processing can differ from the most recently added frame.

vslam = rgbdvslam(intrinsics,depthScaleFactor) specifies the depth scale factor of the RGB-D camera, which the camera manufacturer usually provides. Use this syntax when the depth scale factor for the sensor is not equal to 1.

    vslam = rgbdvslam(intrinsics,Name=Value) sets properties using one or more name-value arguments. For example, MaxNumPoints=850 sets the maximum number of ORB feature points to extract from each image to 850.
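For illustration, a minimal creation sketch. The intrinsic values and depth scale factor below are placeholders taken from the TUM RGB-D example later on this page, not calibration data for your sensor:

    % Camera intrinsics: focal length [fx fy], principal point [cx cy],
    % and image size [mrows ncols]
    intrinsics = cameraIntrinsics([535.4 539.2],[320.1 247.6],[480 640]);

    % Depth images for this camera store depth in units of 1/5000 of a meter
    depthScaleFactor = 5000;

    % Create the object, tuning ORB extraction with a name-value argument
    vslam = rgbdvslam(intrinsics,depthScaleFactor,MaxNumPoints=850);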


    Input Arguments


    Camera intrinsic parameters, specified as a cameraIntrinsics object.

Depth scale factor, specified as a scalar. The depth scale factor is the conversion factor that relates the depth values of the depth sensor to real-world distances. It is typically expressed in the same units as the depth measurements the sensor provides, such as millimeters, centimeters, or meters, and provides the information necessary to transform the depth measurements into metric scale. Use the depthScaleFactor argument when the value for your sensor is not equal to 1.

    For the world 3-D coordinates (X, Y, Z), where Z is the depth at any pixel coordinate (u, v), Z = P/depthScaleFactor, where P represents the intensity value of the depth image at pixel (u, v).
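For example, a hedged sketch of this conversion, assuming depthImage is a uint16 depth map already in the workspace and the sensor uses a depth scale factor of 5000 (depth stored in units of 1/5000 of a meter):

    % Assumes depthImage is a uint16 depth map, for example:
    % depthImage = readimage(imdsDepth,1);   % hypothetical datastore
    depthScaleFactor = 5000;       % sensor-specific; check your camera
    u = 320; v = 240;              % example pixel coordinates (u,v)
    P = double(depthImage(v,u));   % intensity value at pixel (u,v)
    Z = P/depthScaleFactor;        % metric depth, in meters here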

    Properties


    This property is read-only.

    Depth range of the RGB-D camera, specified as a 2-element vector, in world units. The range specifies the minimum and maximum depth values of the camera, which you can use to filter out invalid depth values in the depth images. You must set this property at object creation.
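For example, a hedged creation sketch that keeps only depth readings between 0.1 and 5 world units, reusing the intrinsics and depthScaleFactor from the creation example; the range values are assumptions for illustration:

    % Filter out depth values outside [0.1, 5] world units
    vslam = rgbdvslam(intrinsics,depthScaleFactor,DepthRange=[0.1 5]);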

    This property is read-only.

Depth scale factor, specified as a scalar. The depth scale factor depends on the depth sensor and is typically expressed in the same units as the depth measurements the sensor provides, such as millimeters, centimeters, or meters. It is the conversion factor that relates the depth values to real-world distances, providing the information necessary to transform the depth measurements into metric scale.

For the world 3-D coordinates (X,Y,Z), where Z represents the depth at any pixel coordinate (u,v), Z = P/depthScaleFactor, where P represents the intensity value of the depth image at pixel (u,v). You must set this property at object creation.

    This property is read-only.

Scale factor for image decomposition, stored as a scalar greater than 1. The scale factor is also referred to as the pyramid decimation ratio. Increasing the value of ScaleFactor reduces the number of pyramid levels and the computation time, but at the cost of tracking performance. Decreasing this value increases the number of pyramid levels, which can improve tracking performance at the cost of computation speed. The scale value at each level of decomposition is ScaleFactor^(level − 1), where level is any value in the range [0, NumLevels − 1]. Given an input image of size M-by-N, the image size at each level of decomposition is Mlevel-by-Nlevel, where:

    Mlevel = M/ScaleFactor^(level − 1)
    Nlevel = N/ScaleFactor^(level − 1)

    You must set this property at object creation.

    Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint64 | uint32

    This property is read-only.

    Number of decomposition levels, specified as a scalar greater than or equal to 1. Increase this value to extract keypoints from the image at more levels of decomposition. Along with the ScaleFactor value, NumLevels controls the number of pyramid levels on which the object evaluates feature points.

    The image size at each decomposition level limits the number of levels at which you can extract keypoints. The image size at a level of decomposition must be at least 63-by-63 for keypoint detection. The maximum level of decomposition is:

    levelmax = floor((log(min(M,N)) − log(63))/log(ScaleFactor)) + 1

    If either the default value or the specified value of NumLevels is greater than levelmax, the object reduces NumLevels to levelmax and returns a warning. You must set this property at object creation.

    Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint64 | uint32
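As a sketch of this limit, assuming a 480-by-640 image and a pyramid decimation ratio of 1.2 (both values are assumptions for illustration):

    % Maximum decomposition level at which a 480-by-640 image still
    % meets the 63-by-63 minimum size for keypoint detection
    M = 480; N = 640;
    ScaleFactor = 1.2;   % assumed pyramid decimation ratio
    levelmax = floor((log(min(M,N)) - log(63))/log(ScaleFactor)) + 1
    % Here levelmax is 12, so the object reduces a larger NumLevels to 12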

    This property is read-only.

    Maximum number of ORB feature points uniformly extracted from each image, specified as a positive integer. Values are typically in the range of [800, 2000], depending on the resolution of the image. When the number of extracted features is less than the value of MaxNumPoints, then the object uses all feature points. You must set this property at object creation.

    This property is read-only.

    Key frame feature point range, stored as a two-element vector of the form [lowerLimit upperLimit]. This property specifies the minimum and maximum numbers of tracked feature points a frame must contain for the object to identify it as a key frame. The TrackFeatureRange and the SkipMaxFrames properties enable you to control the frequency at which frames in the tracking process become key frames. You must set this property at object creation.

    The success of tracking depends on the number of tracked points in the current frame, with one of these results:

• Tracking is lost — The number of tracked feature points in the current frame is less than the lowerLimit set by the TrackFeatureRange property. This indicates that the image does not contain enough features, or that the camera is moving too fast. To add key frames more frequently and improve tracking, you can increase the upperLimit value of the TrackFeatureRange property and decrease the value of the SkipMaxFrames property.

    • Tracking is successful — The object identifies the current frame as a key frame. The number of tracked feature points in the current frame is in the range set by TrackFeatureRange.

• Tracking adds key frames too frequently — The number of tracked feature points in the current frame is greater than the upperLimit set by the TrackFeatureRange property. This indicates that the camera is moving very slowly, which produces an unnecessary number of key frames. To reduce the frequency at which the object adds key frames and improve tracking, you can increase the value of the SkipMaxFrames property.

    For more details, see the addFrame object function.

    This property is read-only.

    Maximum number of skipped frames, stored as a positive integer. When the number of tracked features is consistently greater than the upperLimit set by the TrackFeatureRange property, use the SkipMaxFrames property to control the frequency at which the object adds new key frames. The object identifies the current frame as a key frame when the number of skipped frames since the most recently added key frame equals the value of SkipMaxFrames. You must set this property at object creation.
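As an illustrative tuning sketch, reusing the intrinsics and depthScaleFactor from the creation example; the property values here are assumptions for demonstration, not recommendations:

    % Accept key frames over a wider tracked-feature range, and force a
    % new key frame after at most 10 skipped frames
    vslam = rgbdvslam(intrinsics,depthScaleFactor, ...
        TrackFeatureRange=[30 1000],SkipMaxFrames=10);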

    This property is read-only.

    Minimum number of matched feature points between loop closure key frames, stored as a positive integer. You must set this property at object creation.

Custom bag of features for loop detection, specified as a bagOfFeaturesDBoW object. The bagOfFeaturesDBoW object enables you to create a custom bag of words (BoW) from feature descriptors, with options to use a built-in vocabulary or load a custom one from a specified file.

Progress information display, specified as [], 1, 2, or 3. When the object creates log files, it displays the paths to their locations on the command line.

Verbose value | Display description | Display location
[] or false | Display is turned off | None
1 or true | Stages of vSLAM execution | Command Window
2 | Stages of vSLAM execution, with details on how each frame is processed, such as the artifacts used to initialize the map | Log file in a temporary folder
3 | Stages of vSLAM execution, artifacts used to initialize the map, poses and map points before and after bundle adjustment, and loop closure optimization data | Log file in a temporary folder
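For example, a minimal sketch that prints the stages of vSLAM execution to the Command Window, reusing the intrinsics and depthScaleFactor from the creation example:

    % Display vSLAM progress information while frames are processed
    vslam = rgbdvslam(intrinsics,depthScaleFactor,Verbose=1);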

    Object Functions

addFrame - Add pair of color and depth images to RGB-D visual SLAM object
hasNewKeyFrame - Check if new key frame added in RGB-D visual SLAM object
checkStatus - Check status of RGB-D visual SLAM object
isDone - End-of-processing status for RGB-D visual SLAM object
mapPoints - Build 3-D map of world points from RGB-D visual SLAM object
poses - Absolute camera poses of RGB-D visual SLAM key frames
plot - Plot 3-D map points and estimated camera trajectory in RGB-D visual SLAM
reset - Reset RGB-D visual SLAM object

    Examples


    Perform RGB-D visual simultaneous localization and mapping (vSLAM) using the data from the TUM RGB-D Benchmark. You can download the data to a temporary directory using a web browser or by running this code:

    baseDownloadURL = "https://vision.in.tum.de/rgbd/dataset/freiburg3/rgbd_dataset_freiburg3_long_office_household.tgz"; 
    dataFolder = fullfile(tempdir,"tum_rgbd_dataset",filesep); 
    options = weboptions(Timeout=Inf);
    tgzFileName = dataFolder+"fr3_office.tgz";
    folderExists = exist(dataFolder,"dir");
    
    % Create a folder in a temporary directory to save the downloaded file
    if ~folderExists  
        mkdir(dataFolder) 
        disp("Downloading fr3_office.tgz (1.38 GB). This download can take a few minutes.") 
        websave(tgzFileName,baseDownloadURL,options); 
        
        % Extract contents of the downloaded file
        disp("Extracting fr3_office.tgz (1.38 GB) ...") 
        untar(tgzFileName,dataFolder); 
    end

Create two imageDatastore objects: one to store the color images, and the other to store the depth images.

    colorImageFolder = dataFolder+"rgbd_dataset_freiburg3_long_office_household/rgb/";
    depthImageFolder = dataFolder+"rgbd_dataset_freiburg3_long_office_household/depth/";
    
    imdsColor = imageDatastore(colorImageFolder);
    imdsDepth = imageDatastore(depthImageFolder);

Select the synchronized pairs of color and depth images.

    data = load("rgbDepthPairs.mat");
imdsColor = subset(imdsColor,data.indexPairs(:,1));
imdsDepth = subset(imdsDepth,data.indexPairs(:,2));

Specify the camera intrinsic parameters and the depth scale factor, and use them to create an RGB-D visual SLAM object.

    intrinsics = cameraIntrinsics([535.4 539.2],[320.1 247.6],[480 640]);
    depthScaleFactor = 5000;
    vslam = rgbdvslam(intrinsics,depthScaleFactor);

    Process each pair of color and depth images, and visualize the camera poses and 3-D map points.

    for i = 1:numel(imdsColor.Files)
        colorImage = readimage(imdsColor,i);
        depthImage = readimage(imdsDepth,i);
        addFrame(vslam,colorImage,depthImage);
    
        if hasNewKeyFrame(vslam)
            % Query 3-D map points and camera poses
            xyzPoints = mapPoints(vslam);
            [camPoses,viewIds] = poses(vslam);
    
            % Display 3-D map points and camera trajectory
            plot(vslam);
        end
    
        % Get current status of system
        status = checkStatus(vslam);
        
        % Stop adding frames when tracking is lost
        if status == uint8(0)
            break
        end
    end 

Plot until all the added frames have been processed, and then reset the system.

    while ~isDone(vslam)
        plot(vslam);
    end
    reset(vslam);

    References

    [1] Mur-Artal, Raul, J. M. M. Montiel, and Juan D. Tardos. “ORB-SLAM: A Versatile and Accurate Monocular SLAM System.” IEEE Transactions on Robotics 31, no. 5 (October 2015): 1147–63. https://doi.org/10.1109/TRO.2015.2463671.


    Version History

    Introduced in R2024a
