Automate Ground Truth Labeling Across Multiple Signals

This example shows how to automate the labeling of multiple signals simultaneously by using the Ground Truth Labeler app and the AutomationAlgorithm interface. The automation algorithm used in this example estimates the label positions of vehicles in point cloud frames based on the label positions of vehicles in corresponding image frames using camera-to-lidar calibration parameters.

The Ground Truth Labeler App

Good ground truth data is crucial for developing driving algorithms and evaluating their performance. However, creating a rich and diverse set of annotated driving data requires significant time and resources. The Ground Truth Labeler app makes this process efficient. You can use this app as a fully manual annotation tool to mark lane boundaries, vehicle bounding boxes, and other objects of interest for a vision system. The app also provides a framework for creating algorithms that extend and automate the labeling process. You can use these algorithms to quickly label entire data sets, and then follow up with a shorter, more efficient manual verification step. You can also edit the results of the automation step to account for challenging scenarios that the automation algorithm might have missed.

This example describes creating an algorithm that can be used in the Ground Truth Labeler app to automatically detect vehicles in the image and estimate their positions in the corresponding point cloud using camera-to-lidar calibration parameters.

Detect Vehicles Using ACF Vehicle Detector

To detect the vehicles in images, the automation algorithm uses a pretrained aggregate channel features (ACF) vehicle detector, vehicleDetectorACF. Preview how the algorithm works by loading a sample image and the ACF vehicle detector, detecting vehicles in the image, and inserting 2-D bounding boxes around the vehicles in the image.

% Load the data from the MAT file and extract the image.
data = load(fullfile(toolboxdir('lidar'),'lidardata','lcc','bboxGT.mat'));
I = data.im;

% Load the pretrained detector for vehicles.
detector = vehicleDetectorACF('front-rear-view');

% Detect vehicles and show the bounding boxes.
[imBboxes,~] = detect(detector, I);
Iout = insertShape(I,'rectangle',imBboxes,'LineWidth',4);
figure
imshow(Iout)
title('Detected Vehicles')

Figure: Detected Vehicles

If you have camera calibration information available, you can improve this detector by filtering out false positives from the detections. The Visual Perception Using Monocular Camera example describes how to create a pretrained vehicle detector and configure it to detect vehicle bounding boxes using the calibrated monocular camera configuration.
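
As a rough sketch of that configuration step, the following code wraps the pretrained detector with a monoCamera sensor by using configureDetectorMonoCamera. The focal length, principal point, image size, mounting height, and pitch are placeholder values assumed for illustration only. They do not describe the camera used in this example, so substitute your own calibration.

```matlab
% Placeholder camera intrinsics (assumed values, not this example's camera).
focalLength    = [800 800];   % [fx fy] in pixels
principalPoint = [320 240];   % [cx cy] in pixels
imageSize      = [480 640];   % [rows columns]
camIntrinsics  = cameraIntrinsics(focalLength,principalPoint,imageSize);

% Placeholder mounting: camera 1.5 m above the ground, pitched 1 degree down.
sensor = monoCamera(camIntrinsics,1.5,'Pitch',1);

% Limit detections to objects between 1.5 and 2.5 meters wide.
detector = vehicleDetectorACF('front-rear-view');
vehicleWidth = [1.5 2.5];
monoDetector = configureDetectorMonoCamera(detector,sensor,vehicleWidth);

% Use the configured detector in the same way as the original one:
% [bboxes,scores] = detect(monoDetector,I);
```

The configured detector only searches for vehicles whose image size is consistent with the assumed camera geometry, which is why inaccurate calibration values can hide valid detections.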

Estimate 3-D Bounding Box for Vehicles in Point Cloud

To estimate the positions of vehicles in the point cloud frames from the corresponding detected vehicles in the image frames, the algorithm uses the bboxCameraToLidar (Lidar Toolbox) function. This function uses lidar-to-camera calibration parameters to estimate 3-D bounding boxes based on 2-D bounding boxes. To estimate the bounding boxes, the function takes as input the intrinsic camera parameters, specified as a cameraIntrinsics object, and a camera-to-lidar rigid transformation, specified as a rigid3d object.

Preview how the algorithm works by loading the point cloud corresponding to the image, estimating the 3-D bounding boxes of vehicles in the point cloud, and inserting the bounding boxes around the vehicles in the point cloud.

% Extract the point cloud.   
ptCloud = data.pc;

% Extract the intrinsic camera parameters.
intrinsics = data.cameraParams;

% Extract the camera-to-lidar rigid transformation.
tform = data.camToLidar;
               
% Estimate the bounding boxes in the point cloud.
pcBboxes = bboxCameraToLidar(imBboxes, ptCloud, intrinsics, tform);

% Display bounding boxes in the point cloud.
figure
ax = pcshow(ptCloud.Location);
showShape('cuboid',pcBboxes,'Parent',ax,'Opacity',0.1,'Color',[0.06 1.00 1.00],'LineWidth',0.5)
hold on
zoom(ax,1.5)
title('Estimated Bounding Box in Point Cloud')
hold off

Figure: Estimated Bounding Box in Point Cloud
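
The geometric idea that links a 2-D box to lidar points can be illustrated with plain matrix arithmetic. The sketch below is not the implementation of bboxCameraToLidar. It only shows pinhole projection of lidar points into an image and a test for which points fall inside a 2-D box. The intrinsic matrix, the lidar-to-camera rotation and translation, the points, and the box are all made-up values.

```matlab
% Hypothetical pinhole intrinsic matrix (fx = fy = 800, cx = 320, cy = 240).
K = [800 0 320; 0 800 240; 0 0 1];

% Hypothetical lidar-to-camera transform, pCam = R*pLidar + t.
% Assumed axes: lidar x forward, y left, z up; camera x right, y down, z forward.
R = [0 -1 0; 0 0 -1; 1 0 0];
t = [0; 0; 0];

% Synthetic lidar points, one per row: [x y z] in meters.
ptsLidar = [10 0 0; 10 2 0; 10 -5 0; -10 0 0];

% Transform the points into the camera frame.
ptsCam = (R*ptsLidar.' + t).';

% Points behind the camera must be rejected before projecting.
inFront = ptsCam(:,3) > 0;

% Project onto the image plane and normalize by depth.
uvw = (K*ptsCam.').';
uv = uvw(:,1:2) ./ uvw(:,3);

% Hypothetical 2-D bounding box in [x y width height] format.
bbox = [100 200 300 100];
inBox = inFront & ...
    uv(:,1) >= bbox(1) & uv(:,1) <= bbox(1) + bbox(3) & ...
    uv(:,2) >= bbox(2) & uv(:,2) <= bbox(2) + bbox(4);

% Lidar points that are candidates for the 3-D box.
candidates = ptsLidar(inBox,:);
```

With these values, the first two points project inside the box, the third projects outside it, and the fourth is discarded because it lies behind the camera even though its projected pixel coordinates fall inside the box.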

Integrate Multisignal Vehicle Detector Algorithm Into Ground Truth Labeler

To incorporate the multisignal vehicle detector algorithm into the automation workflow of the Ground Truth Labeler app, construct a class that inherits from the abstract base class, vision.labeler.AutomationAlgorithm. This base class defines properties and signatures for methods that the app uses for configuring and running the custom algorithm. The Ground Truth Labeler app provides a convenient way to obtain an initial automation class template. For details, see Create Automation Algorithm for Labeling. The MultiSignalVehicleDetector class is based on this template and provides you with a ready-to-use automation class for vehicle detection in image and vehicle bounding box estimation in the point cloud. The comments of the class outline the basic steps needed to implement each API call.

Step 1 contains properties that define the name and description of the algorithm and the directions for using the algorithm.

    % ----------------------------------------------------------------------
    % Step 1: Define the properties required for describing the algorithm,
    % which include Name, Description, and UserDirections.
    properties(Constant)
        
        % Name Algorithm name
        %   Character vector specifying the name of the algorithm.
        Name = 'Multisignal Vehicle Detector';
        
        % Description Algorithm description
        %   Character vector specifying the short description of the algorithm.
        Description = ['Detect vehicles using ACF Vehicle Detector in ' ...
            'image and estimate them in point cloud.'];
        
        % UserDirections Algorithm usage directions
        %   Cell array of character vectors specifying directions for
        %   algorithm users to follow.
        UserDirections = {['Select one of the rectangle ROI labels to ' ...
            'label objects as Vehicle.'], ...
            ['Click Settings and on the Lidar Camera Calibration ' ...
            'Parameters tab, load the cameraIntrinsics and rigid3d ' ...
            'objects from the workspace.'], ...
            ['Specify additional parameters under Settings.'], ...
            ['Click Run to detect vehicles in each image and point cloud.'], ...
            ['Review automated labels manually. You can modify, delete ', ...
            'and add new labels.'], ...
            ['If you are not satisfied with the results, click Undo ' ...
            'Run. Click Settings to modify algorithm settings and click ', ...
            'Run again.'], ...
            ['When you are satisfied with the results, click Accept and ', ...
            'return to manual labeling.']};
    end

Step 2 contains the custom properties for the core algorithm.

    % ---------------------------------------------------------------------
    % Step 2: Define properties to be used to manage algorithm execution.
    properties
        
        % SelectedLabelName Selected label name
        %   Name of the selected label. Vehicles detected by the algorithm will
        %   be assigned this variable name.
        SelectedLabelName
        
        % Detector Detector
        %   Pretrained vehicle detector, an object of class
        %   acfObjectDetector.
        Detector
        
        % VehicleModelName Vehicle detector model name
        %   Name of pretrained vehicle detector model.
        VehicleModelName = 'full-view';
        
        % OverlapThreshold Overlap threshold
        %   Threshold value used to eliminate overlapping bounding boxes
        %   around the reference bounding box, between 0 and 1. The
        %   bounding box overlap ratio denominator, 'RatioType', is set to
        %   'Min'.
        OverlapThreshold = 0.45;
        
        % ScoreThreshold Classification score threshold
        %   Threshold value used to reject detections with low detection
        %   scores.
        ScoreThreshold = 20;
        
        % ConfigureDetector Detection configuration flag
        %   Boolean value that determines whether the detector is 
        %   configured using monoCamera sensor.
        ConfigureDetector = false;
        
        % SensorObj monoCamera sensor
        %   Monocular camera sensor object, monoCamera, used to configure
        %   the detector. A configured detector runs faster and can 
        %   potentially result in better detections.
        SensorObj = [];
        
        % SensorStr monoCamera sensor variable name
        %   Character vector specifying the monoCamera object variable name 
        %   used to configure the detector.
        SensorStr = '';
        
        % VehicleWidth Vehicle width
        %   Vehicle width used to configure the detector, specified as
        %   [minWidth, maxWidth], which describes the approximate width of the
        %   object in world units.
        VehicleWidth = [1.5 2.5];
        
        % VehicleLength Vehicle length
        %   Vehicle length used to configure the detector, specified as
        %   [minLength, maxLength] vector, which describes the approximate
        %   length of the object in world units.
        VehicleLength = [];  
        
        % IntrinsicsObj Camera intrinsics
        %   cameraIntrinsics object, which represents a projective
        %   transformation from camera to image coordinates.
        IntrinsicsObj = [];
        
        % IntrinsicsStr cameraIntrinsics variable name
        %   cameraIntrinsics object variable name.
        IntrinsicsStr = '';
        
        % ExtrinsicsObj Camera-to-lidar rigid transformation
        %   rigid3d object representing the 3-D rigid geometric transformation 
        %   from the camera to the lidar.
        ExtrinsicsObj = [];
        
        % ExtrinsicsStr rigid3d variable name
        %   Camera-to-lidar rigid3d object variable name.
        ExtrinsicsStr = '';
        
        % ClusterThreshold Clustering threshold for two adjacent points
        %   Threshold specifying the maximum distance between two adjacent points
        %   for those points to belong to the same cluster.
        ClusterThreshold = 1;
        
    end
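
To make the roles of OverlapThreshold and ScoreThreshold concrete, the following sketch applies both to a few synthetic detections. It shows one plausible way to combine the two thresholds by using selectStrongestBbox. The detection helper inside the class is not shown in this example and might differ. The boxes and scores are made up.

```matlab
% Synthetic detections in [x y width height] format with made-up scores.
bboxes = [ 10  10 100 100;    % strong detection
           20  20 100 100;    % heavily overlaps the first box
          300 300  50  50];   % isolated but low scoring
scores = [50; 30; 10];

overlapThreshold = 0.45;   % same value as the OverlapThreshold default
scoreThreshold   = 20;     % same value as the ScoreThreshold default

% Suppress overlapping boxes, keeping the highest-scoring box in each
% group. With 'RatioType' set to 'Min', the overlap ratio is the
% intersection area divided by the area of the smaller box.
[keptBboxes,keptScores] = selectStrongestBbox(bboxes,scores, ...
    'RatioType','Min','OverlapThreshold',overlapThreshold);

% Reject the remaining low-confidence detections.
selectedBboxes = keptBboxes(keptScores > scoreThreshold,:);
```

Under these assumptions, the second box is suppressed because it overlaps the first, and the third box is rejected by the score threshold, so only the first box is expected to remain.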

Step 3 deals with function definitions.

The first function, supportsMultisignalAutomation, checks that the algorithm supports multiple signals. For the multisignal vehicle detector, you load both image and point cloud signals, so success is set to true.

        function success = supportsMultisignalAutomation(~)
            % Supports MultiSignal.
            success = true;
        end

The next function, checkSignalType, checks that only signals of the appropriate type are supported for automation. The multisignal vehicle detector must support signals of type Image and PointCloud, so this version of the function checks for both signal types.

        
        function isValid = checkSignalType(signalType)
            % Only video/image sequence and point cloud signal data 
            % is valid. 
            isValid = any(signalType == vision.labeler.loading.SignalType.Image) && ...
               any(signalType == vision.labeler.loading.SignalType.PointCloud);  
        end

The next function, checkLabelDefinition, checks that only labels of the appropriate type are enabled for automation. For vehicle detection in image and point cloud signals, you check that only labels of type Rectangle/Cuboid are enabled, so this version of the function checks the Type of the labels.

        function isValid = checkLabelDefinition(~, labelDef)            
            % Only Rectangular/Cuboid ROI Label definitions are valid for the
            % Vehicle Detector.
            isValid = (labelDef.Type == labelType.Cuboid || labelDef.Type == labelType.Rectangle);
        end

The next function, checkSetup, checks that an ROI label definition is selected for automation.

        function isReady = checkSetup(algObj, ~)
            % Is there one selected ROI Label definition to automate?
            isReady = ~isempty(algObj.SelectedLabelDefinitions);
        end

Next, the settingsDialog function obtains and modifies the properties defined in step 2. This API call lets you create a dialog box that opens when a user clicks the Settings button in the Automate tab. To create this dialog box, use the dialog function to create a modal window to ask the user to specify the cameraIntrinsics object and rigid3d object. The multiSignalVehicleDetectorSettings method contains the code for settings and also adds input validation steps.

        function settingsDialog(algObj)
            % Invoke dialog box to input camera intrinsics and
            % camera-to-lidar rigid transformation and options for choosing
            % a pretrained model, overlap threshold, detection score
            % threshold, and clustering threshold. Optionally, input a
            % calibrated monoCamera sensor to configure the detector.
            multiSignalVehicleDetectorSettings(algObj);
        end
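
The code of multiSignalVehicleDetectorSettings is not listed in this example. As a hedged illustration of what input validation for such settings could look like, the sketch below checks a few candidate values with validateattributes. The variable names and the specific constraints are assumptions, not the checks that the shipped method performs.

```matlab
% Hypothetical values, as a user might enter them in the dialog box.
overlapThreshold = 0.45;
scoreThreshold   = 20;
clusterThreshold = 1;
vehicleWidth     = [1.5 2.5];

% The overlap threshold must be a real scalar between 0 and 1.
validateattributes(overlapThreshold,{'numeric'}, ...
    {'real','scalar','>=',0,'<=',1},'settings','OverlapThreshold');

% The score threshold must be a finite real scalar.
validateattributes(scoreThreshold,{'numeric'}, ...
    {'real','scalar','finite'},'settings','ScoreThreshold');

% The clustering distance must be a positive real scalar.
validateattributes(clusterThreshold,{'numeric'}, ...
    {'real','scalar','positive'},'settings','ClusterThreshold');

% The width range must be two positive values in increasing order.
validateattributes(vehicleWidth,{'numeric'}, ...
    {'real','vector','numel',2,'positive','increasing'}, ...
    'settings','VehicleWidth');

% An out-of-range value raises an error that a dialog box could report.
try
    validateattributes(1.5,{'numeric'},{'scalar','<=',1}, ...
        'settings','OverlapThreshold');
    rejected = false;
catch
    rejected = true;
end
```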

Step 4 specifies the execution functions. The initialize function populates the initial algorithm state based on the existing labels in the app. In the MultiSignalVehicleDetector class, the initialize function has been customized to store the name of the selected label definition and to load the pretrained ACF vehicle detector and save it to the Detector property.

        function initialize(algObj, ~)
            
            % Store the name of the selected label definition. Use this
            % name to label the detected vehicles.
            algObj.SelectedLabelName = algObj.SelectedLabelDefinitions.Name;
            
            % Initialize the vehicle detector with a pretrained model.
            algObj.Detector = vehicleDetectorACF(algObj.VehicleModelName);
        end

Next, the run function defines the core vehicle detection algorithm of this automation class. The run function is called for each frame of the image and point cloud sequence and expects the automation class to return a set of labels. The run function in MultiSignalVehicleDetector contains the logic described previously for detecting 2-D vehicle bounding boxes in image frames and estimating 3-D vehicle bounding boxes in point cloud frames.

        function autoLabels = run(algObj, I)
            % autoLabels is a cell array with one element per signal.
            autoLabels = cell(size(I,1),1);
            
            % Get the index of Image and PointCloud frames.
            if isa(I{1,1},"pointCloud")
                pcIdx = 1;
                imIdx = 2;
            else
                imIdx = 1;
                pcIdx = 2;
            end
            
            % Detect bounding boxes on image frame.
            selectedBboxes = detectVehicle(algObj, I{imIdx,1});
            
            % Estimate bounding boxes on point cloud frame.
            if ~isempty(selectedBboxes)
                
                % Store labels from the image. 
                imageLabels = struct('Type', labelType.Rectangle, ...
                'Name', algObj.SelectedLabelDefinitions.Name, ...
                'Position', selectedBboxes);
                autoLabels{imIdx, 1} = imageLabels;
                
                % Remove the ground plane for the point cloud.
                groundPtsIndex = segmentGroundFromLidarData(I{pcIdx,1}, ...
                    "ElevationAngleDelta", 15, "InitialElevationAngle", 10);

                nonGroundPts = select(I{pcIdx,1}, ~groundPtsIndex);
                
                % Predict 3-D bounding boxes.
                pcBboxes = bboxCameraToLidar(selectedBboxes, nonGroundPts, algObj.IntrinsicsObj, ...
                    algObj.ExtrinsicsObj, "ClusterThreshold", algObj.ClusterThreshold);
                
                % Store labels from the point cloud.
                if(~isempty(pcBboxes))
                    pcLabels = struct('Type', labelType.Cuboid,...
                    'Name', algObj.SelectedLabelDefinitions.Name,...
                    'Position', pcBboxes);
                    autoLabels{pcIdx, 1} = pcLabels;
                else
                    autoLabels{pcIdx, 1} = {};
                end
            else
                autoLabels{imIdx, 1} = {};
                autoLabels{pcIdx, 1} = {};
            end                           
        end
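
To experiment with this logic outside the app, you can mimic a single call to run in a script. The following sketch is a simplified stand-in, not the class method: it uses the sample frame pair from bboxGT.mat, skips the ground removal step, and assumes a label name of Vehicle and a clustering threshold of 1.

```matlab
% Load the sample image, point cloud, and calibration data.
data = load(fullfile(toolboxdir('lidar'),'lidardata','lcc','bboxGT.mat'));
frames = {data.im; data.pc};   % one row per signal, like the input to run
labelName = 'Vehicle';         % assumed label name

% Preallocate one output cell per signal.
autoLabels = cell(size(frames,1),1);

% Find which row holds the point cloud and which holds the image.
isPc = cellfun(@(f) isa(f,'pointCloud'),frames);
imIdx = find(~isPc,1);
pcIdx = find(isPc,1);

% Detect 2-D boxes in the image.
detector = vehicleDetectorACF('front-rear-view');
imBboxes = detect(detector,frames{imIdx});

if ~isempty(imBboxes)
    autoLabels{imIdx} = struct('Type',labelType.Rectangle, ...
        'Name',labelName,'Position',imBboxes);

    % Estimate the corresponding 3-D boxes in the point cloud.
    pcBboxes = bboxCameraToLidar(imBboxes,frames{pcIdx}, ...
        data.cameraParams,data.camToLidar,'ClusterThreshold',1);
    if ~isempty(pcBboxes)
        autoLabels{pcIdx} = struct('Type',labelType.Cuboid, ...
            'Name',labelName,'Position',pcBboxes);
    end
end
```

Each nonempty cell holds a structure whose Position field contains one row per detected vehicle, which matches the layout that the run function returns to the app.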

Finally, the terminate function handles any cleanup or tear-down required after the automation is done. This algorithm does not require any cleanup, so the function is empty.

        function terminate(~)
        end

Use Multisignal Vehicle Detector Automation Class in App

The properties and methods described in the previous section are implemented in the MultiSignalVehicleDetector automation algorithm class file. To use this class in the app:

Create the folder structure +vision/+labeler required under the current folder, and copy the automation class into it.

Note: The MultiSignalVehicleDetector.m file must be in the same folder where you create the +vision/+labeler folder structure.

    mkdir('+vision/+labeler');
    copyfile('MultiSignalVehicleDetector.m','+vision/+labeler');
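
Before opening the app, it can help to confirm that MATLAB resolves the class from the package folder. The sketch below is one way to check, using meta.class.fromName, which returns an empty object when the class cannot be found. The messages are illustrative.

```matlab
% Look up the class by its package-qualified name.
className = 'vision.labeler.MultiSignalVehicleDetector';
mc = meta.class.fromName(className);

if isempty(mc)
    disp(['Class not found: ' className ...
        '. Check that the current folder contains +vision/+labeler.']);
else
    % List the direct superclasses; the automation base class should appear.
    superNames = strjoin({mc.SuperclassList.Name},', ');
    disp(['Found ' className ' with superclasses: ' superNames]);
end
```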

Download the point cloud sequence (PCD) and image sequence. For illustration purposes, this example uses WPI lidar data collected on a highway from an Ouster OS1 lidar sensor and WPI image data from a front-facing camera mounted on an ego vehicle. Execute the following code block to download and save lidar and image data in a temporary folder. Depending on your Internet connection, the download process can take some time. The code suspends MATLAB® execution until the download process is complete. Alternatively, you can download the data set to your local disk using your web browser and extract the file.

Download the image sequence to a temporary location.

    imageURL = 'https://www.mathworks.com/supportfiles/lidar/data/WPI_ImageData.tar.gz';
    imageDataFolder = fullfile(tempdir, 'WPI_ImageData',filesep);
    imageDataTarFile = imageDataFolder + "WPI_ImageData.tar.gz";

    if ~exist(imageDataFolder,'dir')
        mkdir(imageDataFolder)
    end

    if ~exist(imageDataTarFile, 'file')
        disp('Downloading WPI Image driving data (225 MB)...');
        websave(imageDataTarFile, imageURL);
        untar(imageDataTarFile, imageDataFolder);
    end
    
    % Check if image tar.gz file is downloaded, but not uncompressed.
    if ~exist(fullfile(imageDataFolder,'imageData'),'dir')
        untar(imageDataTarFile, imageDataFolder)
    end

For illustration purposes, this example uses only a subset of the WPI image sequence, from frames 920–940. To load the subset of images into the app, copy the images into a folder.

    % Create new folder and copy the images.
    imDataFolder = imageDataFolder + "imageDataSequence";
    if ~exist(imDataFolder,'dir')
        mkdir(imDataFolder);
    end

    for i = 920 : 940
        filename = strcat(num2str(i,'%06.0f'),'.jpg');
        source = fullfile(imageDataFolder,'imageData',filename);
        destination = fullfile(imageDataFolder,'imageDataSequence',filename);
        copyfile(source,destination)
    end

Download the point cloud sequence to a temporary location.

    lidarURL = 'https://www.mathworks.com/supportfiles/lidar/data/WPI_LidarData.tar.gz';
    lidarDataFolder = fullfile(tempdir,'WPI_LidarData',filesep);        
    lidarDataTarFile = lidarDataFolder + "WPI_LidarData.tar.gz";

    if ~exist(lidarDataFolder,'dir')
        mkdir(lidarDataFolder)
    end

    if ~exist(lidarDataTarFile, 'file')       
        disp('Downloading WPI Lidar driving data (760 MB)...');
        websave(lidarDataTarFile,lidarURL);
        untar(lidarDataTarFile,lidarDataFolder);
    end
    
    % Check if lidar tar.gz file is downloaded, but not uncompressed.
    if ~exist(fullfile(lidarDataFolder,'WPI_LidarData.mat'),'file')
        untar(lidarDataTarFile,lidarDataFolder);
    end

The Ground Truth Labeler app supports the loading of point cloud sequences composed of PCD or PLY files. Save the downloaded point cloud data to PCD files. For illustration purposes, in this example, you save only a subset of the WPI point cloud data, from frames 920–940.

    % Load downloaded lidar data into the workspace.
    load(fullfile(lidarDataFolder,'WPI_LidarData.mat'),'lidarData');
    lidarData = reshape(lidarData,size(lidarData,2),1);
    
    % Create new folder and write lidar data to PCD files.
    pcdDataFolder = lidarDataFolder + "lidarDataSequence";
    if ~exist(pcdDataFolder, 'dir')
        mkdir(fullfile(lidarDataFolder,'lidarDataSequence'));
    end

    disp('Saving WPI Lidar driving data to PCD files ...');
    for i = 920:940
        filename = strcat(fullfile(lidarDataFolder,'lidarDataSequence',filesep), ...
            num2str(i,'%06.0f'),'.pcd');
        pcwrite(lidarData{i},filename);
    end
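
Because the algorithm pairs each image with a point cloud, it is worth confirming that both folders hold the same frames before loading them. The following sketch compares the file names in the two sequence folders. It assumes the folder locations used above. Frames 920 through 940 should give 21 files in each folder.

```matlab
% Folder locations assumed from the preceding steps.
imageDir      = fullfile(tempdir,'WPI_ImageData','imageDataSequence');
pointCloudDir = fullfile(tempdir,'WPI_LidarData','lidarDataSequence');

% List the files written to each folder.
imageFiles = dir(fullfile(imageDir,'*.jpg'));
pcdFiles   = dir(fullfile(pointCloudDir,'*.pcd'));

% Strip the extensions so that frame names can be compared across folders.
[~,imageNames] = cellfun(@fileparts,{imageFiles.name},'UniformOutput',false);
[~,pcdNames]   = cellfun(@fileparts,{pcdFiles.name},'UniformOutput',false);

fprintf('%d images, %d point clouds\n',numel(imageNames),numel(pcdNames));

% Report any frame that is present in only one of the two folders.
unmatched = setxor(imageNames,pcdNames);
if isempty(unmatched)
    disp('Every frame has both an image and a point cloud.');
else
    disp('Frames missing a counterpart:');
    disp(unmatched);
end
```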

Calibration information is expected to be in the form of intrinsic and extrinsic (rigid transformation) parameters as mentioned in Lidar and Camera Calibration (Lidar Toolbox). Load camera intrinsics, which are stored in a cameraIntrinsics object, and the camera-to-lidar rigid transformation, which is stored in a rigid3d object, to the workspace. The WPI data in this example is calibrated and the intrinsic and extrinsic (camera-to-lidar transformation) parameters are saved in the MAT file.

    data = load(fullfile(toolboxdir('lidar'),'lidardata','lcc','bboxGT.mat'));
    cameraParams = data.cameraParams;
    camToLidar = data.camToLidar;
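
A quick sanity check of the loaded calibration can catch problems before you import the variables into the app. The sketch below assumes the object types described above and guards each property access accordingly. The specific checks are illustrative, not required by the app.

```matlab
% Load the calibration data and the sample image.
data = load(fullfile(toolboxdir('lidar'),'lidardata','lcc','bboxGT.mat'));
cameraParams = data.cameraParams;
camToLidar = data.camToLidar;

% Confirm that the variables have the expected object types.
isIntrinsics = isa(cameraParams,'cameraIntrinsics');
isRigid = isa(camToLidar,'rigid3d');
fprintf('cameraIntrinsics: %d, rigid3d: %d\n',isIntrinsics,isRigid);

if isIntrinsics
    % The intrinsics should describe images of the same size as the data.
    sizeMatches = isequal(cameraParams.ImageSize, ...
        [size(data.im,1) size(data.im,2)]);
    fprintf('Image size matches intrinsics: %d\n',sizeMatches);
end

if isRigid
    % A rigid rotation is orthonormal, so R*R.' should be close to identity.
    R = camToLidar.Rotation;
    rotationError = norm(R*R.' - eye(3));
    fprintf('Rotation orthonormality error: %g\n',rotationError);
    disp(camToLidar.Translation);
end
```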

Open the Ground Truth Labeler app.

    imageDir = fullfile(tempdir, 'WPI_ImageData', 'imageDataSequence');
    pointCloudDir = fullfile(tempdir, 'WPI_LidarData', 'lidarDataSequence');

    groundTruthLabeler

On the app toolstrip, select Import and then Add Signals. In the Add/Remove Signal window, load the image sequence.

Set Source Type to Image Sequence.
Browse for the image sequence folder, which is at the location specified by the imageDir variable.
Use the default timestamps and click Add Source. The image sequence folder, imageDataSequence, is added to the signal source table.

On the app toolstrip, select Import and then Add Signals. In the Add/Remove Signal window, load the point cloud sequence.

Set Source Type to Point Cloud Sequence.
Browse for the point cloud sequence folder, which is at the location specified by the pointCloudDir variable.
Use the default timestamps and click Add Source. The point cloud sequence folder, lidarDataSequence, is added to the signal source table.

Click OK to import the signals into the app. To view the signals side by side, select the Visualization tab, click Grid in the Layout section, and display the signals in a 1-by-2 grid.

In the Label Definition section on the app toolstrip, under the Ground Truth Labeler tab, click Add Label, select Rectangle/Cuboid from the drop-down list, and define an ROI label named Vehicle. Optionally, select a color, and then click OK.

Select both signals for automation. On the app toolstrip, under the Ground Truth Labeler tab, in the Automate Labeling section, click Select Algorithm and then Select Signals, and select both signals. Click OK.

Under Select Algorithm, select Refresh list. Then, select Algorithm and then Multisignal Vehicle Detector. If you do not see this option, verify that the current working folder has a folder called +vision/+labeler, with a file named MultiSignalVehicleDetector.m in it.

Click Automate. The app opens an automation session for the selected signals and displays directions for using the algorithm.

Load the intrinsic camera parameters into the automation session.

On the Automate tab, click Settings.
On the Lidar-to-Camera Calibration Parameters tab, click Import camera intrinsics from workspace.
Import the intrinsic camera parameters, cameraParams, from the MATLAB workspace. Click OK.

Load the camera-to-lidar transformation into the automation session.

On the Lidar-to-Camera Calibration Parameters tab, click Import camera-to-lidar transformation from workspace.
Import the transformation, camToLidar, from the MATLAB workspace. Click OK.

Modify additional vehicle detector settings as needed and click OK. Then, on the Automate tab, click Run. The created algorithm executes on each frame of the sequence and detects vehicles by using the Vehicle label type. After the app completes the automation run, use the slider or arrow keys to scroll through the sequence to locate frames where the automation algorithm labeled incorrectly. Manually tweak the results by adjusting the detected bounding boxes or adding new bounding boxes.

Once you are satisfied with the detected vehicle bounding boxes for the entire sequence, click Accept. You can then continue to manually adjust labels or export the labeled ground truth to the MATLAB workspace.

You can use the concepts described in this example to create your own custom multisignal automation algorithms and extend the functionality of the app.
