ROILabelData

Ground truth data for ROI labels

Since R2020a

Description

The ROILabelData object stores ground truth data for region of interest (ROI) label definitions for each signal in a groundTruthMultisignal object.

Creation

When you export a groundTruthMultisignal object from a Ground Truth Labeler app session, the ROILabelData property of the exported object stores the ROI labels as an ROILabelData object. To create an ROILabelData object programmatically, use the vision.labeler.labeldata.ROILabelData function (described here).

Description

roiLabelData = vision.labeler.labeldata.ROILabelData(signalNames,labelData) creates an object containing ROI label data for multiple signals. The created object, roiLabelData, contains properties with the signal names listed in signalNames. These properties store the corresponding ROI label data specified by labelData.

Input Arguments

signalNames

Signal names, specified as a string array. Specify the names of all signals present in the groundTruthMultisignal object you are creating. You can get the signal names from an existing groundTruthMultisignal object by accessing the DataSource property of that object. Use this command, replacing gTruth with the name of your groundTruthMultisignal object variable.

gTruth.DataSource.SignalName

In an exported groundTruthMultisignal object, the ROILabelData object contains a label data property for each signal, even if some signals do not have ROI label data.

The properties of the created ROILabelData object have the names specified by signalNames.

Example: ["video_01_city_c2s_fcw_10s" "lidarSequence"]

labelData

ROI label data for each signal, specified as a cell array of timetables. Each timetable in the cell array contains data for the signal in the corresponding position of the signalNames input. The ROILabelData object stores each timetable in a property that has the same name as that signal.

The timetable format for each signal depends on data from the groundTruthMultisignal object that you exported or are creating.

Each timetable contains one column per label definition stored in the LabelDefinitions property of the groundTruthMultisignal object. Label definitions that the signal type does not support are excluded. For example, suppose you define a Line ROI label named 'lane'. The timetable for a lidar point cloud signal does not include a lane column, because these signals do not support Line ROI labels. In the DataSource property of the groundTruthMultisignal object, the SignalType property of each data source lists the valid signal types.

The height of the timetable is defined by the number of timestamps in the signal. In the DataSource property of the groundTruthMultisignal object, the Timestamp property of each data source lists the signal timestamps.

For each label definition, all ROI labels marked at a given timestamp are combined into a single cell in the table. Consider the ROI label data for a video signal stored in a groundTruthMultisignal object, gTruth. At each timestamp, car contains three labels, truck contains one label, and lane contains two labels.

gTruth.ROILabelData.video_01_city_c2s_fcw_10s
ans =

  5×3 timetable

      Time           car            truck            lane    
    _________    ____________    ____________    ____________
    0 sec        {3×4 double}    {1×4 double}    {2×1 cell  }
    0.05 sec     {3×4 double}    {1×4 double}    {2×1 cell  }
    0.1 sec      {3×4 double}    {1×4 double}    {2×1 cell  }
    0.15 sec     {3×4 double}    {1×4 double}    {2×1 cell  }
    0.2 sec      {3×4 double}    {1×4 double}    {2×1 cell  }
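
As a minimal sketch of this layout, the following code builds a comparable timetable by hand. The timestamps and label coordinates are made-up values for illustration, not data from the example video. Each cell holds all labels of one label definition at one timestamp.

```matlab
% Hypothetical timestamps: five frames spaced 0.05 seconds apart.
t = seconds((0:4)' * 0.05);

% Hypothetical label values, repeated at every timestamp for brevity.
carBoxes   = [304 212 37 33; 120 200 50 40; 410 230 45 38];  % three rectangles, [x y w h]
truckBoxes = [200 150 80 60];                                % one rectangle
lanePoints = {[70 458; 311 261]; [600 458; 400 261]};        % two polylines

% One cell per timestamp for each label definition.
car   = repmat({carBoxes},5,1);
truck = repmat({truckBoxes},5,1);
lane  = repmat({lanePoints},5,1);

% The timetable takes its variable names from the input variable names.
videoTT = timetable(t,car,truck,lane);

% Row times are not counted as a variable, so the size is 5-by-3.
ttSize = size(videoTT);

% All car labels at the first timestamp, and the second lane polyline.
carsAtFirstFrame = videoTT.car{1};
secondLane = videoTT.lane{1}{2};
```

Indexing a label column with braces, as in videoTT.car{1}, returns the full set of labels for that timestamp rather than a single label.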

The storage format for ROI label data depends on the label type.

Label Type    Storage Format for Labels at Each Timestamp
labelType.Rectangle

M-by-4 numeric matrix of the form [x, y, w, h], where:

  • M is the number of labels in the frame.

  • x and y specify the upper-left corner of the rectangle.

  • w specifies the width of the rectangle, which is its length along the x-axis.

  • h specifies the height of the rectangle, which is its length along the y-axis.

labelType.RotatedRectangle

M-by-5 numeric matrix, in spatial coordinates, with rows of the form [xctr, yctr, w, h, yaw], where:

  • M is the number of rotated rectangles.

  • xctr and yctr specify the center of the rectangle.

  • w specifies the width of the rectangle, which is its length along the x-axis before rotation.

  • h specifies the height of the rectangle, which is its length along the y-axis before rotation.

  • yaw specifies the rotation angle in degrees. The rotation is clockwise-positive around the center of the rectangle.

labelType.Cuboid

M-by-9 numeric matrix with rows of the form [xctr, yctr, zctr, xlen, ylen, zlen, xrot, yrot, zrot], where:

  • M is the number of labels in the frame.

  • xctr, yctr, and zctr specify the center of the cuboid.

  • xlen, ylen, and zlen specify the length of the cuboid along the x-axis, y-axis, and z-axis, respectively, before rotation has been applied.

  • xrot, yrot, and zrot specify the rotation angles for the cuboid along the x-axis, y-axis, and z-axis, respectively. These angles are clockwise-positive when looking in the forward direction of their corresponding axes.

The figure shows how these values determine the position of a cuboid.

Cuboid with center point, lengths, and rotation angles labeled

labelType.ProjectedCuboid

M-by-8 numeric matrix with rows of the form [x1, y1, w1, h1, x2, y2, w2, h2], where:

  • M is the number of labels in the frame.

  • x1 and y1 specify the x,y coordinates of the upper-left corner of the front face of the projected cuboid.

  • w1 specifies the width for the front-face of the projected cuboid.

  • h1 specifies the height for the front-face of the projected cuboid.

  • x2 and y2 specify the x,y coordinates of the upper-left corner of the back face of the projected cuboid.

  • w2 specifies the width for the back-face of the projected cuboid.

  • h2 specifies the height for the back-face of the projected cuboid.

The figure shows how these values determine the position of a projected cuboid.

Labeled projected cuboid

labelType.Line

M-by-1 vector of cell arrays, where M is the number of labels in the frame. Each cell array contains an N-by-2 numeric matrix of the form [x1 y1; x2 y2; ... ; xN yN] for N points in the polyline.

labelType.PixelLabel

Label data for all pixel label definitions is stored in a single M-by-1 PixelLabelData column for M images or frames. Each element contains a filename for a pixel label image. A pixel label image describes the label or labels contained in the corresponding image. The labels can be described as a 1-channel or 3-channel label matrix. To use PixelLabelData with any of the labeler apps, you must use a single-channel label matrix with values of type uint8. You can convert a 3-channel pixel label data matrix to a single-channel label matrix programmatically to use with the labeler apps.

labelType.Polygon

M-by-1 vector of cell arrays, where M is the number of labels. Each cell array contains an N-by-2 numeric matrix of the form [x1 y1; x2 y2; ... ; xN yN] for N points in the polygon.

labelType.Custom

Labels are stored exactly as they are specified in the timetable. If you import a groundTruthMultisignal object containing custom label data into the Ground Truth Labeler app, this data is not imported into the app. Use custom data when gathering label data for training and combining it with data labeled in the app.
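
To make the storage formats in this table concrete, the sketch below writes out sample values for the geometric label types at a single timestamp, with M = 2 labels each. All coordinates and variable names are hypothetical and chosen only to show the expected array shapes.

```matlab
% Rectangle: M-by-4, rows of the form [x y w h].
rectangles = [304 212 37 33; 120 200 50 40];

% RotatedRectangle: M-by-5, rows of the form [xctr yctr w h yaw], yaw in degrees.
rotatedRects = [320 228 37 33 15; 145 220 50 40 -30];

% Cuboid: M-by-9, rows of the form
% [xctr yctr zctr xlen ylen zlen xrot yrot zrot].
cuboids = [27.35 18.32 -0.11 4.25 4.75 3.45 0 0 0; ...
           10.00 -2.50  0.20 4.00 1.80 1.50 0 0 90];

% ProjectedCuboid: M-by-8, rows of the form [x1 y1 w1 h1 x2 y2 w2 h2].
projCuboids = [309 215 33 24 330 211 33 24; ...
               100 150 60 45 115 140 60 45];

% Line: M-by-1 cell array, each cell an N-by-2 matrix of polyline points.
laneLines = {[70 458; 311 261]; [600 458; 400 261; 380 240]};

% Polygon: M-by-1 cell array, each cell an N-by-2 matrix of vertices.
polygons = {[10 10; 60 10; 60 50; 10 50]; [200 100; 240 160; 160 160]};

% Number of labels (M) stored for each type at this timestamp.
numLabels = [size(rectangles,1) size(rotatedRects,1) size(cuboids,1) ...
             size(projCuboids,1) numel(laneLines) numel(polygons)];

% Every Line and Polygon entry must have exactly two columns.
twoColumns = all(cellfun(@(p) size(p,2) == 2,[laneLines; polygons]));
```

The numeric types share a fixed column count and vary only in the number of rows, while Line and Polygon labels use cell arrays because each label can have a different number of points.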

If the ROI label data includes sublabels or attributes, then the labels at each timestamp must be specified as structures instead. The structure includes these fields.

Label Structure Field    Description
Position

Positions of the parent labels at the given timestamp

The format of Position depends on the label type. These formats are described in the previous table.

AttributeName1,...,AttributeNameN

Attributes of the parent labels

Each defined attribute has its own field, where the name of the field corresponds to the attribute name. The attribute value is a character vector for a List or String attribute, a numeric scalar for a Numeric attribute, or a logical scalar for a Logical attribute. If the attribute is unspecified, then the attribute value is an empty vector.

SublabelName1,...,SublabelNameN

Sublabels of the parent labels

Each defined sublabel has its own field, where the name of the field corresponds to the sublabel name. The value of each sublabel field is a structure containing the data for all marked sublabels with that name at the given timestamp.

This table describes the format of this sublabel structure.

Sublabel Structure Field    Description
Position

Positions of the sublabels at the given timestamp

The format of Position depends on the label type. These formats are described in the previous table.

AttributeName1,...,AttributeNameN

Attributes of the sublabels

Each defined attribute has its own field, where the name of the field corresponds to the attribute name. The attribute value is a character vector for a List or String attribute, a numeric scalar for a Numeric attribute, or a logical scalar for a Logical attribute. If you leave an attribute unspecified, then the attribute value is an empty vector.
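
The following sketch shows one way the label and sublabel structures described above might look for a single parent label at one timestamp. The names car, carType, wheel, and isFlat are hypothetical, as are all coordinates. The sketch covers only the simplest case of one parent label with one marked sublabel.

```matlab
% Hypothetical sublabel "wheel" with a Logical attribute "isFlat".
wheel = struct('Position',[310 236 8 8], ...  % rectangle, [x y w h]
               'isFlat',false);

% Hypothetical parent label "car" with a List attribute "carType".
% The sublabel structure is stored in a field named after the sublabel.
carLabel = struct('Position',[304 212 37 33], ...
                  'carType','sedan', ...
                  'wheel',wheel);

% Store the structure in the cell for the first of three timestamps.
t = seconds((0:2)');
car = cell(3,1);
car{1} = carLabel;
labelTT = timetable(t,car);

% Read back the sublabel position at the first timestamp.
wheelPosition = labelTT.car{1}.wheel.Position;
```

Timestamps with no marked labels keep an empty cell, as for the second and third rows here.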

Properties

ROI label data, specified as timetables. The ROILabelData object contains one property per signal, where each property contains a timetable of ROI label data corresponding to that signal.

When exporting an ROILabelData object from a Ground Truth Labeler app session, the property names correspond to the signal names stored in the DataSource property of the exported groundTruthMultisignal object.

When creating an ROILabelData object programmatically, the signalNames and labelData input arguments define the property names and values of the created object.

Suppose you want to create a groundTruthMultisignal object containing a video signal and a lidar point cloud sequence signal. Specify the signals in a string array, signalNames.

signalNames = ["video_01_city_c2s_fcw_10s" "lidarSequence"];

Store the video ROI labels, videoData, and lidar point cloud sequence ROI labels, lidarData, in a cell array of timetables, labelData. Each timetable contains the data for the corresponding signal in signalNames.

labelData = {videoData,lidarData}
  1×2 cell array

    {204×2 timetable}    {34×1 timetable}

The ROILabelData object, roiData, stores this data in the property with the corresponding signal name. You can specify roiData in the ROILabelData property of a groundTruthMultisignal object.

roiData = vision.labeler.labeldata.ROILabelData(signalNames,labelData)
roiData = 

  ROILabelData with properties:

    video_01_city_c2s_fcw_10s: [204×2 timetable]
                lidarSequence: [34×1 timetable]

Examples

Create ground truth data for a video signal and a lidar point cloud sequence signal that capture the same driving scene. Specify the signal sources, label definitions, and ROI and scene label data.

Create the video data source from an MP4 file.

sourceName = '01_city_c2s_fcw_10s.mp4';
sourceParams = [];
vidSource = vision.labeler.loading.VideoSource;
vidSource.loadSource(sourceName,sourceParams);

Create the point cloud sequence source from a folder of point cloud data (PCD) files.

pcSeqFolder = fullfile(toolboxdir('driving'),'drivingdata','lidarSequence');
addpath(pcSeqFolder)
load timestamps.mat
rmpath(pcSeqFolder)

lidarSourceData = load(fullfile(pcSeqFolder,'timestamps.mat'));

sourceName = pcSeqFolder;
sourceParams = struct;
sourceParams.Timestamps = timestamps;

pcseqSource = vision.labeler.loading.PointCloudSequenceSource;
pcseqSource.loadSource(sourceName,sourceParams);

Combine the signal sources into an array.

dataSource = [vidSource pcseqSource]
dataSource = 

  1x2 heterogeneous MultiSignalSource (VideoSource, PointCloudSequenceSource) array with properties:

    SourceName
    SourceParams
    SignalName
    SignalType
    Timestamp
    NumSignals

Create a table of label definitions for the ground truth data by using a labelDefinitionCreatorMultisignal object.

  • The Car label definition appears twice. Even though Car is defined as a rectangle, you can draw rectangles only for image signals, such as videos. The labelDefinitionCreatorMultisignal object creates an additional row for lidar point cloud signals. In these signal types, you can draw Car labels as cuboids only.

  • The label definitions have no descriptions and no assigned colors, so the Description and LabelColor columns are empty.

  • The label definitions have no assigned groups, so for all label definitions, the corresponding cell in the Group column is set to 'None'.

  • Road is a pixel label definition, so the table includes a PixelLabelID column.

  • No label definitions have sublabels or attributes, so the table does not include a Hierarchy column for storing such information.

ldc = labelDefinitionCreatorMultisignal;
addLabel(ldc,'Car','Rectangle');
addLabel(ldc,'Truck','ProjectedCuboid');
addLabel(ldc,'Lane','Line');
addLabel(ldc,'Road','PixelLabel');
addLabel(ldc,'Sunny','Scene');
labelDefs = create(ldc)
labelDefs =

  7x7 table

      Name       SignalType       LabelType        Group      Description    LabelColor    PixelLabelID
    _________    __________    _______________    ________    ___________    __________    ____________

    {'Car'  }    Image         Rectangle          {'None'}       {' '}       {0x0 char}    {0x0 double}
    {'Car'  }    PointCloud    Cuboid             {'None'}       {' '}       {0x0 char}    {0x0 double}
    {'Truck'}    Image         ProjectedCuboid    {'None'}       {' '}       {0x0 char}    {0x0 double}
    {'Lane' }    Image         Line               {'None'}       {' '}       {0x0 char}    {0x0 double}
    {'Lane' }    PointCloud    Line               {'None'}       {' '}       {0x0 char}    {0x0 double}
    {'Road' }    Image         PixelLabel         {'None'}       {' '}       {0x0 char}    {[       1]}
    {'Sunny'}    Time          Scene              {'None'}       {' '}       {0x0 char}    {0x0 double}

Create ROI label data for the first frame of the video.

numVideoFrames = numel(vidSource.Timestamp{1});
carData = cell(numVideoFrames,1);
laneData = cell(numVideoFrames,1);
truckData = cell(numVideoFrames,1);
carData{1} = [304 212 37 33];
laneData{1} = [70 458; 311 261];
truckData{1} = [309,215,33,24,330,211,33,24];
videoData = timetable(vidSource.Timestamp{1},carData,laneData, ...
                      'VariableNames',{'Car','Lane'});

Create ROI label data for the first point cloud in the sequence.

numPCFrames = numel(pcseqSource.Timestamp{1});
carData = cell(numPCFrames, 1);
carData{1} = [27.35 18.32 -0.11 4.25 4.75 3.45 0 0 0];
lidarData = timetable(pcseqSource.Timestamp{1},carData,'VariableNames',{'Car'});

Combine the ROI label data for both sources.

signalNames = [dataSource.SignalName];
roiData = vision.labeler.labeldata.ROILabelData(signalNames,{videoData,lidarData})
roiData = 

  ROILabelData with properties:

    video_01_city_c2s_fcw_10s: [204x2 timetable]
                lidarSequence: [34x1 timetable]

Create scene label data for the first 10 seconds of the driving scene.

sunnyData = seconds([0 10]);
labelNames = ["Sunny"];
sceneData = vision.labeler.labeldata.SceneLabelData(labelNames,{sunnyData})
sceneData = 

  SceneLabelData with properties:

    Sunny: [0 sec    10 sec]

Create a ground truth object from the signal sources, label definitions, and ROI and scene label data. You can import this object into the Ground Truth Labeler app for manual labeling or to run a labeling automation algorithm on it. You can also extract training data from this object for deep learning models by using the gatherLabelData function.

gTruth = groundTruthMultisignal(dataSource,labelDefs,roiData,sceneData)
gTruth = 

  groundTruthMultisignal with properties:

          DataSource: [1x2 vision.labeler.loading.MultiSignalSource]
    LabelDefinitions: [7x7 table]
        ROILabelData: [1x1 vision.labeler.labeldata.ROILabelData]
      SceneLabelData: [1x1 vision.labeler.labeldata.SceneLabelData]

Version History

Introduced in R2020a