Aerial Lidar Semantic Segmentation Using RandLANet Deep Learning
This example shows how to train a RandLANet deep learning network to perform semantic segmentation on aerial lidar data.
You can use lidar data acquired from airborne laser scanning systems in applications such as topographic mapping, city modeling, biomass measurement, and disaster management. Extracting meaningful information from this data requires semantic segmentation, a process in which you assign each point in the point cloud a unique class label.
In this example, you train a RandLANet [1] network to perform semantic segmentation by using the Dayton Annotated Lidar Earth Scan (DALES) data set [2]. The data set contains scenes of dense, labeled aerial lidar data from urban, suburban, rural, and commercial settings. The data set provides semantic segmentation labels for eight classes: buildings, cars, trucks, poles, power lines, fences, ground, and vegetation.
Load DALES Data
The DALES data set contains 40 scenes of aerial lidar data. Of the 40 scenes, 29 scenes are used for training and the remaining 11 scenes are used for testing. Follow the instructions on the DALES website to download the data set to the folder specified by the dataFolder variable. Create folders to store the training and test data.
dataFolder = fullfile(tempdir,"DALES");
trainDataFolder = fullfile(dataFolder,"dales_las","train");
testDataFolder = fullfile(dataFolder,"dales_las","test");
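If the folders do not already exist, you can create them before extracting the downloaded files into them. This is a minimal sketch that uses the folder paths defined above.

% Create the training and test folders if they are not already present.
if ~exist(trainDataFolder,"dir")
    mkdir(trainDataFolder)
end
if ~exist(testDataFolder,"dir")
    mkdir(testDataFolder)
end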
Preview a point cloud from the training data.
lasReader = lasFileReader(fullfile(trainDataFolder,"5080_54435.las"));
[pc,attr] = readPointCloud(lasReader,"Attributes","Classification");
labels = attr.Classification;

% Select only labeled data.
pc = select(pc,labels~=0);
labels = labels(labels~=0);

classNames = ["ground"; ...
    "vegetation"; ...
    "cars"; ...
    "trucks"; ...
    "powerlines"; ...
    "fences"; ...
    "poles"; ...
    "buildings"];

figure
ax = pcshow(pc.Location,labels);
helperLabelColorbar(ax,classNames)
title("Point Cloud with Overlaid Semantic Labels")
Preprocess Training Data
To preprocess the training data, first create a fileDatastore object to collect all the point cloud files in the training data.
fds = fileDatastore(trainDataFolder,"ReadFcn",@lasFileReader);
Apply a transformation that performs these tasks by using the helperTransformTrainData helper function, defined at the end of this example:
Each point cloud in the DALES data set covers an area of 500-by-500 meters, which is much larger than the typical area covered by terrestrial rotating lidar point clouds. For efficient memory processing, divide the point cloud into small, non-overlapping blocks of size 50-by-50 meters.
Normalize the point cloud values to a range of [0, 1].
Downsample the point clouds and labels using grid subsampling to equalize the density of the point clouds. The sketch after this list illustrates the normalization and downsampling steps.
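For reference, this minimal sketch illustrates the normalization and downsampling steps on a single block. The variable names are illustrative; the example itself performs these steps with the helperPointCloudNormalization and helperGridSubsampling supporting files, and the grid subsampling helper also downsamples the labels together with the points.

% Min-max normalization of the point locations to the range [0, 1].
% ptCloudBlock is a hypothetical 50-by-50 meter block of the point cloud.
xyz = ptCloudBlock.Location;
xyzNormalized = (xyz - min(xyz,[],1))./max(max(xyz,[],1) - min(xyz,[],1),eps);
ptCloudNormalized = pointCloud(xyzNormalized);

% Grid downsampling of the points alone is conceptually similar to
% pcdownsample with the gridAverage method.
ptCloudDownsampled = pcdownsample(ptCloudNormalized,"gridAverage",0.04);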
Define the block dimensions.
blockSize = [50 50];
Apply the transformation to the training datastore.
ldsTransformed = transform(fds,@(x) helperTransformTrainData(x,blockSize));
Use the helperDatastoreWriteFcn helper function, attached to this example as a supporting file, to save the preprocessed point clouds and semantic labels as PCD and PNG files, respectively. This step substantially speeds up the training process. You can set writeFiles to false if your preprocessed training data is already present in the outputFolder.
writeFiles = true;
outputFolder = fullfile(dataFolder,"PreprocessedTrainData");
if writeFiles
    writeall(ldsTransformed,outputFolder,WriteFcn = @helperDatastoreWriteFcn,FolderLayout = "flatten");
end
Starting parallel pool (parpool) using the 'local' profile ...
Processing done for file: 5190_54400.las
Note: Processing can take some time. The code suspends MATLAB® execution until processing is complete. If you want to rerun the code, you must first use the "clear helperDatastoreWriteFcn" command.
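For reference, the write step amounts to calls of this kind, where the variable and file names are hypothetical and helperDatastoreWriteFcn handles the actual naming and folder layout.

% Hypothetical write step: save one preprocessed block and its labels.
pcwrite(ptCloudBlock,fullfile(outputFolder,"PointCloud","block_0001.pcd"));
imwrite(uint8(labelsBlock),fullfile(outputFolder,"Labels","block_0001.png"));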
Calculate the point distribution across all the classes in the training data set by using the helperComputeClassWeights helper function, attached to this example as a supporting file. The calculated weights are applied to the weighted cross-entropy loss during training.
numClasses = numel(classNames);
weights = helperComputeClassWeights(fds,numClasses);
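As a sketch only, one common weighting scheme assigns higher weights to classes with fewer points. The example computes its weights across the whole training set with helperComputeClassWeights; the values here, computed for the labels of the single previewed tile, are illustrative.

% Sketch of median-frequency class weighting on the labels of one tile.
classCounts = histcounts(labels,1:numClasses+1)';
classFreq = classCounts./sum(classCounts);
sketchWeights = median(classFreq)./classFreq;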
Create Datastore Object for Training
Create a fileDatastore object to store and read the PCD files by using the pcread function.
pcCropTrainPath = fullfile(outputFolder,"PointCloud");
ldsTrain = fileDatastore(pcCropTrainPath,"ReadFcn",@(x) pcread(x));
Use a pixelLabelDatastore object to store pixel-wise labels from the pixel label images. The object maps each pixel label to a class name and assigns a unique label ID to each class.
% Specify label IDs from 1 to the number of classes.
labelIDs = 1 : numClasses;
labelCropTrainPath = fullfile(outputFolder,"Labels");
pxdsTrain = pixelLabelDatastore(labelCropTrainPath,classNames,labelIDs);
Use the helperPreprocessTrainData helper function, attached to this example as a supporting file, to preprocess the training data so that it is compatible with the network input layer.
preprocessedTrainingData = transform(ldsTrain,pxdsTrain,@(pt,lb) helperPreprocessTrainData(pt,lb,classNames));
Define RandLANet Network
RandLANet is a popular neural network used for semantic segmentation of unorganized lidar point clouds. This network uses an encoder-decoder architecture with skip connections. You feed the input to a shared multilayer perceptron layer, followed by four encoding and decoding layers, to learn features for each point. At the end, three fully connected layers and a dropout layer associate each point in a 3-D point cloud with a class label, such as car, truck, ground, or vegetation.
This figure shows the RandLANet network architecture, where:
FC — Fully Connected Layer
LFA — Local Feature Aggregation
RS — Random Sampling
MLP — Shared Multilayer Perceptron
US — Upsampling
DP — Dropout
Each encoding layer randomly downsamples the point cloud, which significantly decreases the point density. To retain the key features of the downsampled point cloud, the network uses a local feature aggregation module on each point.
Each decoding layer upsamples the point feature set through nearest-neighbor interpolation. The network concatenates these upsampled maps with feature maps produced by the encoding layers through skip connections, and applies a shared MLP to the concatenated feature maps.
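As an illustration only, and not the network implementation, the random sampling step of one encoding layer amounts to keeping a random subset of the points, for example a quarter of them.

% Conceptual stand-in for the random sampling step of one encoding layer,
% applied to a hypothetical input point cloud ptCloudIn.
numKeep = ceil(ptCloudIn.Count/4);
keepIdx = randperm(ptCloudIn.Count,numKeep);
ptCloudDown = select(ptCloudIn,keepIdx);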
Define the RandLANet network by using the helperLoadRandLANet helper function, attached to this example as a supporting file.
% RandLANet dlnetwork
net = helperLoadRandLANet(numClasses);
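Optionally, inspect the layers and learnable parameters of the loaded network.

% Open the network in the Deep Learning Network Analyzer.
analyzeNetwork(net)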
Specify Training Options
Use the Adam optimization algorithm to train the network. Use the trainingOptions (Deep Learning Toolbox) function to specify the hyperparameters.
learningRate = 0.01;
numEpochs = 100;
miniBatchSize = 8;
learnRateDropFactor = 0.9886;
executionEnvironment = "auto";

dataFormat = cell(1,26);
dataFormat(:) = {'SSCB'};

options = trainingOptions("adam", ...
    InitialLearnRate = learningRate, ...
    MaxEpochs = numEpochs, ...
    MiniBatchSize = miniBatchSize, ...
    LearnRateSchedule = "piecewise", ...
    LearnRateDropPeriod = 1, ...
    LearnRateDropFactor = learnRateDropFactor, ...
    Plots = "training-progress", ...
    ExecutionEnvironment = executionEnvironment, ...
    ResetInputNormalization = false, ...
    CheckpointFrequencyUnit = "epoch", ...
    CheckpointFrequency = 10, ...
    CheckpointPath = tempdir, ...
    InputDataFormats = dataFormat(1:25), ...
    TargetDataFormats = dataFormat(26));
Note: Reduce or increase the miniBatchSize value to control memory usage when training.
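Because LearnRateDropPeriod is 1, the piecewise schedule multiplies the learning rate by learnRateDropFactor after every epoch, so the learning rate used in the final epoch is approximately:

% Learning rate in the last of the 100 epochs, roughly 3.2e-3.
finalLearnRate = learningRate*learnRateDropFactor^(numEpochs-1)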
Train Model
To train the network, set the doTraining variable to true. Otherwise, load a pretrained network. To train the network, you can use a CPU or GPU. Using a GPU requires Parallel Computing Toolbox™ and a CUDA® enabled NVIDIA® GPU. For more information, see GPU Computing Requirements (Parallel Computing Toolbox). Use weighted cross-entropy loss to update the network during training by using the helperLossFunction helper function, defined at the end of this example.
doTraining = false;
if doTraining
    % Train the network on the preprocessedTrainingData datastore by using
    % the trainnet function.
    trainedNet = trainnet(preprocessedTrainingData,net,@(ypred,ytrue) helperLossFunction(ypred,ytrue,weights),options);
else
    % Load the pretrained network.
    load("randlanetSegTrained","net")
end
Segment Aerial Point Cloud
The network has been trained on downsampled point clouds. To perform segmentation on a test point cloud, first downsample the test point cloud, similar to the training data. Perform inference on the downsampled data to compute prediction labels. Interpolate these prediction labels to obtain prediction labels on the dense point cloud.
Read the full test point cloud.
lasReader = lasFileReader(fullfile(testDataFolder,"5080_54470.las"));
[ptCloud,attr] = readPointCloud(lasReader,"Attributes","Classification");
labels = attr.Classification;
Apply a transformation that performs these tasks to the test or validation data by using the helperTransformTestData helper function, attached to this example as a supporting file:
Crop the point clouds into non-overlapping blocks of size 50-by-50 meters.
Normalize the point cloud values to a range of [0, 1].
Downsample the point clouds using grid subsampling to equalize the density of point clouds.
For each point in the dense point cloud, find the nearest neighbor in the downsampled point cloud, to facilitate interpolation of the labels in postprocessing, as shown in the sketch after this list.
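Conceptually, these projection indices map every dense point to its nearest downsampled point, so that the sparse prediction labels can later be copied back onto the dense point cloud. This minimal sketch, with illustrative variable names, shows the idea.

% Nearest downsampled point for every dense point, then label lookup.
projIdx = knnsearch(ptCloudSparse.Location,ptCloudDense.Location);
labelsDenseBlock = labelsSparsePredicted(projIdx);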
[transformedTestData,blockIndices] = helperTransformTestData(ptCloud,blockSize);
Perform inference on the preprocessed test data.
% Initialize prediction labels.
labelsDensePred = zeros(size(ptCloud.Location,1),1);

for i = 1:size(transformedTestData,2)
    ptCloudSparse = transformedTestData{1,i}{1};
    kdtree = transformedTestData{1,i}{2};
    projectedIndices = transformedTestData{1,i}{3};

    % Randomly generate a value for each point, which represents the
    % probability of a point being selected to perform inference on it.
    possibility = rand(ptCloudSparse.Count,1)*1e-3;
    predLabels = zeros(ptCloudSparse.Count,1,"int16");

    % This flag is set to 'false' when all points in ptCloudSparse
    % are covered for inference.
    flag = true;
    while flag
        % Use the helperPreprocessTestData helper function, attached to this
        % example as a supporting file, for preprocessing.
        [preprocessedInput,possibility] = helperPreprocessTestData(ptCloudSparse, ...
            kdtree,possibility,executionEnvironment);

        % Use the helperPredictRandLANet helper function, attached to this example as
        % a supporting file, for dlnetwork output.
        labelsSparsePred = helperPredictRandLANet(net,preprocessedInput);

        % Use the helperPostprocessRandLANet helper function, attached to this example
        % as a supporting file, for postprocessing.
        [interpolatedLabels,flag,predLabels] = helperPostprocessRandLANet(labelsSparsePred, ...
            predLabels,possibility,preprocessedInput{1,6},projectedIndices,flag);
    end
    labelsDensePred(transformedTestData{1,i}{4}) = interpolatedLabels;
end
For better visualization, display only one block inferred from the point cloud data.
figure
pc = select(ptCloud,transformedTestData{1,i}{4});
ax = pcshow(pc.Location,interpolatedLabels);
axis off
helperLabelColorbar(ax,classNames)
title("Point Cloud Overlaid with Detected Semantic Labels")
Evaluate Network
To perform evaluation on the test data, get the labels from the test point cloud. Use the predicted labels you computed, which are stored in labelsDensePred.
Compute the ground truth labels, and use blockIndices to reorder the ground truth labels to match the order of the predicted labels.
labelsDenseTarget = labels(blockIndices);
Use the evaluateSemanticSegmentation function to compute the semantic segmentation metrics from the test set results.
confusionMatrix = segmentationConfusionMatrix(double(labelsDensePred), ...
double(labelsDenseTarget),Classes = 1:numClasses);
metrics = evaluateSemanticSegmentation({confusionMatrix},classNames,Verbose = false);
You can measure the amount of overlap per class using the intersection-over-union (IoU) metric.
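Per-class IoU follows directly from the confusion matrix: the true positives of a class divided by the union of all points predicted as, or actually belonging to, that class. This sketch shows how you can compute the same metric directly from the confusion matrix.

% Per-class IoU computed from the confusion matrix.
tp = diag(confusionMatrix);
iouPerClass = tp./(sum(confusionMatrix,1)' + sum(confusionMatrix,2) - tp);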
The evaluateSemanticSegmentation function returns metrics for the entire data set, for individual classes, and for each test image. To see the metrics at the data set level, inspect the metrics.DataSetMetrics property.
metrics.DataSetMetrics
ans=1×4 table
GlobalAccuracy MeanAccuracy MeanIoU WeightedIoU
______________ ____________ _______ ___________
0.94533 0.78257 0.60613 0.90656
The data set metrics provide a high-level overview of network performance. To see the impact each class has on the overall performance, view the metrics for each class by inspecting the metrics.ClassMetrics property.
metrics.ClassMetrics
ans=8×2 table
Accuracy IoU
________ ________
ground 0.97851 0.95348
vegetation 0.87118 0.85004
cars 0.86573 0.50844
trucks 0.10772 0.053908
powerlines 0.919 0.75115
fences 0.78975 0.35045
poles 0.75789 0.46694
buildings 0.9708 0.91463
Supporting Functions
The helperLossFunction function computes the weighted cross-entropy loss to handle the class imbalance in the DALES data set. The weighting penalizes the network more if it misclassifies a point that belongs to a class with fewer points.
function loss = helperLossFunction(ypred,yactual,weights)
    % Apply softmax on the prediction.
    ypred = softmax(ypred);

    % Compute the weighted cross-entropy loss.
    loss = crossentropy(ypred,yactual,weights,WeightsFormat="UC",Reduction="none");
    loss = mean(mean(sum(loss,3),1),4);
end
The helperTransformTrainData function applies these transformations to the training data:
Each point cloud in the DALES data set covers an area of 500-by-500 meters, which is much larger than the typical area covered by terrestrial rotating lidar point clouds. For efficient memory processing, divide the point cloud into small, non-overlapping blocks of size 50-by-50 meters.
Normalize the point cloud values to a range of [0, 1].
Downsample the point clouds and labels using grid subsampling to equalize the density of the point clouds.
function out = helperTransformTrainData(lasReader,blocksize)
    [ptCloud,attr] = readPointCloud(lasReader,"Attributes","Classification");
    labels = attr.Classification;

    % Select only labeled data.
    ptCloud = select(ptCloud,labels~=0);
    labels = labels(labels~=0);

    % Block the input point cloud.
    numGridsX = round((diff(ptCloud.XLimits)+eps)/blocksize(1));
    numGridsY = round((diff(ptCloud.YLimits)+eps)/blocksize(2));
    [~,~,~,indx,indy] = histcounts2(ptCloud.Location(:,1),ptCloud.Location(:,2), ...
        [numGridsX, numGridsY],XBinLimits = ptCloud.XLimits,YBinLimits = ptCloud.YLimits);
    ind = sub2ind([numGridsX, numGridsY],indx,indy);

    out = cell(1,numGridsX*numGridsY);
    for num = 1:numGridsX*numGridsY
        idx = ind == num;

        % Blocked point cloud and corresponding labels.
        ptCloudDense = select(ptCloud,idx);
        labelsDense = labels(idx);

        % Use the helperPointCloudNormalization helper function, attached to this
        % example as a supporting file, to normalize the point cloud values to a
        % range of [0,1].
        ptCloudNormalized = helperPointCloudNormalization(ptCloudDense);

        % Use the helperGridSubsampling helper function, attached to this example
        % as a supporting file, to downsample the point cloud and labels.
        gridSize = 0.04;
        [pointsSparse,labelsSparse] = helperGridSubsampling(ptCloudNormalized,labelsDense,gridSize);

        xyzAndLabelsSparse = [pointsSparse,labelsSparse];
        out{1,num} = xyzAndLabelsSparse;
    end
end
References
[1] Hu, Qingyong, Bo Yang, Linhai Xie, Stefano Rosa, Yulan Guo, Zhihua Wang, Niki Trigoni, and Andrew Markham. “RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds.” In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 11105–14. Seattle, WA, USA: IEEE, 2020. https://doi.org/10.1109/CVPR42600.2020.01112.
[2] Varney, Nina, Vijayan K. Asari, and Quinn Graehling. “DALES: A Large-Scale Aerial LiDAR Data Set for Semantic Segmentation.” In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 717–26. Seattle, WA, USA: IEEE, 2020. https://doi.org/10.1109/CVPRW50498.2020.00101.
See Also
Apps
- Deep Network Designer (Deep Learning Toolbox) | Lidar Viewer | Lidar Labeler