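Lane Detection Optimized with GPU Coder
This example shows how to develop a deep learning lane detection application that runs on NVIDIA® GPUs.
The pretrained lane detection network detects lane marker boundaries in an image and outputs them. It is based on the AlexNet network, with the last few layers replaced by a smaller fully connected layer and a regression output layer. The example generates a CUDA® executable that runs on a CUDA-enabled GPU on the host machine.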
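Third-Party Prerequisites
CUDA-enabled NVIDIA GPU.
NVIDIA CUDA toolkit and driver.
NVIDIA cuDNN library.
Environment variables for the compilers and libraries. For information on the supported versions of the compilers and libraries, see Third-Party Hardware (GPU Coder). For information on setting up the environment variables, see Setting Up the Prerequisite Products (GPU Coder).
Verify GPU Environment
Use the coder.checkGpuInstall (GPU Coder) function to verify that the compilers and libraries needed to run this example are set up correctly.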
envCfg = coder.gpuEnvConfig('host');
envCfg.DeepLibTarget = 'cudnn';
envCfg.DeepCodegen = 1;
envCfg.Quiet = 1;
coder.checkGpuInstall(envCfg);
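Get Pretrained Lane Detection Network
This example uses the trainedLaneNet MAT file, which contains the pretrained lane detection network. The file is approximately 143 MB in size. Download the file from the MathWorks® website.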
laneNetFile = matlab.internal.examples.downloadSupportFile('gpucoder/cnn_models/lane_detection', ...
    'trainedLaneNet.mat');
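The network takes an image as input and outputs two lane boundaries that correspond to the left and right lanes of the ego vehicle. Each lane boundary is represented by the parabolic equation $y = ax^2 + bx + c$, where y is the lateral offset and x is the longitudinal distance from the vehicle. The network outputs the three parameters a, b, and c for each lane. The network architecture is similar to AlexNet, except that the last few layers are replaced by a smaller fully connected layer and a regression output layer.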
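As a small illustration of the parabolic lane model, the following sketch evaluates one boundary at a few longitudinal distances. The coefficient values are made up for illustration and do not come from the network.

```matlab
% Hypothetical coefficients [a b c] for one lane boundary (illustrative only).
laneCoeffs = [0.001 -0.02 1.8];

% Longitudinal distances ahead of the vehicle.
x = [0 10 20];

% polyval evaluates a*x.^2 + b*x + c, with the highest-order coefficient first.
y = polyval(laneCoeffs, x);

% Show the [x y] pairs, one per row.
disp([x' y']);
```

Because polyval expects the highest-order coefficient first, the [a b c] ordering maps directly onto the equation.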
load(laneNetFile);
disp(laneNet)
SeriesNetwork with properties:
Layers: [23×1 nnet.cnn.layer.Layer]
InputNames: {'data'}
OutputNames: {'output'}
Convert the SeriesNetwork object to a dlnetwork object and save it to a different MAT file.
dlLaneNet = dag2dlnetwork(laneNet);
dlLaneNetFile = 'trainedDlLaneNet.mat';
save(dlLaneNetFile,'dlLaneNet');
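Download Test Video
To test the model, the example uses a video file from the Caltech lanes dataset. The file is approximately 8 MB in size. Download the file from the MathWorks website.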
videoFile = matlab.internal.examples.downloadSupportFile('gpucoder/media','caltech_cordova1.avi');
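Main Entry-Point Function
The detectLanesInVideo.m file is the main entry-point function for code generation. The detectLanesInVideo function uses a VideoReader object to read frames from the input video, calls the predict method of the LaneNet network object, and draws the detected lanes on the input video. A vision.DeployableVideoPlayer (Computer Vision Toolbox) System object displays the video output with the detected lanes.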
type detectLanesInVideo.m
function detectLanesInVideo(videoFile,net,laneCoeffMeans,laneCoeffsStds)
%   Copyright 2022-2024 The MathWorks, Inc.
%#codegen
%% Create Video Reader and Video Player Object
videoFReader = VideoReader(videoFile);
depVideoPlayer = vision.DeployableVideoPlayer(Name='Lane Detection on GPU');
videoHeight = videoFReader.Height;
videoWidth = videoFReader.Width;
%% Video Frame Processing Loop
while hasFrame(videoFReader)
    videoFrame = imresize(readFrame(videoFReader),[videoHeight videoWidth]);
    scaledFrame = imresize(videoFrame,[227 227]);
    [laneFound,ltPts,rtPts] = laneNetPredict(net,scaledFrame, ...
        laneCoeffMeans,laneCoeffsStds);
    if(laneFound)
        pts = [reshape(ltPts',1,[]);reshape(rtPts',1,[])];
        videoFrame = insertShape(videoFrame, 'Line', pts, 'LineWidth', 4);
        depVideoPlayer(videoFrame);
    end
end
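The reshape step in the frame loop is easier to follow on a small case. The sketch below uses three made-up points per boundary (the example itself uses 28) and shows how each N-by-2 list of [x y] points becomes one row of the form [x1 y1 x2 y2 ...], which is the polyline layout that insertShape accepts for the 'Line' shape.

```matlab
% Three hypothetical [x y] image points per boundary (illustrative values).
ltPts = [10 200; 20 150; 30 100];
rtPts = [90 200; 80 150; 70 100];

% Transposing first makes x and y alternate in column-major order, so the
% flattened row reads [x1 y1 x2 y2 x3 y3]. Each row is one polyline.
pts = [reshape(ltPts',1,[]); reshape(rtPts',1,[])];

disp(pts);
```

Without the transpose, reshape would emit all x values followed by all y values, which would not describe the intended polyline.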
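LaneNet Predict Function
The laneNetPredict function computes the left and right lane positions in a single video frame. The laneNet network computes the parameters a, b, and c that describe the parabolic equations for the left and right lane boundaries. From these parameters, the function computes the x and y coordinates that correspond to the lane positions. These coordinates must then be mapped to image coordinates.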
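The coefficient recovery and lane check that laneNetPredict performs can be sketched in isolation. All numbers here are hypothetical, and the 6-element layout (left [a b c] followed by right [a b c]) is an assumption inferred from how the function indexes params.

```matlab
% Hypothetical normalized network output, assumed to be a 6-by-1 vector:
% [a b c] for the left lane followed by [a b c] for the right lane.
networkOutput = single([0.5; -1.0; 2.0; 0.0; 1.0; -2.0]);

% Hypothetical normalization statistics, one entry per coefficient.
means = single([0    0   1.5  0    0   -1.5]);
stds  = single([0.01 0.1 0.5  0.01 0.1  0.5]);

% Reverse the normalization to recover coefficients on their original scale.
params = networkOutput' .* stds + means;

% A boundary counts as a lane only if its offset term c exceeds 0.5 in magnitude.
isLeftLaneFound  = abs(params(3)) > 0.5;
isRightLaneFound = abs(params(6)) > 0.5;
laneFound = isLeftLaneFound && isRightLaneFound;
```

With these values, both offset terms have magnitude 2.5, so both boundaries pass the check.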
type laneNetPredict.m
function [laneFound,ltPts,rtPts] = laneNetPredict(net,frame,means,stds)
%#codegen
% laneNetPredict Predict lane markers on the input image frame using the
% lane detection network
%   Copyright 2017-2024 The MathWorks, Inc.
dlFrame = dlarray(single(frame),'SSC');
persistent dllaneNet;
if isempty(dllaneNet)
    dllaneNet = coder.loadDeepLearningNetwork(net, 'dllaneNet');
end
dllanecoeffsNetworkOutput = predict(dllaneNet,dlFrame);
lanecoeffsNetworkOutput = extractdata(dllanecoeffsNetworkOutput);
% Recover original coeffs by reversing the normalization steps.
params = lanecoeffsNetworkOutput' .* stds + means;
% 'c' should be more than 0.5 for it to be a lane.
isRightLaneFound = abs(params(6)) > 0.5;
isLeftLaneFound = abs(params(3)) > 0.5;
% From the networks output, compute left and right lane points in the image
% coordinates.
vehicleXPoints = 3:30;
ltPts = coder.nullcopy(zeros(28,2,'single'));
rtPts = coder.nullcopy(zeros(28,2,'single'));
if isRightLaneFound && isLeftLaneFound
    rtBoundary = params(4:6);
    rt_y = computeBoundaryModel(rtBoundary, vehicleXPoints);
    ltBoundary = params(1:3);
    lt_y = computeBoundaryModel(ltBoundary, vehicleXPoints);
    % Visualize lane boundaries of the ego vehicle.
    tform = get_tformToImage;
    % Map vehicle to image coordinates.
    ltPts = tform.transformPointsInverse([vehicleXPoints', lt_y']);
    rtPts = tform.transformPointsInverse([vehicleXPoints', rt_y']);
    laneFound = true;
else
    laneFound = false;
end
end
%% Helper Functions
% Compute boundary model.
function yWorld = computeBoundaryModel(model, xWorld)
yWorld = polyval(model, xWorld);
end
% Compute extrinsics.
function tform = get_tformToImage
% The camera coordinates are described by the Caltech mono
% camera model.
yaw = 0;
pitch = 14; % Pitch of the camera in degrees
roll = 0;
translation = translationVector(yaw, pitch, roll);
rotation = rotationMatrix(yaw, pitch, roll);
% Construct a camera matrix.
focalLength = [309.4362, 344.2161];
principalPoint = [318.9034, 257.5352];
Skew = 0;
camMatrix = [rotation; translation] * intrinsicMatrix(focalLength, ...
    Skew, principalPoint);
% Turn camMatrix into 2-D homography.
tform2D = [camMatrix(1,:); camMatrix(2,:); camMatrix(4,:)]; % drop Z
tform = projective2d(tform2D);
tform = tform.invert();
end
% Translate to image co-ordinates.
function translation = translationVector(yaw, pitch, roll)
SensorLocation = [0 0];
Height = 2.1798; % mounting height in meters from the ground
rotationMatrix = (...
    rotZ(yaw)*... % last rotation
    rotX(90-pitch)*...
    rotZ(roll)... % first rotation
    );
% Adjust for the SensorLocation by adding a translation.
sl = SensorLocation;
translationInWorldUnits = [sl(2), sl(1), Height];
translation = translationInWorldUnits*rotationMatrix;
end
% Rotation around X-axis.
function R = rotX(a)
a = deg2rad(a);
R = [...
    1   0        0;
    0   cos(a)  -sin(a);
    0   sin(a)   cos(a)];
end
% Rotation around Y-axis.
function R = rotY(a)
a = deg2rad(a);
R = [...
    cos(a)  0   sin(a);
    0       1   0;
    -sin(a) 0   cos(a)];
end
% Rotation around Z-axis.
function R = rotZ(a)
a = deg2rad(a);
R = [...
    cos(a) -sin(a) 0;
    sin(a)  cos(a) 0;
    0       0      1];
end
% Given the Yaw, Pitch, and Roll, determine the appropriate Euler angles
% and the sequence in which they are applied to align the camera's
% coordinate system with the vehicle coordinate system. The resulting
% matrix is a Rotation matrix that together with the Translation vector
% defines the extrinsic parameters of the camera.
function rotation = rotationMatrix(yaw, pitch, roll)
rotation = (...
    rotY(180)*...       % last rotation: point Z up
    rotZ(-90)*...       % X-Y swap
    rotZ(yaw)*...       % point the camera forward
    rotX(90-pitch)*...  % "un-pitch"
    rotZ(roll)...       % 1st rotation: "un-roll"
    );
end
% Intrinsic matrix computation.
function intrinsicMat = intrinsicMatrix(FocalLength, Skew, PrincipalPoint)
intrinsicMat = ...
    [FocalLength(1)   , 0                , 0; ...
     Skew             , FocalLength(2)   , 0; ...
     PrincipalPoint(1), PrincipalPoint(2), 1];
end
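The mapping from vehicle coordinates to image coordinates comes down to applying a 3-by-3 homography to points on the ground plane. The sketch below shows that operation by hand, assuming the row-vector (postmultiply) convention that the camera matrix construction above appears to follow. The matrix H and the points are hypothetical and are not the camera model used in this example.

```matlab
% Hypothetical 3-by-3 homography in the row-vector convention:
% [u v w] = [x y 1] * H, followed by division by w.
H = [1 0 0.5;
     0 1 0;
     0 0 1];

% Two hypothetical ground-plane points, one [x y] pair per row.
ptsIn = [2 4; 0 3];

% Lift to homogeneous coordinates and apply the homography.
homog = [ptsIn, ones(size(ptsIn,1),1)] * H;

% Divide by the third column to return to 2-D coordinates.
ptsOut = homog(:,1:2) ./ homog(:,3);

disp(ptsOut);
```

The division by the third homogeneous component is what makes the transform projective rather than affine: the first point is scaled by one half, while the second is left unchanged.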
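Generate CUDA Executable
To generate a standalone CUDA executable for the detectLanesInVideo entry-point function, create a GPU code configuration object for the 'exe' target and set the target language to C++. Use the coder.DeepLearningConfig (GPU Coder) function to create a CuDNN deep learning configuration object and assign it to the DeepLearningConfig property of the GPU code configuration object.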
cfg = coder.gpuConfig('exe');
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
cfg.GenerateReport = true;
cfg.GenerateExampleMain = "GenerateCodeAndCompile";
cfg.TargetLang = 'C++';
inputs = {coder.Constant(videoFile),coder.Constant(dlLaneNetFile), ...
    coder.Constant(laneCoeffMeans),coder.Constant(laneCoeffsStds)};
Run the codegen command.
codegen -args inputs -config cfg detectLanesInVideo
Code generation successful: View report
Run the Executable
To run the executable, uncomment the following lines of code.
if ispc
    [status,cmdout] = system("detectLanesInVideo.exe");
else
    [status,cmdout] = system("./detectLanesInVideo");
end
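When running the executable from a script, it can help to confirm that the file exists and to inspect the exit status that system returns. The following is a minimal sketch, assuming the executable was generated in the current folder. The warning text and the NaN sentinel are illustrative choices and are not part of the example.

```matlab
% Choose the platform-specific file name and command.
if ispc
    exeName = 'detectLanesInVideo.exe';
    cmd = exeName;
else
    exeName = 'detectLanesInVideo';
    cmd = ['./' exeName];
end

status = NaN;  % sentinel: stays NaN if nothing is run

% Only attempt the run if the executable exists in the current folder.
if exist(fullfile(pwd, exeName), 'file') == 2
    [status, cmdout] = system(cmd);
    if status ~= 0
        % A nonzero status signals a failure; cmdout holds the captured output.
        warning('detectLanesInVideo exited with status %d:\n%s', status, cmdout);
    end
else
    disp('Executable not found in the current folder; run codegen first.');
end
```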
