This example shows how to develop a visual localization system using synthetic image data from the Unreal Engine® simulation environment.
Obtaining ground truth for evaluating a localization algorithm under different conditions is challenging. Virtual simulation is a cost-effective way to obtain ground truth compared with more expensive approaches such as high-precision inertial navigation systems or differential GPS. Simulation enables testing under a variety of scenarios and sensor configurations. It also enables rapid algorithm development and provides precise ground truth.
This example uses the Unreal Engine simulation environment from Epic Games® to develop and evaluate a visual localization algorithm in a parking lot scenario.
Visual localization is the process of estimating the camera pose for a captured image relative to a visual representation of a known scene. It is a key technology for applications such as augmented reality, robotics, and automated driving. Compared with visual SLAM (see Implement Visual SLAM in MATLAB), visual localization assumes that a map of the environment is already known, so it does not require 3-D reconstruction or loop closure detection. The visual localization pipeline includes the following steps:
Map Loading: Load the pre-built 3-D map containing world point positions and the 3-D to 2-D correspondences between the map points and the key frames. Additionally, for each key frame, load the feature descriptors corresponding to the 3-D map points.
Global Initialization: Extract features from the first image frame and match them with the features corresponding to all the 3-D map points. After getting the 3-D to 2-D correspondences, estimate the camera pose of the first frame in the world coordinate system by solving a Perspective-n-Point (PnP) problem. Refine the pose using motion-only bundle adjustment. The key frame that shares the most covisible 3-D map points with the first frame is identified as the reference key frame.
Tracking: Once the first frame is localized, for each new frame, match features in the new frame with features in the reference key frame that have known 3-D world points. Estimate and refine the camera pose using the same approach as in the Global Initialization step. The camera pose can be further refined by tracking the features associated with nearby key frames.
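The pose estimation at the core of the Global Initialization and Tracking steps can be sketched with synthetic data. This is a minimal illustration, not the code of the helper functions used in this example: the ground-truth pose, the random 3-D points, and all variable names are hypothetical, and the projected points stand in for real feature matches. It assumes Computer Vision Toolbox R2022b or later, where estworldpose, world2img, pose2extr, bundleAdjustmentMotion, and rigidtform3d are available.

```matlab
% Sketch only: synthetic stand-ins for matched 3-D map points and 2-D features.
rng(0);                                   % repeatable random sampling
intrinsics = cameraIntrinsics([700 700], [600 180], [370 1230]);

% Hypothetical ground-truth camera pose in world coordinates:
% Euler angles in degrees and translation in meters
truePose = rigidtform3d([0 5 0], [0.5 0.1 -0.2]);

% Random 3-D points in front of the camera (camera Z axis points forward)
numPoints   = 100;
worldPoints = [6*rand(numPoints,1)-3, 2*rand(numPoints,1)-1, 10*rand(numPoints,1)+5];

% Project the points to get the 2-D side of the 3-D to 2-D correspondences
imagePoints = world2img(worldPoints, pose2extr(truePose), intrinsics);

% Solve the Perspective-n-Point (PnP) problem with outlier rejection
[initialPose, inlierIdx] = estworldpose(imagePoints, worldPoints, intrinsics);

% Motion-only bundle adjustment: refine the pose while keeping the points fixed
refinedPose = bundleAdjustmentMotion(worldPoints(inlierIdx,:), ...
    imagePoints(inlierIdx,:), initialPose, intrinsics);

% Distance between the refined and the ground-truth camera positions
translationError = norm(refinedPose.Translation - truePose.Translation);
```

In a real pipeline, imagePoints and worldPoints would come from feature matching against the map, and the inlier set returned by the PnP solver would typically be much smaller than the set of putative matches.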
Guiding a vehicle into a parking spot is a challenging maneuver that relies on accurate localization. The VisualLocalizationInAParkingLot model simulates a visual localization system in the parking lot scenario used in the Develop Visual SLAM Algorithm Using Unreal Engine Simulation (Automated Driving Toolbox) example.
The Simulation 3D Scene Configuration (Automated Driving Toolbox) block sets up the Large Parking Lot scene. The Parked Vehicles subsystem adds parked cars into the parking lot.
The Simulation 3D Vehicle with Ground Following (Automated Driving Toolbox) block controls the motion of the ego vehicle.
The Simulation 3D Camera (Automated Driving Toolbox) block models a monocular camera fixed at the center of the vehicle's roof. You can use the Camera Calibrator app to estimate the intrinsics of the actual camera that you want to simulate.
The Helper Visual Localization MATLAB System block implements the visual localization algorithm. The initial camera pose with respect to the map is estimated using the helperGlobalInitialization function. The subsequent camera poses are estimated using the helperTrackingRefKeyFrame function and refined using the helperTrackLocalKeyFrames function. This block also provides a visualization of the estimated camera trajectory in the pre-built map. You can specify the pre-built map data and the camera intrinsic parameters in the block dialog.
% Open the model
modelName = 'VisualLocalizationInAParkingLot';
open_system(modelName);
The pre-built map data is generated using the stereo camera in the Develop Visual SLAM Algorithm Using Unreal Engine Simulation (Automated Driving Toolbox) example. The data consists of three objects that are commonly used to manage image and map data for visual SLAM:
vSetKeyFrames: an imageviewset object storing the camera poses of key frames and the associated feature points for each 3-D map point in mapPointSet.
mapPointSet: a worldpointset object storing the 3-D map point locations and the correspondences between the 3-D points and 2-D feature points across key frames. The 3-D map points provide a sparse representation of the environment.
directionAndDepth: a helperViewDirectionAndDepth object storing the view direction and depth of each map point in mapPointSet.
% Load pre-built map data
mapData = load("prebuiltMapData.mat")
mapData = struct with fields:
        vSetKeyFrames: [1×1 imageviewset]
          mapPointSet: [1×1 worldpointset]
    directionAndDepth: [1×1 helperViewDirectionAndDepth]
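To illustrate how an imageviewset and a worldpointset work together to store 3-D to 2-D correspondences, the following sketch builds a toy map with one key frame and four map points, then queries it. All coordinates and variable names here are hypothetical and unrelated to the contents of prebuiltMapData.mat. The sketch assumes Computer Vision Toolbox R2022b or later for rigidtform3d.

```matlab
% Toy map (hypothetical values): one key frame observing four 3-D points
keyFramePoints = [100 50; 200 60; 300 70; 400 80];   % 2-D feature locations [x y]
vSet = imageviewset;
vSet = addView(vSet, 1, rigidtform3d, 'Points', keyFramePoints);

% Add the 3-D map points and record which feature in view 1 observes each one
xyz   = [0 0 10; 1 0 10; 0 1 12; 1 1 12];            % 3-D locations [x y z]
wpSet = worldpointset;
[wpSet, pointIdx] = addWorldPoints(wpSet, xyz);
wpSet = addCorrespondences(wpSet, 1, pointIdx, 1:4);

% Query the map points seen in view 1 and their matching 2-D features
[idx3d, idx2d] = findWorldPointsInView(wpSet, 1);
xyzInView = wpSet.WorldPoints(idx3d, :);
uvInView  = vSet.Views.Points{1}(idx2d, :);
```

The same query pattern applies to the loaded data, for example by calling findWorldPointsInView on mapData.mapPointSet with the ID of a key frame stored in mapData.vSetKeyFrames.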
You can follow the Select Waypoints for Unreal Engine Simulation (Automated Driving Toolbox) example to select a sequence of waypoints and generate a reference trajectory for the ego vehicle. This example uses a recorded reference trajectory.
% Load reference path
refPosesData = load('parkingLotLocalizationData.mat');

% Set reference trajectory of the ego vehicle
refPosesX = refPosesData.refPosesX;
refPosesY = refPosesData.refPosesY;
refPosesT = refPosesData.refPosesT;

% Set camera intrinsics
focalLength    = [700, 700];  % specified in units of pixels
principalPoint = [600, 180];  % in pixels [x, y]
imageSize      = [370, 1230]; % in pixels [mrows, ncols]
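For experiments outside the model, the same intrinsic values can be bundled into a cameraIntrinsics object. This is a small sketch rather than a step the example requires. The K property assumes Computer Vision Toolbox R2022b or later, and the horizontal field of view computation assumes an ideal pinhole camera with no lens distortion.

```matlab
% Camera intrinsics, repeated from the setup code
focalLength    = [700, 700];   % [fx, fy] in pixels
principalPoint = [600, 180];   % [cx, cy] in pixels
imageSize      = [370, 1230];  % [mrows, ncols] in pixels

% Bundle the values into a single object
intrinsics = cameraIntrinsics(focalLength, principalPoint, imageSize);

% 3-by-3 camera matrix of the form [fx 0 cx; 0 fy cy; 0 0 1]
K = intrinsics.K;

% Horizontal field of view in degrees implied by the image width and fx
% (pinhole model assumption)
horizontalFOV = 2 * atand(imageSize(2) / (2 * focalLength(1)));
```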
Run the simulation and visualize the estimated camera trajectory in the pre-built map. The white points represent the tracked 3-D map points in the current frame. You can compare the estimated trajectory with the ground truth provided by the Simulation 3D Camera block to evaluate the localization accuracy.
if ~ispc
    error("Unreal Engine Simulation is supported only on Microsoft" + char(174) + " Windows" + char(174) + ".");
end

% Open video viewer to examine camera images
open_system([modelName, '/Video Viewer']);

% Run simulation
sim(modelName);
Close the model.
close_system([modelName, '/Video Viewer']);
close_system(modelName, 0);
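One common way to quantify localization accuracy against ground truth is the root-mean-square error (RMSE) of the estimated camera positions. The sketch below uses made-up position arrays. How the estimated and ground truth poses are logged from the model is not shown here, so gtPositions and estPositions are hypothetical placeholders for that data.

```matlab
% Hypothetical logged data: N-by-3 camera positions [x y z] in meters
gtPositions  = [0 0 1.5; 1 0 1.5; 2 0.1 1.5; 3 0.3 1.5];
estPositions = gtPositions + ...
    [0.02 -0.01 0; 0.03 0.01 0; -0.02 0.02 0; 0.01 -0.03 0];

% Euclidean position error for each frame
perFrameError = vecnorm(estPositions - gtPositions, 2, 2);

% Summary statistics over the trajectory
rmse     = sqrt(mean(perFrameError.^2));
maxError = max(perFrameError);

fprintf('Position RMSE: %.3f m, maximum error: %.3f m\n', rmse, maxError);
```

This comparison is meaningful only if both trajectories are expressed in the same coordinate system and sampled at the same frames.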
With this setup, you can rapidly iterate over different scenarios, sensor configurations, or reference trajectories and refine the visual localization algorithm before moving to real-world testing.
To select a different scenario, use the Simulation 3D Scene Configuration block. Choose from the existing prebuilt scenes or create a custom scene in the Unreal® Editor.
To create a different reference trajectory, use the helperSelectSceneWaypoints tool, as shown in the Select Waypoints for Unreal Engine Simulation example.
To alter the sensor configuration, use the Simulation 3D Camera (Automated Driving Toolbox) block. The Mounting tab provides options for specifying different sensor mounting placements. The Parameters tab provides options for modifying sensor parameters such as detection range, field of view, and resolution. You can also use the Simulation 3D Fisheye Camera (Automated Driving Toolbox) block, which provides a larger field of view.