End-to-End Optimization of LiDAR Beam Configuration for 3D Object Detection and Localization
Niclas Vödisch, Ozan Unal, Ke Li, Luc Van Gool, and Dengxin Dai
Existing learning methods for LiDAR-based applications use 3D points scanned under a pre-determined beam configuration, e.g., the elevation angles of the beams are often evenly distributed. These fixed configurations are task-agnostic, so using them directly can lead to sub-optimal performance. In this work, we take a new route and learn to optimize the LiDAR beam configuration for a given application. Specifically, we propose a reinforcement learning-based learning-to-optimize (RL-L2O) framework that automatically optimizes the beam configuration in an end-to-end manner for different LiDAR-based applications. The optimization is guided by the final performance of the target task, so our method can easily be integrated with any LiDAR-based application as a simple drop-in module. The method is especially useful when a low-resolution (low-cost) LiDAR is needed, for instance, for system deployment at massive scale. We use our method to search for the beam configuration of a low-resolution LiDAR for two important tasks: 3D object detection and localization. Experiments show that the proposed RL-L2O method significantly improves performance in both tasks compared to baseline methods. We believe that combining our method with recent advances in programmable LiDARs can open a new research direction for LiDAR-based active perception. The code is publicly available at github.com/vniclas/lidar_beam_selection.
Figure 1. Illustration of end-to-end optimization of LiDAR beam configuration for 3D object detection and localization. The 4-beam solution space is simulated via the sampling of beams from a high-resolution LiDAR. The vast search space is then efficiently traversed to find a high-performing configuration for each individual task.
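Simulating a low-resolution LiDAR by sampling beams from a high-resolution scan, as described in the Figure 1 caption, can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes each point is annotated with its beam (ring) index, and `subsample_beams` is a hypothetical helper name.

```python
import numpy as np

def subsample_beams(points, beam_ids, selected_beams):
    """Keep only the points belonging to the selected LiDAR beams.

    points:   (N, 4) array of x, y, z, intensity
    beam_ids: (N,) integer beam index per point, e.g. 0..63 for a 64-beam sensor
    selected_beams: iterable of beam indices to keep, e.g. a 4-beam subset
    """
    mask = np.isin(beam_ids, list(selected_beams))
    return points[mask]
```

With a 64-beam sensor and 4 selected beams, the search space contains C(64, 4) = 635,376 such subsets, which is why an efficient traversal strategy is needed.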
Figure 2. Illustration of our proposed reinforcement learning-based learning-to-optimize (RL-L2O) framework with two possible reward pipelines for 3D object detection and localization. (left) Block diagram of the ε-GS algorithm. In every iteration, the agent is trained using an online training set consisting of pairs of beam sets and corresponding rewards. The agent then predicts a reward for all actions and selects the action with the highest expected reward. Based on an ε-probability, the agent either applies its chosen action or explores a random state. (middle) Overview of the 3D detection pipeline used to generate a mapping from a set of LiDAR beams to a reward in R via the 3D mean AP value of a trained 3D object detector. (right) Reward design for sparse LiDAR-based localization by comparing results from ICP with ground truth poses.
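The ε-GS loop in the left panel of Figure 2 can be sketched as below. This is a hedged illustration under simplifying assumptions: the agent is modeled here as a plain linear reward predictor fit by least squares, and the task-specific reward pipeline (3D mAP of a trained detector, or an ICP-based localization score) is abstracted into a placeholder callback `evaluate_beams`.

```python
import random
from itertools import combinations

import numpy as np

def eps_gs(all_beams, k, evaluate_beams, n_iters=30, eps=0.2, seed=0):
    """Search for a k-beam subset of `all_beams` that maximizes the reward."""
    rng = random.Random(seed)
    candidates = [frozenset(c) for c in combinations(all_beams, k)]

    def featurize(beam_set):
        # one-hot encoding of the selected beams, plus a bias term
        return [1.0 if b in beam_set else 0.0 for b in all_beams] + [1.0]

    # online training set of (beam set, reward) pairs
    start = rng.choice(candidates)
    history = [(start, evaluate_beams(start))]

    for _ in range(n_iters):
        # train the agent on all pairs observed so far
        X = np.array([featurize(s) for s, _ in history])
        y = np.array([r for _, r in history])
        w, *_ = np.linalg.lstsq(X, y, rcond=None)

        if rng.random() < eps:
            # explore: jump to a random state
            nxt = rng.choice(candidates)
        else:
            # exploit: take the action with the highest predicted reward
            preds = np.array([featurize(c) for c in candidates]) @ w
            nxt = candidates[int(np.argmax(preds))]
        history.append((nxt, evaluate_beams(nxt)))

    # return the best configuration seen during the search
    return max(history, key=lambda pair: pair[1])
```

Because each reward evaluation involves training and evaluating a full downstream model, the exploration budget `n_iters` is the dominant cost in practice.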
Figure 3. Evolution of the beam selection algorithm for 3D object detection. (bottom) For the plot, please refer to the legend on the right. (top) Four cells are presented to visualize the best-performing beam configuration at different stages of the algorithm. The resulting LiDAR beams are shown on both the camera image and the respective point cloud in red. As can be seen, the algorithm slowly converges to a state where all beams lie near the horizon to cover most (moderate difficulty) car points. The respective 3D object detection results are shown for each highlighted step. The bounding boxes are colored green if the estimation is a true positive and red otherwise.
Figure 4. Evolution of the beam selection algorithm for LiDAR-based localization. (bottom) For the plot, please refer to the legend at the bottom. (top) Three cells are presented to visualize the best-performing beam configuration at different stages of the algorithm. The resulting LiDAR beams are shown on the camera image in red. The algorithm slowly converges to a state where the beams are distributed such that constraints for all three spatial dimensions are incorporated while focusing on static objects. The lower part of each cell plots the lateral error over the entire route, with blue and red denoting small and large errors, respectively.
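The localization reward in the right panel of Figure 2 compares ICP results against ground truth poses. A minimal sketch of such a comparison is given below; the exact error metric and scaling used in the paper may differ, and `pose_reward` is a hypothetical name.

```python
import numpy as np

def pose_reward(T_est, T_gt):
    """Reward from comparing an estimated 4x4 pose with ground truth.

    Returns the negative sum of translation error (meters) and rotation
    error (radians), so that smaller pose errors yield higher rewards.
    """
    T_err = np.linalg.inv(T_gt) @ T_est
    trans_err = np.linalg.norm(T_err[:3, 3])
    # rotation angle recovered from the trace of the relative rotation
    cos_angle = np.clip((np.trace(T_err[:3, :3]) - 1.0) / 2.0, -1.0, 1.0)
    rot_err = np.arccos(cos_angle)
    return -(trans_err + rot_err)
```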