VXP: Voxel-Cross-Pixel Large-scale Image-LiDAR Place Recognition

Technical University of Munich, Munich Center for Machine Learning, Microsoft
arXiv 2024

Voxel-Cross-Pixel (VXP) effectively maps data from different modalities into the same shared feature space, enabling more robust and flexible place recognition. The videos above show different types of retrievals performed by VXP.

Abstract

Recent works on global place recognition treat the task as a retrieval problem, where an off-the-shelf global descriptor is commonly designed for image-based or LiDAR-based modalities. However, accurate image-LiDAR global place recognition is non-trivial, since extracting consistent and robust global descriptors from different domains (2D images and 3D point clouds) is challenging. To address this issue, we propose a novel Voxel-Cross-Pixel (VXP) approach, which establishes voxel-pixel correspondences in a self-supervised manner and brings both modalities into a shared feature space. Specifically, VXP is trained in two stages: the first explicitly exploits local feature correspondences, and the second enforces similarity of global descriptors. Extensive experiments on three benchmarks (Oxford RobotCar, ViViD++ and KITTI) demonstrate that our method surpasses state-of-the-art cross-modal retrieval by a large margin. The code will be publicly available.
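
Once image and LiDAR descriptors live in the same space, retrieval reduces to nearest-neighbor search across modalities. The following is a minimal Python sketch of that lookup, not the authors' implementation: it assumes global descriptors are fixed-length vectors that are L2-normalized so retrieval becomes cosine-similarity search, and the encoder outputs, dimensions, and variable names are hypothetical placeholders.

import numpy as np

def build_database(descriptors: np.ndarray) -> np.ndarray:
    """L2-normalize an (N, D) matrix of global descriptors so that
    dot products equal cosine similarity."""
    norms = np.linalg.norm(descriptors, axis=1, keepdims=True)
    return descriptors / np.clip(norms, 1e-12, None)

def retrieve(query: np.ndarray, database: np.ndarray, top_k: int = 5) -> np.ndarray:
    """Return indices of the top-k database entries most similar to the query.
    Because both sides live in the same shared space, the query may come from
    one modality (e.g., an image) while the database comes from another
    (e.g., LiDAR submaps)."""
    q = query / max(np.linalg.norm(query), 1e-12)
    scores = database @ q  # cosine similarity against every entry
    return np.argsort(-scores)[:top_k]

# Hypothetical usage: in practice the descriptors would come from the
# trained image and voxel encoders; random vectors stand in here.
rng = np.random.default_rng(0)
lidar_db = build_database(rng.standard_normal((1000, 256)))  # 1000 LiDAR descriptors
image_query = rng.standard_normal(256)                       # one image descriptor
print(retrieve(image_query, lidar_db, top_k=5))

Normalizing once at database-build time keeps the per-query cost to a single matrix-vector product, which is why shared-space retrieval scales well to large maps.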

Robust Retrieval at Night

In this situation (query: night, database: evening), 2D-2D place recognition fails. However, 3D-2D place recognition still works robustly.

Different Lighting Conditions

Sometimes, 2D-2D place recognition fails to retrieve the most precise image location in the database (query: day1, database: evening). However, 2D-3D place recognition can provide more accurate and consistent retrievals even in the presence of lighting changes.

BibTeX

@article{li2024vxp,
  title={VXP: Voxel-Cross-Pixel Large-scale Image-LiDAR Place Recognition},
  author={Li, Yun-Jin and Gladkova, Mariia and Xia, Yan and Wang, Rui and Cremers, Daniel},
  journal={arXiv preprint arXiv:2403.14594},
  year={2024}
}