Image-based indoor localization is a promising approach to enhancing facility management efficiency. However, ensuring localization accuracy and improving data accessibility remain key challenges. Therefore, this research aims to automatically localize images captured during facility inspections by matching the camera viewpoint with a corresponding viewpoint in a Building Information Modeling (BIM)-based simulated environment. In this paper, we present a framework that generates photorealistic synthetic images and trains a deep learning model for camera pose estimation. Synthetic datasets are generated in a simulation environment, allowing precise control over scene parameters, camera positions, and lighting conditions. This enables the creation of diverse and realistic training data tailored to specific facility environments. The deep learning model takes RGB images, semantic segmentation maps, and the corresponding camera poses as inputs to predict six-degree-of-freedom (6DOF) camera poses, including position and orientation. Experimental results demonstrate that the proposed approach can enable indoor image localization with an average translation error of 5.8 meters and a rotation error of 69.05 degrees.
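The following is a minimal sketch of what a 6DOF pose regression network consuming RGB images and semantic segmentation maps could look like, assuming a PyTorch-style implementation. The class name, the ResNet-18 backbone, and the quaternion-based orientation output are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
import torchvision.models as models


class PoseRegressor(nn.Module):
    """Hypothetical sketch: predicts a 6DOF camera pose (3D translation plus a
    unit quaternion) from an RGB image concatenated with its segmentation map."""

    def __init__(self, num_seg_classes: int = 1):
        super().__init__()
        self.backbone = models.resnet18(weights=None)
        # Widen the first convolution to accept RGB plus segmentation channels.
        self.backbone.conv1 = nn.Conv2d(
            3 + num_seg_classes, 64, kernel_size=7, stride=2, padding=3, bias=False
        )
        feat_dim = self.backbone.fc.in_features
        self.backbone.fc = nn.Identity()
        # Separate regression heads for position (x, y, z) and orientation (quaternion).
        self.trans_head = nn.Linear(feat_dim, 3)
        self.rot_head = nn.Linear(feat_dim, 4)

    def forward(self, rgb: torch.Tensor, seg: torch.Tensor):
        x = torch.cat([rgb, seg], dim=1)        # (B, 3 + C_seg, H, W)
        feat = self.backbone(x)                 # (B, feat_dim)
        translation = self.trans_head(feat)     # (B, 3), e.g. in meters
        rotation = nn.functional.normalize(self.rot_head(feat), dim=1)  # unit quaternion
        return translation, rotation


# Example forward pass on a dummy batch of synthetic images.
model = PoseRegressor(num_seg_classes=1)
rgb = torch.randn(2, 3, 224, 224)
seg = torch.randn(2, 1, 224, 224)
t, q = model(rgb, seg)
```

During training, the predicted translation and rotation would be compared against the ground-truth camera poses exported from the simulation environment; at inference time only the RGB image and its segmentation map are required.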
1 Department of Civil Engineering, National Taiwan University, Taiwan
2 Lab for Service Robot Systems, Delta Research Center, Taiwan