© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Current deep learning approaches for indoor 3D instance segmentation often rely on multilayer perceptrons (MLPs) for feature extraction. However, MLPs struggle to capture the complex spatial relationships inherent in 3D scene data. To address this issue, we propose TSPconv-Net, a novel and efficient end-to-end framework for 3D instance segmentation. Instead of depending primarily on MLPs, the framework uses a more robust feature extraction model that combines the offset-attention (OA) mechanism with submanifold sparse convolution (SSC). TSPconv-Net consists of a backbone network followed by a bounding box module: the backbone uses the OA mechanism to extract global features and SSC to extract local features, and the bounding box module then performs instance segmentation on the extracted features. Experimental results show that our approach outperforms existing work on the S3DIS dataset while remaining computationally efficient. On the test set, TSPconv-Net achieves 68.6% mPrec, 52.5% mRec, and 60.1% mAP, surpassing 3D-BoNet by 3.0% mPrec, 5.4% mRec, and 2.6% mAP, and it completes its computations in just 326 s.
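
To make the offset-attention idea concrete, the following is a minimal sketch of one offset-attention layer as it is commonly formulated in point cloud transformer work: attention is computed over all points, and a linear layer with ReLU is applied to the offset between the input features and the attended features before a residual add. The abstract does not give the layer's exact form, so the function name `offset_attention`, the weight shapes, the random toy input, and the omission of batch normalization are all assumptions made for illustration. Plain lists are used only to keep the sketch dependency-free.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def offset_attention(feats, w_q, w_k, w_v, w_out):
    """Hypothetical offset-attention layer over N point features of width C.

    feats: N x C; w_q, w_k: C x D; w_v, w_out: C x C (assumed shapes).
    Returns the N x C output features and the N x N attention map.
    """
    q, k, v = matmul(feats, w_q), matmul(feats, w_k), matmul(feats, w_v)
    energy = matmul(q, transpose(k))  # N x N similarity scores

    # Softmax down each column (over the first index), shifted for stability.
    cols = []
    for col in zip(*energy):
        peak = max(col)
        exps = [math.exp(e - peak) for e in col]
        total = sum(exps)
        cols.append([e / total for e in exps])
    attn = transpose(cols)

    # L1-normalise each row so every point's attention weights sum to 1.
    attn = [[a / sum(row) for a in row] for row in attn]

    f_sa = matmul(attn, v)  # attended (global) features, N x C
    offset = [[x - s for x, s in zip(fr, sr)] for fr, sr in zip(feats, f_sa)]

    # Linear + ReLU on the offset (batch norm omitted here), then residual add.
    hidden = [[max(0.0, h) for h in row] for row in matmul(offset, w_out)]
    out = [[x + h for x, h in zip(fr, hr)] for fr, hr in zip(feats, hidden)]
    return out, attn


def random_matrix(rows, cols, rng):
    """Small random matrix standing in for learned weights or point features."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_points, channels, qk_dim = 5, 4, 2  # toy sizes, not taken from the paper
feats = random_matrix(n_points, channels, rng)
out, attn = offset_attention(
    feats,
    random_matrix(channels, qk_dim, rng),
    random_matrix(channels, qk_dim, rng),
    random_matrix(channels, channels, rng),
    random_matrix(channels, channels, rng),
)
print(len(out), len(out[0]))
```

Because every point attends to every other point, each output row mixes information from the whole scene, which is the sense in which such a layer supplies global features; a sparse convolution would complement it by aggregating only over occupied neighboring voxels.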

Details

Title
TSPconv-Net: Transformer and Sparse Convolution for 3D Instance Segmentation in Point Clouds
Author
Ning, Xiaojuan 1; Yule, Liu 2; Ma, Yishu 2; Lu, Zhiwei 2; Jin, Haiyan 1; Shi, Zhenghao 1; Wang, Yinghui 3

1 Institute of Computer Science and Engineering, Xi’an University of Technology, No. 5 South of Jinhua Road, Xi’an 710048, China; [email protected] (Y.L.); [email protected] (Y.M.); [email protected] (Z.L.); [email protected] (H.J.); [email protected] (Z.S.); Shaanxi Key Laboratory of Network Computing and Security Technology, Xi’an 710048, China
2 Institute of Computer Science and Engineering, Xi’an University of Technology, No. 5 South of Jinhua Road, Xi’an 710048, China; [email protected] (Y.L.); [email protected] (Y.M.); [email protected] (Z.L.); [email protected] (H.J.); [email protected] (Z.S.)
3 School of Artificial Intelligence and Computer Science, Jiangnan University, 1800 of Lihu Road, Wuxi 214122, China; [email protected]
First page
2926
Publication year
2024
Publication date
2024
Publisher
MDPI AG
e-ISSN
2227-7390
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3110582467