https://scholars.lib.ntu.edu.tw/handle/123456789/632858
Title: | 3D semantic segmentation based on spatial-aware convolution and shape completion for augmented reality applications | Authors: | Guo, Yun Chih; Weng, Tzu Hsuan; Fischer, Robin; LI-CHEN FU |
Keywords: | Augmented reality | Deep learning | Magic leap | Scene understanding | Semantic segmentation | Publication date: | 1 November 2022 | Publisher: | ACADEMIC PRESS INC ELSEVIER SCIENCE | Volume: | 224 | Source publication: | Computer Vision and Image Understanding | Abstract: | 3D semantic segmentation of indoor scenes is a popular research topic in the field of computer vision. For many applications, it is very important to know exactly what category each point in the scene belongs to. Benefiting from the development of deep learning, many neural networks based on voxels and points have been proposed to solve these segmentation problems. However, most of them do not fully consider the information of the spatial structure. Current voxel-based sparse convolutional neural networks can effectively extract 3D features in space. However, they assume that the feature in the empty space is zero, causing a loss of information in the spatial structure. In this paper, we propose a system that uses scene point clouds with color information to semantically segment an entire indoor scene. Based on the sparsity of spatial data, we design a novel spatial-aware sparse convolution operation. We encode the spatial information of the object's existence as an additional feature and use the self-attention mechanism to effectively aggregate features. In addition, we introduce a completion network to refine the results from the segmentation network, so that each object in the scene is fitted into a more reasonable and complete shape. Through the above two methods, we build an accurate scene semantic segmentation network to obtain the semantic information of the entire scene. In the experimental part, we use two public datasets to perform quantitative and qualitative analysis. We compare our results with other state-of-the-art methods to prove the superiority of our method. Our models are also examined under different configurations to assure the effectiveness of the proposed method. 
Finally, the semantic segmentation model was integrated into a real-world application to demonstrate its usefulness. We expect that the proposed 3D scene semantic segmentation system can provide accurate and fast results for practical applications. |
URI: | https://scholars.lib.ntu.edu.tw/handle/123456789/632858 | ISSN: | 10773142 | DOI: | 10.1016/j.cviu.2022.103550 |
Appears in collections: | Department of Computer Science and Information Engineering |
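The abstract's idea of encoding "the spatial information of the object's existence as an additional feature" can be illustrated with a small sketch. This is not the authors' implementation: the 3x3x3 neighbourhood, the binary occupancy flags, the voxel size, and all function names (`voxelize`, `occupancy_mask`, `spatially_aware_features`) are assumptions made for illustration only. The self-attention aggregation and the completion network described in the abstract are not shown.

```python
from itertools import product

# All 27 offsets of a 3x3x3 kernel, centre included (assumed kernel size).
OFFSETS = list(product((-1, 0, 1), repeat=3))


def voxelize(points, voxel_size):
    """Map (x, y, z, r, g, b) points to integer voxels, averaging colour per voxel."""
    sums = {}
    for x, y, z, r, g, b in points:
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        acc = sums.setdefault(key, [0.0, 0.0, 0.0, 0])
        acc[0] += r
        acc[1] += g
        acc[2] += b
        acc[3] += 1
    return {k: (v[0] / v[3], v[1] / v[3], v[2] / v[3]) for k, v in sums.items()}


def occupancy_mask(voxels, key):
    """Return 27 flags: 1.0 where the neighbouring voxel is occupied, else 0.0."""
    i, j, k = key
    return [1.0 if (i + di, j + dj, k + dk) in voxels else 0.0
            for di, dj, dk in OFFSETS]


def spatially_aware_features(voxels):
    """Append the occupancy mask to each voxel's colour (hypothetical encoding)."""
    return {key: list(color) + occupancy_mask(voxels, key)
            for key, color in voxels.items()}


# Three coloured points: two fall in adjacent voxels, one is isolated.
points = [
    (0.01, 0.02, 0.03, 1.0, 0.0, 0.0),
    (0.06, 0.02, 0.03, 0.0, 1.0, 0.0),
    (0.32, 0.32, 0.32, 0.0, 0.0, 1.0),
]
features = spatially_aware_features(voxelize(points, voxel_size=0.05))
```

In this sketch an empty neighbour contributes an explicit 0.0 occupancy flag. A plain sparse convolution would instead treat its feature as zero, so a downstream layer could not tell "nothing here" from "a zero-valued feature".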
Items in the IR system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.