Graph Convolutional Networks for 3D Scene Reconstruction
DOI: https://doi.org/10.7492/zzbe5s19

Abstract
We introduce a novel framework for holistic 3D scene reconstruction from a single RGB image, aiming to jointly infer object geometry, object poses, and the global scene layout. Due to the severe ambiguity of monocular 3D perception, prior approaches often struggle to accurately recover object shapes and spatial layouts, particularly in cluttered environments with heavy inter-object occlusions. To address these challenges, we leverage recent advances in deep learning representations. Specifically, we design an image-driven, locally structured graph network to enhance object-level shape reconstruction, and we further improve 3D object pose estimation and scene layout reasoning through a new implicit scene graph neural network that effectively aggregates local object features. Comprehensive experiments on the SUN RGB-D and Pix3D datasets demonstrate that our approach consistently surpasses state-of-the-art methods in object shape reconstruction, scene layout estimation, and 3D object detection.
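The abstract does not spell out how the scene graph network aggregates per-object features, so the following is only a minimal sketch under stated assumptions: nodes are detected objects plus one layout node, the graph is fully connected, and a single convolution step mixes each node's own feature with the mean of its neighbors' features. The class name SceneGraphConv, the feature dimension, and the toy graph are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class SceneGraphConv(nn.Module):
    """One graph-convolution step over a hypothetical scene graph.

    Nodes: detected objects plus one global layout node.
    Edges: fully connected (every node attends to every other node).
    Update: transform of the node's own feature plus a transform of the
    mean of its neighbors' features, followed by a ReLU.
    """
    def __init__(self, dim: int):
        super().__init__()
        self.self_fc = nn.Linear(dim, dim)    # transforms the node's own feature
        self.neigh_fc = nn.Linear(dim, dim)   # transforms the aggregated neighbor feature
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x:   (N, dim) per-node features (e.g. appearance + coarse pose cues)
        # adj: (N, N) 0/1 adjacency matrix with zero diagonal
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)  # avoid division by zero
        neigh = adj @ x / deg                               # mean over neighbors
        return self.act(self.self_fc(x) + self.neigh_fc(neigh))

# Toy usage: 5 object nodes + 1 layout node on a fully connected graph.
N, dim = 6, 128
x = torch.randn(N, dim)
adj = torch.ones(N, N) - torch.eye(N)
layer = SceneGraphConv(dim)
refined = layer(x, adj)   # (6, 128) refined node features for pose / layout heads
```

In such a design, stacking a few of these layers lets information about occluding and supporting objects flow into each node before separate heads regress object poses and the scene layout; the actual layer count and aggregation rule used in the paper may differ.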














