Recognizing Scenes from Novel Viewpoints

Creators: Qian, Shengyi; Kirillov, Alexander; Ravi, Nikhila; Chaplot, Devendra Singh; Johnson, Justin; Fouhey, David F.; Gkioxari, Georgia

Abstract

Humans can perceive scenes in 3D from a handful of 2D views. For AI agents, the ability to recognize a scene from any viewpoint given only a few images enables them to efficiently interact with the scene and its objects. In this work, we attempt to endow machines with this ability. We propose a model which takes as input a few RGB images of a new scene and recognizes the scene from novel viewpoints by segmenting it into semantic categories, all without access to the RGB images from those views. We pair 2D scene recognition with an implicit 3D representation and learn from multi-view 2D annotations of hundreds of scenes without any 3D supervision beyond camera poses. We experiment on challenging datasets and demonstrate our model's ability to jointly capture semantics and geometry of novel scenes with diverse layouts, object types and shapes.

Additional Information

We thank Shuaifeng Zhi for his help with Semantic-NeRF, Oleksandr Maksymets and Yili Zhao for their help with AI-Habitat, and Ang Cao, Chris Rockwell, Linyi Jin, and Nilesh Kulkarni for helpful discussions.

Attached Files

Submitted - 2112.01520.pdf

Files (36.9 MB)

Name	Size
2112.01520.pdf (md5:d8b6b255d3b447b6462a1f146dad9182)	36.9 MB
