Embodied Question Answering

Creators: Das, Abhishek; Datta, Samyak; Gkioxari, Georgia; Lee, Stefan; Parikh, Devi; Batra, Dhruv

Abstract

We present a new AI task -- Embodied Question Answering (EmbodiedQA) -- where an agent is spawned at a random location in a 3D environment and asked a question ("What color is the car?"). In order to answer, the agent must first intelligently navigate to explore the environment, gather information through first-person (egocentric) vision, and then answer the question ("orange"). This challenging task requires a range of AI skills -- active perception, language understanding, goal-driven navigation, commonsense reasoning, and grounding of language into actions. In this work, we develop the environments, end-to-end-trained reinforcement learning agents, and evaluation protocols for EmbodiedQA.

Additional Information

We are grateful to the developers of PyTorch [38] for building an excellent framework. We thank Yuxin Wu for help with the House3D environment. This work was funded in part by NSF CAREER awards to DB and DP, ONR YIP awards to DP and DB, ONR Grant N00014-14-1-0679 to DB, ONR Grant N00014-16-1-2713 to DP, an Allen Distinguished Investigator award to DP from the Paul G. Allen Family Foundation, Google Faculty Research Awards to DP and DB, Amazon Academic Research Awards to DP and DB, AWS in Education Research grant to DB, and NVIDIA GPU donations to DB. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the U.S. Government, or any sponsor.

Additional details

Resource type: Discussion Paper
Publisher: arXiv