Learning Causal State Representations of Partially Observable Environments
Abstract
Intelligent agents can cope with sensory-rich environments by learning task-agnostic state abstractions. In this paper, we propose mechanisms to approximate causal states, which optimally compress the joint history of actions and observations in partially-observable Markov decision processes. Our proposed algorithm extracts causal state representations from RNNs that are trained to predict subsequent observations given the history. We demonstrate that these learned task-agnostic state abstractions can be used to efficiently learn policies for reinforcement learning problems with rich observation spaces. We evaluate agents using multiple partially observable navigation tasks with both discrete (GridWorld) and continuous (VizDoom, ALE) observation processes that cannot be solved by traditional memory-limited methods. Our experiments demonstrate systematic improvement of the DQN and tabular models using approximate causal state representations with respect to recurrent-DQN baselines trained with raw inputs.
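As a rough illustration of the idea in the abstract, causal states can be thought of as groups of histories that lead to the same predictions about what is observed next. The sketch below is not the paper's algorithm. It assumes a hypothetical dictionary `predictions` that maps each action-observation history to a next-observation distribution (standing in for the output of a trained predictor such as an RNN), it uses only the one-step prediction as a simplification, and it merges histories by rounding probabilities, which is a crude stand-in for a proper clustering or discretization step.

```python
from collections import defaultdict


def group_histories(predictions, decimals=2):
    """Group histories whose predicted next-observation distributions agree.

    `predictions` (hypothetical input) maps a history label to a list of
    probabilities over the next observation. Two histories fall into the
    same group when their distributions match after rounding to `decimals`
    places. Each group approximates one causal state.
    """
    groups = defaultdict(list)
    for history, dist in predictions.items():
        # Rounded distribution acts as the key of the equivalence class.
        key = tuple(round(p, decimals) for p in dist)
        groups[key].append(history)
    # Sort for a deterministic result that does not depend on dict order.
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    # Toy predictions: two histories predict (almost) the same thing,
    # so they collapse into one approximate causal state.
    toy = {
        "left,wall": [0.901, 0.099],
        "right,wall": [0.899, 0.101],
        "left,door": [0.2, 0.8],
    }
    for state_id, members in enumerate(group_histories(toy)):
        print(state_id, members)
```

A downstream policy (tabular or DQN, as in the abstract) would then act on the group index rather than on the raw history, which is what makes the representation compact.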
Additional Information
Part of this work was supported by the National Science Foundation (grant number CCF-1317433), C-BRIC (one of six centers in JUMP, a Semiconductor Research Corporation (SRC) program sponsored by DARPA), and the Intel Corporation. A. Anandkumar is supported in part by the Bren endowed chair, DARPA PAI, Raytheon, and Microsoft, Google, and Adobe faculty fellowships. K. Azizzadenesheli is supported in part by NSF Career Award CCF-1254106 and AFOSR YIP FA9550-15-1-0221; this work was done while he was visiting Caltech. The authors affirm that the views expressed herein are solely their own, and do not represent the views of the United States government or any agency thereof.
Attached Files
Submitted - 1906.10437.pdf
Files
Name | Size |
---|---|
1906.10437.pdf (md5:51870aebf1ee84c72e313713c47cfd2b) | 4.1 MB |
Additional details
- Eprint ID: 98452
- Resolver ID: CaltechAUTHORS:20190905-154244448
- NSF: CCF-1317433
- Semiconductor Research Corporation
- Defense Advanced Research Projects Agency (DARPA)
- Intel
- Bren Professor of Computing and Mathematical Sciences
- Raytheon Company
- Microsoft Faculty Fellowship
- Google Faculty Research Award
- Adobe
- NSF: CCF-1254106
- Air Force Office of Scientific Research (AFOSR): FA9550-15-1-0221
- Created: 2019-09-06 (from EPrint's datestamp field)
- Updated: 2023-06-02 (from EPrint's last_modified field)