Control Regularization for Reduced Variance Reinforcement Learning

Creators: Cheng, Richard; Verma, Abhinav; Orosz, Gábor; Chaudhuri, Swarat; Yue, Yisong; Burdick, Joel W.

Abstract

Dealing with high variance is a significant challenge in model-free reinforcement learning (RL). Existing methods are unreliable, exhibiting high variance in performance from run to run using different initializations/seeds. Focusing on problems arising in continuous control, we propose a functional regularization approach to augmenting model-free RL. In particular, we regularize the behavior of the deep policy to be similar to a policy prior, i.e., we regularize in function space. We show that functional regularization yields a bias-variance trade-off, and propose an adaptive tuning strategy to optimize this trade-off. When the policy prior has control-theoretic stability guarantees, we further show that this regularization approximately preserves those stability guarantees throughout learning. We validate our approach empirically on a range of settings, and demonstrate significantly reduced variance, guaranteed dynamic stability, and more efficient learning than deep RL alone.
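As a rough illustration of what regularizing a policy "in function space" can look like, the sketch below mixes the action proposed by a learned policy with the action proposed by a fixed policy prior. It assumes a squared-distance penalty toward the prior's action, whose minimizer is a weighted average of the two actions. The function name `mix_actions`, the weight `lam`, and this particular penalty are illustrative assumptions and are not taken from this record; the paper's actual formulation and its adaptive tuning strategy may differ.

```python
from typing import List, Sequence


def mix_actions(u_learned: Sequence[float],
                u_prior: Sequence[float],
                lam: float) -> List[float]:
    """Return the minimizer of ||u - u_learned||^2 + lam * ||u - u_prior||^2.

    Hypothetical helper for illustration only. `lam` is an assumed
    regularization weight: lam = 0 returns the learned action unchanged,
    and larger values pull the result toward the prior's action.
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    if len(u_learned) != len(u_prior):
        raise ValueError("actions must have the same dimension")
    # Closed-form minimizer: (u_learned + lam * u_prior) / (1 + lam),
    # written as a convex combination with weight w on the prior.
    w = lam / (1.0 + lam)
    return [(1.0 - w) * a + w * b for a, b in zip(u_learned, u_prior)]


# Example: equal weighting (lam = 1) lands halfway between the two actions.
mixed = mix_actions([1.0, -1.0], [0.0, 0.0], lam=1.0)
```

Under these assumptions, if the prior is deterministic, the run-to-run spread of the mixed action is the spread of the learned action scaled by 1/(1 + lam), while the result is pulled toward the prior. This gives one simple way to picture the bias-variance trade-off the abstract describes.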

Additional Information

Attached Files

Published - cheng19a.pdf

Submitted - 1905.05380.pdf

Supplemental Material - cheng19a-supp.pdf

Files

Files (3.9 MB)

Name	MD5	Size
1905.05380.pdf	24aa70ebec21d7f11a1ffa63b473559f	1.9 MB
cheng19a-supp.pdf	cdc285ebc6472cd4180449013a5243bc	829.5 kB
cheng19a.pdf	213b66f2087d6abba401118de75163af	1.2 MB
