Safe Exploration for Optimization with Gaussian Processes
Abstract
We consider sequential decision problems under uncertainty, where we seek to optimize an unknown function from noisy samples. This requires balancing exploration (learning about the objective) and exploitation (localizing the maximum), a problem well-studied in the multi-armed bandit literature. In many applications, however, we require that the sampled function values exceed some prespecified "safety" threshold, a requirement that existing algorithms fail to meet. Examples include medical applications where patient comfort must be guaranteed, recommender systems aiming to avoid user dissatisfaction, and robotic control, where one seeks to avoid controls causing physical harm to the platform. We tackle this novel, yet rich, set of problems under the assumption that the unknown function satisfies regularity conditions expressed via a Gaussian process prior. We develop an efficient algorithm called SafeOpt, and theoretically guarantee its convergence to a natural notion of optimum reachable under safety constraints. We evaluate SafeOpt on synthetic data, as well as two real applications: movie recommendation, and therapeutic spinal cord stimulation.
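For intuition, below is a minimal, hypothetical sketch of the kind of GP-based safe exploration loop the abstract describes, written in Python with scikit-learn's GaussianProcessRegressor. The objective unknown_f, the candidate grid, the threshold h, the confidence parameter beta, and the rule of sampling the most uncertain certified-safe point are all illustrative assumptions; the published SafeOpt algorithm additionally reasons about potential maximizers and expanders of the safe set, which this sketch omits.

```python
# Sketch of a SafeOpt-style loop on a 1-D grid (illustrative, not the paper's code).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def unknown_f(x):
    # Placeholder objective; in practice this is an expensive, noisy measurement.
    return np.sin(3 * x) + 0.5 * x

rng = np.random.default_rng(0)
X_grid = np.linspace(0.0, 2.0, 200).reshape(-1, 1)   # finite decision set
h = 0.0                                               # safety threshold
beta = 2.0                                            # confidence-interval scaling
X = [np.array([[1.0]])]                               # known-safe seed input
y = [unknown_f(1.0) + 0.01 * rng.standard_normal()]   # noisy observation at the seed

# Fixed kernel (optimizer=None): regularity is assumed known via the GP prior.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.3), alpha=1e-4, optimizer=None)

for t in range(20):
    gp.fit(np.vstack(X), np.array(y))
    mu, sigma = gp.predict(X_grid, return_std=True)
    lower, upper = mu - beta * sigma, mu + beta * sigma
    safe = lower >= h                                 # inputs certified safe w.h.p.
    if not safe.any():
        break
    # Among certified-safe points, sample where the confidence interval is widest.
    idx = np.flatnonzero(safe)[np.argmax((upper - lower)[safe])]
    x_next = X_grid[idx:idx + 1]
    X.append(x_next)
    y.append(unknown_f(x_next.item()) + 0.01 * rng.standard_normal())

# Report the best certified-safe point under the last fitted model.
best = X_grid[np.argmax(np.where(safe, mu, -np.inf))]
print("estimated safe optimum near x =", best.item())
```

Keeping the kernel fixed (optimizer=None) mirrors the abstract's assumption that the function's regularity is encoded in a known GP prior rather than learned online; the acquisition rule here is a simplification chosen for brevity.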
Additional Information
© 2015 by the author(s). This work was partially supported by the Christopher and Dana Reeve Foundation, the National Institutes of Health (NIH), Swiss National Science Foundation Grant 200020_159557, and ERC Starting Grant 307036.
Attached Files
Published - sui15.pdf
Files
Name | Size | MD5 |
---|---|---|
sui15.pdf | 369.7 kB | bba6dd164534cacb9cce44a0647a2cba |
Additional details
- Eprint ID: 101447
- Resolver ID: CaltechAUTHORS:20200221-085553733
- Funders: Christopher and Dana Reeve Foundation; NIH; Swiss National Science Foundation (SNSF) 200020_159557; European Research Council (ERC) 307036
- Created: 2020-02-21 (from EPrint's datestamp field)
- Updated: 2020-02-21 (from EPrint's last_modified field)