Published July 2015
| Published
Journal Article
Open
Batched Bandit Problems
Chicago
Abstract
Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic multi-armed bandits under the constraint that the employed policy must split trials into a small number of batches. Our results show that a very small number of batches gives already close to minimax optimal regret bounds and we also evaluate the number of trials in each batch. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.
Additional Information
© 2015 V. Perchet, P. Rigollet, S. Chassang & E. Snowberg. Philippe Rigollet acknowledges the support of NSF grants DMS-1317308, CAREER-DMS-1053987, the Howard B. Wentz Jr. Junior Faculty award and the Meimaris family. Vianney Perchet received support from the French National Research Agency (ANR) Project GAGA: ANR-13-JS01-0004-01. Sylvain Chassang and Erik Snowberg acknowledge support from NSF grant SES-1156154.Attached Files
Published - Perchet15.pdf
Files
Perchet15.pdf
Files
(139.3 kB)
Name | Size | Download all |
---|---|---|
md5:7bcecae41e727c4e68b0abd2b9b3f067
|
139.3 kB | Preview Download |
Additional details
- Eprint ID
- 101457
- Resolver ID
- CaltechAUTHORS:20200221-110809185
- NSF
- DMS-1317308
- NSF
- DMS-1053987
- Howard B. Wentz Jr. Junior Faculty Award
- Meimaris Family
- Agence Nationale pour la Recherche (ANR)
- ANR-13-JS01-0004-01
- NSF
- SES-1156154
- Created
-
2020-02-21Created from EPrint's datestamp field
- Updated
-
2020-02-21Created from EPrint's last_modified field