Published 2016 | Submitted + Supplemental Material + Published
Journal Article Open

Batched bandit problems

Abstract

Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. We propose a simple policy, and show that a very small number of batches gives close to minimax optimal regret bounds. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.

Additional Information

© 2016 Institute of Mathematical Statistics. Received May 2015; revised August 2015. Supported by ANR Grant ANR-13-JS01-0004. Supported by NSF Grants DMS-13-17308 and CAREER. Supported by NSF Grant SES-1156154.

Attached Files

Published - euclid.aos.1458245731.pdf

Submitted - 1505.00369v3.pdf

Supplemental Material - euclid.aos.1458245731_si.pdf

Additional details

Created: August 20, 2023
Modified: October 18, 2023