Abstract: We consider the adversarial multi-armed bandit problem with partial feedback, minimizing a non-negative loss function using the graph based feedback framework introduced by Mannor and Shamir in 2011. We offer algorithms that attain small loss bounds, as well as low approximate regret against a shifting comparator.
Classical learning algorithms add a low level of uniform noise to the algorithm’s choice to limit the variance of the loss estimator used in importance sampling, which also helps the algorithm to shift to a new arm fast enough when the comparator changes. However, such an approach poses significant hurdles to proving small-loss or low approximate regret bounds. We show that a different general technique of freezing arms, rather than adding random noise, does much better in this setting. The idea of freezing arms was proposed by Allenberg et al. in 2006 in the context of bandit learning with multiplicative weights. We show the broad applicability of this technique by extending it to partial information feedback (via a novel dual freezing thresholding technique), and to shifting comparators.
Join work with Thodoris Lykouris and Karthik Sridharan
Bio: Eva Tardos is a Jacob Gould Schurman Professor of Computer Science at Cornell University, was Computer Science department chair 2006-2010. She received her BA and PhD from Eotvos University in Budapest. She joined the faculty at Cornell in 1989. Tardos’s research interest is algorithms and algorithmic game theory. She is most known for her work on network-flow algorithms and quantifying the efficiency of selfish routing. She has been elected to the National Academy of Engineering, the National Academy of Sciences, the American Academy of Arts and Sciences, and is an external member of the Hungarian Academy of Sciences. She is the recipient of a number of fellowships and awards including the Packard Fellowship, the Goedel Prize, Dantzig Prize, Fulkerson Prize, ETACS prize, and the IEEE Technical Achievement Award. She is editor editor-in-Chief of the Journal of the ACM, and was editor in the past of several other journals including the SIAM Journal of Computing, and Combinatorica, served as problem committee member for many conferences, and was program committee chair for SODA’96, FOCS’05, and EC’13.