Abstract
Confidence bounds are an essential tool for rigorously quantifying theuncertainty of predictions. In this capacity, they can inform theexploration-exploitation trade-off and form a core component in many sequentiallearning and decision-making algorithms. Tighter confidence bounds give rise toalgorithms with better empirical performance and better performance guarantees.In this work, we use martingale tail bounds and finite-dimensionalreformulations of infinite-dimensional convex programs to establish newconfidence bounds for sequential kernel regression. We prove that our newconfidence bounds are always tighter than existing ones in this setting. Weapply our confidence bounds to the kernel bandit problem, where future actionsdepend on the previous history. When our confidence bounds replace existingones, the KernelUCB (GP-UCB) algorithm has better empirical performance, amatching worst-case performance guarantee and comparable computational cost.Our new confidence bounds can be used as a generic tool to design improvedalgorithms for other kernelised learning and decision-making problems.