Unconstrained Robust Online Convex Optimization

Abstract

This paper addresses online learning with ``corrupted'' feedback. Our learneris provided with potentially corrupted gradients $\tilde g_t$ instead of the``true'' gradients $g_t$. We make no assumptions about how the corruptionsarise: they could be the result of outliers, mislabeled data, or even maliciousinterference. We focus on the difficult ``unconstrained'' setting in which ouralgorithm must maintain low regret with respect to any comparison point $u \in\mathbb{R}^d$. The unconstrained setting is significantly more challenging asexisting algorithms suffer extremely high regret even with very tiny amounts ofcorruption (which is not true in the case of a bounded domain). Our algorithmsguarantee regret $ \|u\|G (\sqrt{T} + k) $ when $G \ge \max_t \|g_t\|$ isknown, where $k$ is a measure of the total amount of corruption. When $G$ isunknown we incur an extra additive penalty of $(\|u\|^2+G^2) k$.

Quick Read (beta)

loading the full paper ...