Abstract
In this paper, we study Fenchel-Young losses, a generic way to construct convex loss functions from a convex regularizer. We provide an in-depth study of their properties in a broad setting and show that they unify many well-known loss functions. When constructed from a generalized entropy, which includes well-known entropies such as Shannon and Tsallis entropies, we show that Fenchel-Young losses induce a predictive probability distribution and develop an efficient algorithm to compute that distribution for separable entropies. We derive conditions for generalized entropies to yield a distribution with sparse support and losses with a separation margin. Finally, we present both primal and dual algorithms to learn predictive models with generic Fenchel-Young losses.
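As a rough illustration of the construction summarized above, the sketch below assumes the usual definition L_Omega(theta; y) = Omega*(theta) + Omega(y) - <theta, y>, where Omega* is the convex conjugate of a regularizer Omega restricted to the probability simplex. All function names (`softmax`, `sparsemax`, `neg_shannon`, `neg_tsallis2`, `fy_loss`) are hypothetical and chosen for illustration only. The sort-based simplex projection used for `sparsemax` is a standard routine swapped in for brevity; it is not the algorithm for separable entropies that the abstract refers to.

```python
import numpy as np


def softmax(theta):
    """Maximizer of <theta, p> + Shannon entropy of p over the simplex."""
    z = theta - np.max(theta)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()


def sparsemax(theta):
    """Euclidean projection of theta onto the simplex (sort-based).

    This is the maximizer of <theta, p> - 0.5 * ||p||^2 over the simplex.
    """
    u = np.sort(theta)[::-1]             # scores in decreasing order
    cssv = np.cumsum(u) - 1.0            # cumulative sums minus the simplex mass
    k = np.arange(1, len(theta) + 1)
    support = u - cssv / k > 0           # coordinates that stay positive
    rho = k[support][-1]                 # size of the support
    tau = cssv[rho - 1] / rho            # threshold
    return np.maximum(theta - tau, 0.0)


def neg_shannon(p):
    """Negative Shannon entropy, using the convention 0 * log 0 = 0."""
    p = p[p > 0]
    return float(np.sum(p * np.log(p)))


def neg_tsallis2(p):
    """Negative Tsallis entropy with alpha = 2: 0.5 * (||p||^2 - 1)."""
    return 0.5 * (float(np.dot(p, p)) - 1.0)


def fy_loss(theta, y, omega, predict):
    """Fenchel-Young loss Omega*(theta) + Omega(y) - <theta, y>.

    Assumes predict(theta) returns the maximizer of <theta, p> - omega(p)
    over the simplex, so that the conjugate can be evaluated at that point.
    """
    p = predict(theta)
    conjugate = float(np.dot(theta, p)) - omega(p)
    return conjugate + omega(y) - float(np.dot(theta, y))


theta = np.array([1.0, 2.0, -0.5])   # scores (illustrative values)
y = np.array([0.0, 1.0, 0.0])        # one-hot ground truth

print("softmax prediction:  ", softmax(theta))
print("logistic loss:       ", fy_loss(theta, y, neg_shannon, softmax))
print("sparsemax prediction:", sparsemax(theta))
print("sparsemax loss:      ", fy_loss(theta, y, neg_tsallis2, sparsemax))
```

With the Shannon entropy, the loss reduces to the familiar logistic (cross-entropy) loss and the prediction is dense. With the alpha = 2 Tsallis entropy, the prediction for these scores is (0, 1, 0) and the loss is exactly zero, because the correct class leads the runner-up by a margin of 1. This is a small instance of the sparse support and separation margin mentioned above.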