Bipartite Ranking From Multiple Labels: On Loss Versus Label Aggregation

Abstract

Bipartite ranking is a fundamental supervised learning problem, with the goal of learning a ranking over instances with maximal Area Under the ROC Curve (AUC) against a single binary target label. However, one may often observe multiple binary target labels, e.g., from distinct human annotators. How can one synthesize such labels into a single coherent ranking? In this work, we formally analyze two approaches to this problem -- loss aggregation and label aggregation -- by characterizing their Bayes-optimal solutions. We show that while both approaches can yield Pareto-optimal solutions, loss aggregation can exhibit label dictatorship: one can inadvertently (and undesirably) favor one label over others. This suggests that label aggregation can be preferable to loss aggregation, which we empirically verify.
