Provable Weak-to-Strong Generalization via Benign Overfitting

Abstract

The classic teacher-student model in machine learning posits that a strong teacher supervises a weak student to improve the student's capabilities. We instead consider the inverted situation, where a weak teacher supervises a strong student with imperfect pseudolabels. This paradigm was recently brought forth by Burns et al. '23 and termed weak-to-strong generalization. We theoretically investigate weak-to-strong generalization for binary and multilabel classification in a stylized overparameterized spiked covariance model with Gaussian covariates where the weak teacher's pseudolabels are asymptotically like random guessing. Under these assumptions, we provably identify two asymptotic phases of the strong student's generalization after weak supervision: (1) successful generalization and (2) random guessing. Our techniques should eventually extend to weak-to-strong multiclass classification. Towards doing so, we prove a tight lower tail inequality for the maximum of correlated Gaussians, which may be of independent interest. Understanding the multilabel setting reinforces the value of using logits for weak supervision when they are available.
