On the Power of Foundation Models

Abstract

Given infinitely many high-quality data points, infinite computational power, an infinitely large foundation model with a perfect training algorithm, and guaranteed zero generalization error on the pretext task, can the model be used for everything? This question cannot be answered by the existing theory of representation, optimization, or generalization, because the issues they mainly investigate are assumed to be nonexistent here. In this paper, we show that category theory provides powerful machinery to answer this question. We prove three results. The first limits the power of prompt-based learning: the model can solve a downstream task with prompts if and only if the task is representable. The second says that fine-tuning does not have this limit, as a foundation model with the minimum required power (up to symmetry) can theoretically solve downstream tasks for the category defined by the pretext task, given fine-tuning and enough resources. Our final result can be seen as a new type of generalization theorem, showing that the foundation model can generate unseen objects from the target category (e.g., images) using the structural information from the source category (e.g., texts). Along the way, we provide a categorical framework for supervised and self-supervised learning, which might be of independent interest.
