Towards Foundation Models for Learning on Tabular Data

  • 2023-10-11 10:37:38
  • Han Zhang, Xumeng Wen, Shun Zheng, Wei Xu, Jiang Bian

Abstract

Learning on tabular data underpins numerous real-world applications. Despite considerable efforts in developing effective learning models for tabular data, current transferable tabular models remain in their infancy, limited by either the lack of support for direct instruction following in new tasks or the neglect of acquiring foundational knowledge and capabilities from diverse tabular datasets. In this paper, we propose Tabular Foundation Models (TabFMs) to overcome these limitations. TabFMs harness the potential of generative tabular learning, employing a pre-trained large language model (LLM) as the base model and fine-tuning it using purpose-designed objectives on an extensive range of tabular datasets. This approach endows TabFMs with a profound understanding and universal capabilities essential for learning on tabular data. Our evaluations underscore TabFM's effectiveness: not only does it significantly excel in instruction-following tasks like zero-shot and in-context inference, but it also showcases performance that approaches, and in instances, even transcends, the renowned yet mysterious closed-source LLMs like GPT-4. Furthermore, when fine-tuning with scarce data, our model achieves remarkable efficiency and maintains competitive performance with abundant training data. Finally, while our results are promising, we also delve into TabFM's limitations and potential opportunities, aiming to stimulate and expedite future research on developing more potent TabFMs.

 
