Large Language Models (LLMs) on Tabular Data: Prediction, Generation, and Understanding -- A Survey

Abstract

Recent breakthroughs in large language modeling have facilitated rigorous exploration of its application in diverse tasks related to tabular data modeling, such as prediction, tabular data synthesis, question answering, and table understanding. Each task presents unique challenges and opportunities. However, there is currently no comprehensive review that summarizes and compares the key techniques, metrics, datasets, models, and optimization approaches in this research domain. This survey aims to address this gap by consolidating recent progress in these areas, offering a thorough survey and taxonomy of the datasets, metrics, and methodologies utilized. It identifies strengths, limitations, unexplored territories, and gaps in the existing literature, while providing some insights for future research directions in this vital and rapidly evolving field. It also provides references to relevant code and datasets. Through this comprehensive review, we hope to provide interested readers with pertinent references and insightful perspectives, empowering them with the necessary tools and knowledge to effectively navigate and address the prevailing challenges in the field.
