StructLM: Towards Building Generalist Models for Structured Knowledge Grounding

Abstract

Structured data sources, such as tables, graphs, and databases, are ubiquitous knowledge sources. Despite the demonstrated capabilities of large language models (LLMs) on plain text, their proficiency in interpreting and utilizing structured data remains limited. Our investigation reveals a notable deficiency in LLMs' ability to process structured data, e.g., ChatGPT lags behind state-of-the-art (SoTA) models by an average of 35%. To augment the Structured Knowledge Grounding (SKG) capabilities in LLMs, we have developed a comprehensive instruction tuning dataset comprising 1.1 million examples. Utilizing this dataset, we train a series of models, referred to as StructLM, based on the Code-LLaMA architecture, ranging from 7B to 34B parameters. Our StructLM series surpasses task-specific models on 14 out of 18 evaluated datasets and establishes new SoTA achievements on 7 SKG tasks. Furthermore, StructLM demonstrates strong generalization across 6 novel held-out SKG tasks, outperforming TableLlama by an average of 35% and Flan-UL2 20B by an average of 10%. Contrary to expectations, we observe that scaling model size offers marginal benefits, with StructLM-34B showing only slight improvements over StructLM-7B. This suggests that structured knowledge grounding is still a challenging task and requires more innovative design to push to a new level.
