Abstract
Large language models are being rapidly deployed across many fields such as healthcare, finance, transportation, and energy, where time-series data are fundamental components. The current works are still limited in their ability to perform reasoning that involves both time-series and the corresponding textual content. We address this gap by introducing Chat-TS, a large language model (LLM) based framework designed to support reasoning over time series and textual data. Unlike traditional models, Chat-TS integrates time-series tokens into LLMs' vocabulary, enhancing its reasoning ability over both modalities without compromising core natural language capabilities. To support learning and evaluation, we contribute new datasets: the TS Instruct Training Dataset (pairing diverse time-series data with relevant text instructions and responses for instruction tuning), the TS Instruct Question and Answer (QA) Gold Dataset (multiple-choice questions to evaluate multimodal reasoning), and a TS Instruct Quantitative Probing Set (a small subset of TS Instruct QA reasoning tasks alongside math and decision-making questions for LLM evaluation). We design a training strategy to preserve the inherent reasoning capabilities of LLMs while augmenting them for time-series reasoning. Experiments show that Chat-TS achieves state-of-the-art performance in multimodal reasoning tasks by maintaining strong natural language proficiency while improving time-series reasoning.