Fine-Tuning a Time Series Foundation Model with Wasserstein Loss

Abstract

Inspired by recent advancements in large language models (LLMs) for Natural Language Processing (NLP), there has been a surge in research focused on developing foundational models for time series forecasting. One approach involves training LLM architectures on tokenized time series data using cross-entropy loss. Although this method has demonstrated promising results, cross-entropy loss is primarily designed for classification tasks and does not account for the distance between classes. To address this limitation, we propose using the Wasserstein loss for such architectures. To validate our approach, we fine-tuned a foundational time series model on $22$ zero-shot datasets, comparing the performance of cross-entropy loss with that of Wasserstein loss. Our results demonstrate that replacing cross-entropy loss with Wasserstein loss significantly improves point estimation.
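To make the point about class distance concrete, the following is a minimal sketch, not the paper's implementation. It assumes the tokenizer maps values to ordered, equally spaced bins and that the target is a one-hot distribution on the true bin; under those assumptions the 1-D Wasserstein-1 distance reduces to the summed absolute difference of the two cumulative distributions. The function names, the `bin_width` parameter, and the toy distributions are illustrative choices.

```python
import numpy as np

def cross_entropy(probs, target_idx):
    # Depends only on the probability assigned to the true bin.
    return -np.log(probs[target_idx])

def wasserstein1(probs, target_idx, bin_width=1.0):
    # Assumption: bins are ordered and equally spaced, target is one-hot.
    # In 1-D, W1 equals the summed absolute gap between the two CDFs.
    target = np.zeros_like(probs)
    target[target_idx] = 1.0
    cdf_gap = np.cumsum(probs) - np.cumsum(target)
    return bin_width * np.abs(cdf_gap).sum()

target_idx = 3
# Both predictions put 0.5 on the true bin; they differ in where the
# remaining mass sits (adjacent bin vs. a bin three steps away).
near = np.array([0.0, 0.0, 0.5, 0.5, 0.0])
far = np.array([0.5, 0.0, 0.0, 0.5, 0.0])

ce_near, ce_far = cross_entropy(near, target_idx), cross_entropy(far, target_idx)
w_near, w_far = wasserstein1(near, target_idx), wasserstein1(far, target_idx)

print(f"cross-entropy: near={ce_near:.3f}, far={ce_far:.3f}")
print(f"wasserstein-1: near={w_near:.3f}, far={w_far:.3f}")
```

In this toy case cross-entropy assigns the same loss to both predictions, while the Wasserstein distance penalizes the one whose misplaced mass lies farther from the true bin.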
