Abstract
Multi-horizon forecasting problems often contain a complex mix of inputs -- including static (i.e. time-invariant) covariates, known future inputs, and other exogenous time series that are only observed historically -- without any prior information on how they interact with the target. While several deep learning models have been proposed for multi-step prediction, they typically comprise black-box models which do not account for the full range of inputs present in common scenarios. In this paper, we introduce the Temporal Fusion Transformer (TFT) -- a novel attention-based architecture which combines high-performance multi-horizon forecasting with interpretable insights into temporal dynamics. To learn temporal relationships at different scales, the TFT utilizes recurrent layers for local processing and interpretable self-attention layers for learning long-term dependencies. The TFT also uses specialized components for the judicious selection of relevant features and a series of gating layers to suppress unnecessary components, enabling high performance in a wide range of regimes. On a variety of real-world datasets, we demonstrate significant performance improvements over existing benchmarks, and showcase three practical interpretability use-cases of TFT.