Towards Effective Time-Aware Language Representation: Exploring Enhanced Temporal Understanding in Language Models

  • 2025-03-05 16:27:57
  • Jiexin Wang, Adam Jatowt, Yi Cai

Abstract

In the evolving field of Natural Language Processing (NLP), understanding the temporal context of text is increasingly critical for applications requiring advanced temporal reasoning. Traditional pre-trained language models like BERT, which rely on synchronic document collections such as BookCorpus and Wikipedia, often fall short in effectively capturing and leveraging temporal information. To address this limitation, we introduce BiTimeBERT 2.0, a novel time-aware language model pre-trained on a temporal news article collection. BiTimeBERT 2.0 incorporates temporal information through three innovative pre-training objectives: Extended Time-Aware Masked Language Modeling (ETAMLM), Document Dating (DD), and Time-Sensitive Entity Replacement (TSER). Each objective is specifically designed to target a distinct dimension of temporal information: ETAMLM enhances the model's understanding of temporal contexts and relations, DD integrates document timestamps as explicit chronological markers, and TSER focuses on the temporal dynamics of "Person" entities. Moreover, our refined corpus preprocessing strategy reduces training time by nearly 53%, making BiTimeBERT 2.0 significantly more efficient while maintaining high performance. Experimental results show that BiTimeBERT 2.0 achieves substantial improvements across a broad range of time-related tasks and excels on datasets spanning extensive temporal ranges. These findings underscore BiTimeBERT 2.0's potential as a powerful tool for advancing temporal reasoning in NLP.
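
To make two of the objectives named above more concrete, the following is a minimal Python sketch of how training examples for a time-aware masking objective and a document dating objective might be prepared. It is an illustration under stated assumptions, not the BiTimeBERT 2.0 implementation: the abstract does not specify how temporal expressions are detected or at what granularity documents are dated, so the regular expression, the function names `mask_temporal_expressions` and `dating_label`, and the month-level default are all hypothetical.

```python
import re
from datetime import date

# Assumption: a toy detector that only recognises four-digit years
# (1900-2099) and capitalised English month names. A real system would
# likely use a dedicated temporal tagger.
_TEMPORAL = re.compile(
    r"\b(?:(?:19|20)\d{2}|January|February|March|April|May|June|July|"
    r"August|September|October|November|December)\b"
)


def mask_temporal_expressions(text, mask_token="[MASK]"):
    """Replace each detected temporal expression with mask_token.

    Returns the masked text and the list of original expressions,
    which would serve as prediction targets for the masked positions.
    """
    targets = _TEMPORAL.findall(text)
    masked = _TEMPORAL.sub(mask_token, text)
    return masked, targets


def dating_label(timestamp, granularity="month"):
    """Turn a document timestamp into a class label for a dating task.

    Assumption: labels are plain strings at year or month granularity.
    """
    if granularity == "year":
        return f"{timestamp.year:04d}"
    if granularity == "month":
        return f"{timestamp.year:04d}-{timestamp.month:02d}"
    raise ValueError(f"unsupported granularity: {granularity!r}")


if __name__ == "__main__":
    masked, targets = mask_temporal_expressions(
        "The treaty was signed in March 1998."
    )
    print(masked)
    print(targets)
    print(dating_label(date(1998, 3, 14)))
```

In this sketch the masked sentence and its targets would feed a masked-language-modeling loss, while the label derived from the article's publication date would feed a separate classification loss; how the actual model combines its objectives is not stated in the abstract.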

 
