Towards Effective Time-Aware Language Representation: Exploring Enhanced Temporal Understanding in Language Models

  • 2024-06-04 01:30:37
  • Jiexin Wang, Adam Jatowt, Yi Cai

Abstract

In the evolving field of Natural Language Processing, understanding the temporal context of text is increasingly crucial. This study investigates methods to incorporate temporal information during pre-training, aiming to achieve effective time-aware language representation for improved performance on time-related tasks. In contrast to common pre-trained models like BERT, which rely on synchronic document collections such as BookCorpus and Wikipedia, our research introduces BiTimeBERT 2.0, a novel language model pre-trained on a temporal news article collection. BiTimeBERT 2.0 utilizes this temporal news collection, focusing on three innovative pre-training objectives: Time-Aware Masked Language Modeling (TAMLM), Document Dating (DD), and Time-Sensitive Entity Replacement (TSER). Each objective targets a unique aspect of temporal information. TAMLM is designed to enhance the understanding of temporal contexts and relations, DD integrates document timestamps as chronological markers, and TSER focuses on the temporal dynamics of "Person" entities, recognizing their inherent temporal significance. The experimental results consistently demonstrate that BiTimeBERT 2.0 outperforms models like BERT and other existing pre-trained models, achieving substantial gains across a variety of downstream NLP tasks and applications where time plays a pivotal role.
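
To make two of the objectives above more concrete, the following is a minimal sketch of how training examples for a TAMLM-style and a DD-style objective might be prepared. It is not the authors' implementation: the regular expression standing in for a temporal-expression tagger, the function names `mask_temporal` and `dating_label`, and the year-month label granularity are all assumptions made for illustration, since the abstract does not specify these details. TSER is omitted because it would require an entity recognizer.

```python
import re
from datetime import date

# Hypothetical stand-in for a temporal-expression tagger: matches four-digit
# years (1800-2099) and capitalized English month names only.
TEMPORAL = re.compile(
    r"\b(?:1[89]\d{2}|20\d{2}|January|February|March|April|May|June|July|"
    r"August|September|October|November|December)\b"
)


def mask_temporal(text: str, mask_token: str = "[MASK]") -> tuple[str, list[str]]:
    """Replace every matched temporal expression with mask_token.

    Returns the masked text and the list of original expressions, which
    would serve as prediction targets in a masked-language-modeling setup.
    """
    targets = TEMPORAL.findall(text)
    return TEMPORAL.sub(mask_token, text), targets


def dating_label(timestamp: date) -> str:
    """Map a document timestamp to a class label for a dating objective.

    Year-month granularity is an assumption, not taken from the abstract.
    """
    return f"{timestamp.year:04d}-{timestamp.month:02d}"


masked, targets = mask_temporal("The summit took place in June 1998 in Geneva.")
label = dating_label(date(1998, 6, 14))
```

In this sketch, `masked` has both "June" and "1998" replaced by the mask token, and `label` is the string a classifier head would be trained to predict from the article text. A real pipeline would likely use a proper temporal tagger, because the simple pattern also matches the capitalized modal verb "May" at the start of a sentence.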

 
