Abstract
Legal artificial intelligence (LegalAI) aims to benefit legal systems with artificial intelligence technology, especially natural language processing (NLP). Recently, inspired by the success of pre-trained language models (PLMs) in the generic domain, many LegalAI researchers have devoted their efforts to applying PLMs to legal tasks. However, using PLMs for legal tasks remains challenging, because legal documents usually consist of thousands of tokens, far more than mainstream PLMs can process. In this paper, we release a Longformer-based pre-trained language model, named Lawformer, for understanding long Chinese legal documents. We evaluate Lawformer on a variety of LegalAI tasks, including judgment prediction, similar case retrieval, legal reading comprehension, and legal question answering. The experimental results demonstrate that our model achieves promising improvements on tasks with long documents as inputs.