Pretraining and Updating Language- and Domain-specific Large Language Model: A Case Study in Japanese Business Domain

  • 2024-04-16 03:24:00
  • Kosuke Takahashi, Takahiro Omi, Kosuke Arima, Tatsuya Ishigaki

Abstract

Several previous studies have considered language- and domain-specific large language models (LLMs) as separate topics. This study explores the combination of a non-English language and a high-demand industry domain, focusing on a Japanese business-specific LLM. This type of model requires expertise in the business domain, strong language skills, and regular updates of its knowledge. We trained a 13-billion-parameter LLM from scratch using a new dataset of business texts and patents, and continually pretrained it with the latest business documents. Further, we propose a new benchmark for Japanese business domain question answering (QA) and evaluate our models on it. The results show that our pretrained model improves QA accuracy without losing general knowledge, and that continual pretraining enhances adaptation to new information. Our pretrained model and business domain benchmark are publicly available.

 
