LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language

Abstract

Large Language Models are state-of-the-art linguistic models designed to equip computers with the ability to comprehend natural language. With its exceptional capacity to capture complex contextual relationships, the LLaMA (Large Language Model Meta AI) family represents a novel advancement in the field of natural language processing: it releases foundational models designed to improve the natural language understanding abilities of the transformer architecture thanks to their large number of trainable parameters (7, 13, and 70 billion). In many natural language understanding tasks, these models match the performance of proprietary models such as OpenAI's ChatGPT, with the added advantage that their weights and code are publicly available for research and commercial use. In this work, we investigate the possibility of Language Adaptation for LLaMA models, focusing explicitly on the challenge of Italian language coverage. Adopting an open science approach, we explore various tuning approaches to ensure high-quality generated text in Italian, a language underrepresented in the original models' datasets, suitable for common tasks. We aim to release effective text generation models with strong linguistic properties for many tasks that seem challenging for multilingual or general-purpose LLMs. By leveraging an open science philosophy, this study contributes to Language Adaptation strategies for the Italian language by introducing the novel LLaMAntino family of Italian LLMs.
