Abstract
Retrieval-augmented generation (RAG) models, which integrate large-scale pre-trained generative models with external retrieval mechanisms, have shown significant success in various natural language processing (NLP) tasks. However, applying RAG models to Persian, a low-resource language, poses distinct challenges. These challenges arise primarily in the preprocessing, embedding, retrieval, prompt construction, language modeling, and response evaluation stages of the system. In this paper, we address the challenges of implementing a real-world RAG system for the Persian language, called PersianRAG. We propose novel solutions to overcome these obstacles and evaluate our approach using several Persian benchmark datasets. Our experimental results demonstrate the capability of the PersianRAG framework to enhance the question answering task in Persian.