Abstract
We present Eir Thai Medical LLM, a large language model with 8 billion parameters, specifically designed to enhance the accuracy of handling medical tasks in the Thai language. This model focuses on providing clear and easy-to-understand answers for both healthcare professionals and patients, thereby improving the efficiency of diagnosis and treatment processes. Human evaluation was conducted to ensure that the model adheres to care standards and provides unbiased answers. To prioritize data security, the model is deployed within the hospital's internal network, ensuring both high security and faster processing speeds. The internal API connection is secured with encryption and strict authentication measures to prevent data leaks and unauthorized access. We evaluated several open-source large language models with 8 billion parameters on four medical benchmarks: MedQA, MedMCQA, PubMedQA, and the medical subset of MMLU. The best-performing baselines were used to develop Eir Thai Medical LLM. Our evaluation employed multiple questioning strategies, including zero-shot, few-shot, chain-of-thought reasoning, and ensemble/self-consistency voting methods. Our model outperformed commercially available Thai-language large language models by more than 10%. In addition, we developed enhanced model testing tailored for clinical use in Thai across 18 clinical tasks, where our model exceeded GPT-4o performance by more than 11%.
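As an illustration of the ensemble/self-consistency voting strategy mentioned above, the following is a minimal sketch, not the evaluation code used for Eir Thai Medical LLM. It assumes that several answers to the same multiple-choice question have already been sampled from a model and that each answer is a single option letter; the function name `self_consistency_vote` and the normalization step are hypothetical choices made for this example.

```python
from collections import Counter
from typing import Sequence


def self_consistency_vote(answers: Sequence[str]) -> str:
    """Return the most frequent answer among independently sampled answers.

    Assumption for this sketch: each answer is an option letter such as
    "A" or "b ", so whitespace is stripped and case is ignored.
    Ties go to the answer that was encountered first.
    """
    if not answers:
        raise ValueError("at least one sampled answer is required")
    normalized = [answer.strip().upper() for answer in answers]
    counts = Counter(normalized)
    # most_common orders elements with equal counts by first occurrence.
    return counts.most_common(1)[0][0]


# Hypothetical samples for one question, e.g. from five chain-of-thought runs.
sampled = ["B", "b", "C", "B ", "A"]
final_answer = self_consistency_vote(sampled)
```

Majority voting over sampled answers is one common way to realize self-consistency; weighted or confidence-based aggregation would be an alternative design.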