Abstract
Recent advances in Pre-trained Language Models for Code (Code-PLMs) have propelled many areas of Software Engineering (SE) and brought breakthrough results for many SE tasks. Although these models achieve state-of-the-art performance on SE tasks for popular programming languages such as Java and Python, scientific software and its associated languages, such as R, have rarely benefited from Code-PLMs or even been evaluated with them. Research has shown that R differs from other programming languages in many ways and requires specific techniques. In this study, we provide the first insights into code intelligence for R. For this purpose, we collect and open-source an R dataset and evaluate Code-PLMs on two tasks, code summarization and method name prediction, using several settings and strategies, including the differences between two R styles, Tidyverse and Base R. Our results demonstrate that the studied models experience varying degrees of performance degradation when processing R code, a finding supported by human evaluation. Additionally, not all models show performance improvement on R-specific tasks even after multi-language fine-tuning. The dual syntax paradigms in R significantly impact the models' performance, particularly in code summarization. Furthermore, the project-specific context inherent in R codebases significantly impacts performance in cross-project training.