Unveiling the potential of large language models in generating semantic and cross-language clones

Abstract

Semantic and cross-language code clone generation may be useful for code reuse, code comprehension, refactoring, and benchmarking. OpenAI's GPT model has potential for such clone generation, since GPT is used for text generation. When developers copy and paste code from Stack Overflow (SO) or within a system, inconsistent changes may follow, leading to unexpected behaviours. Similarly, if someone has a code snippet in one programming language but seeks equivalent functionality in a different language, a semantic cross-language code clone generation approach could provide valuable assistance. In this study, using SemanticCloneBench as a vehicle, we evaluated how well the GPT-3 model could help generate semantic and cross-language clone variants for a given fragment. We compiled a diverse set of code fragments and assessed GPT-3's performance in generating code variants. Through extensive experimentation and analysis, in which 9 judges spent 158 hours on validation, we investigate the model's ability to produce accurate and semantically correct variants. Our findings shed light on GPT-3's strengths in code generation, offering insights into the potential applications and challenges of using advanced language models in software development. In our quantitative analysis, GPT-3 attains an accuracy of 62.14% and a BLEU score of 0.55 for semantic clones, achieved through few-shot prompt engineering. For cross-language clones, the model reaches 91.25% accuracy.
