Abstract
Large language models (LLMs) have significantly improved code generation, particularly one-pass code generation. However, most existing approaches focus solely on generating code in a single programming language, overlooking the potential of the multi-language capabilities of LLMs. LLMs exhibit different error patterns across programming languages, suggesting that a more robust approach could be developed by leveraging these multi-language outputs. In this study, we propose Multi-Programming Language Ensemble (MPLE), a novel ensemble-based method that utilizes code generation across multiple programming languages to enhance overall performance. By treating each language-specific code generation process as an individual "weak expert" and effectively integrating their outputs, our method mitigates language-specific errors and biases. This multi-language ensemble strategy leverages the complementary strengths of different programming languages, enabling the model to produce more accurate and robust code. Our approach can be seamlessly integrated with commonly used techniques such as the reflection algorithm and Monte Carlo tree search to further improve code generation quality. Experimental results show that our framework consistently improves on baseline performance, by up to 17.92% on existing benchmarks (HumanEval and HumanEval-plus), with a standout result of 96.25% accuracy on the HumanEval benchmark, achieving new state-of-the-art results across various LLMs. The code will be released at https://github.com/NinjaTech-AI/MPLE
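The abstract does not state how the language-specific outputs are integrated, so the following is only a minimal sketch of one plausible reading: each language acts as a "weak expert" that is tried in turn, its draft is translated into the target language, and the first candidate that passes the visible tests is kept. The function `ensemble_generate`, its parameters, and the toy `fake_*` stand-ins for LLM calls are all hypothetical names introduced for illustration; they are not taken from the MPLE code release.

```python
from typing import Callable, Optional, Sequence


def ensemble_generate(
    task: str,
    target_lang: str,
    expert_langs: Sequence[str],
    generate: Callable[[str, str], str],
    translate: Callable[[str, str, str], str],
    passes_tests: Callable[[str], bool],
) -> Optional[str]:
    """Try each language-specific "weak expert" in order (assumed strategy).

    Returns the first candidate, expressed in target_lang, that passes the
    visible tests. If none passes, returns the last candidate tried, or
    None when expert_langs is empty.
    """
    candidate: Optional[str] = None
    for lang in expert_langs:
        draft = generate(task, lang)
        # Drafts written in another language are translated to the target.
        if lang == target_lang:
            candidate = draft
        else:
            candidate = translate(draft, lang, target_lang)
        if passes_tests(candidate):
            return candidate
    return candidate


# Toy stand-ins for LLM calls (hypothetical, for illustration only).
def fake_generate(task: str, lang: str) -> str:
    # Pretend the Python expert makes an off-by-one error; the Java one does not.
    if lang == "python":
        return "def add(a, b):\n    return a + b + 1"
    return "int add(int a, int b) { return a + b; }"


def fake_translate(code: str, src_lang: str, dst_lang: str) -> str:
    # Pretend the Java draft is translated faithfully into Python.
    return "def add(a, b):\n    return a + b"


def fake_passes_tests(code: str) -> bool:
    # Run the candidate against a single visible test case.
    scope: dict = {}
    exec(code, scope)
    return scope["add"](2, 3) == 5


result = ensemble_generate(
    "add two integers",
    "python",
    ["python", "java"],
    fake_generate,
    fake_translate,
    fake_passes_tests,
)
```

In this toy run the Python expert's draft fails the visible test, so the candidate translated from the Java expert is returned instead, which illustrates how an error specific to one language can be masked by another. A real system would replace the `fake_*` functions with LLM calls and a sandboxed test runner.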