Enhancing Code Generation via Bidirectional Comment-Level Mutual Grounding

Abstract

Large Language Models (LLMs) have demonstrated unprecedented capability in code generation. However, LLM-generated code is still plagued with a wide range of functional errors, especially for complex programming tasks that LLMs have not seen before. Recent studies have shown that developers often struggle with inspecting and fixing incorrect code generated by LLMs, diminishing their productivity and trust in LLM-based code generation. Inspired by the mutual grounding theory in communication, we propose an interactive approach that leverages code comments as a medium for developers and LLMs to establish a shared understanding. Our approach facilitates iterative grounding by interleaving code generation, inline comment generation, and contextualized user feedback through editable comments to align generated code with developer intent. We evaluated our approach on two popular benchmarks and demonstrated that our approach significantly improved multiple state-of-the-art LLMs, e.g., 17.1% pass@1 improvement for code-davinci-002 on HumanEval. Furthermore, we conducted a user study with 12 participants in comparison to two baselines: (1) interacting with GitHub Copilot, and (2) interacting with a multi-step code generation paradigm called Multi-Turn Program Synthesis. Participants completed the given programming tasks 16.7% faster and with 10.5% improvement in task success rate when using our approach. Both results show that interactively refining code comments enables the collaborative establishment of mutual grounding, leading to more accurate code generation and higher developer confidence.
