Abstract
Fuzzing is an effective bug-finding technique but it struggles with complex systems like JavaScript engines that demand precise grammatical input. Recently, researchers have adopted language models for context-aware mutation in fuzzing to address this problem. However, existing techniques are limited in utilizing coverage guidance for fuzzing, which is rather performed in a black-box manner. This paper presents a novel technique called CovRL (Coverage-guided Reinforcement Learning) that combines Large Language Models (LLMs) with reinforcement learning from coverage feedback. Our fuzzer, CovRL-Fuzz, integrates coverage feedback directly into the LLM by leveraging the Term Frequency-Inverse Document Frequency (TF-IDF) method to construct a weighted coverage map. This map is key in calculating the fuzzing reward, which is then applied to the LLM-based mutator through reinforcement learning. CovRL-Fuzz, through this approach, enables the generation of test cases that are more likely to discover new coverage areas, thus improving vulnerability detection while minimizing syntax and semantic errors, all without needing extra post-processing. Our evaluation results indicate that CovRL-Fuzz outperforms the state-of-the-art fuzzers in terms of code coverage and bug-finding capabilities: CovRL-Fuzz identified 48 real-world security-related bugs in the latest JavaScript engines, including 39 previously unknown vulnerabilities and 11 CVEs.
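To make the idea of a TF-IDF-weighted coverage map concrete, the following is a minimal illustrative sketch, not the CovRL-Fuzz implementation. It assumes coverage is recorded as a set of integer edge IDs per test case, and it uses a smoothed inverse document frequency so that edges hit by few earlier test cases weigh more. The function names (`idf_weights`, `coverage_reward`), the smoothing, and the normalization to the range (0, 1] are all assumptions made for illustration; the exact reward used by CovRL is not specified in this abstract.

```python
import math
from collections import Counter
from typing import Dict, List, Set


def idf_weights(history: List[Set[int]]) -> Dict[int, float]:
    """Smoothed IDF per edge: edges covered by fewer past test cases weigh more.

    Hypothetical helper; `history` holds one set of covered edge IDs per test case.
    """
    n = len(history)
    # Document frequency: in how many past test cases each edge appeared.
    df = Counter(edge for cov in history for edge in cov)
    return {edge: math.log((1 + n) / (1 + count)) + 1.0 for edge, count in df.items()}


def coverage_reward(new_cov: Set[int], history: List[Set[int]]) -> float:
    """Mean edge weight of a new test case, scaled into (0, 1].

    Assumption: an edge never seen before gets the maximum weight (df = 0),
    so a test case covering only new edges scores exactly 1.0.
    """
    if not new_cov:
        return 0.0  # e.g. the input failed to execute and covered nothing
    weights = idf_weights(history)
    unseen_weight = math.log(1 + len(history)) + 1.0
    total = sum(weights.get(edge, unseen_weight) for edge in new_cov)
    return total / (len(new_cov) * unseen_weight)


if __name__ == "__main__":
    past = [{1, 2}, {1, 3}, {1, 2, 4}]
    # Edge 1 is covered by every past test case, edge 99 by none.
    print(coverage_reward({1}, past))
    print(coverage_reward({99}, past))
```

In a reinforcement-learning loop, a scalar of this kind could serve as the signal fed back to the mutator, so that outputs reaching rarely exercised code are reinforced more strongly than outputs that only revisit common paths.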