PhantomHunter: Detecting Unseen Privately-Tuned LLM-Generated Text via Family-Aware Learning

Abstract

With the popularity of large language models (LLMs), undesirable societal problems like misinformation production and academic misconduct have become more severe, making LLM-generated text detection now of unprecedented importance. Although existing methods have made remarkable progress, a new challenge posed by text from privately tuned LLMs remains underexplored. Users can easily obtain private LLMs by fine-tuning an open-source model on private corpora, which causes a significant performance drop for existing detectors in practice. To address this issue, we propose PhantomHunter, an LLM-generated text detector specialized for detecting text from unseen, privately-tuned LLMs. Its family-aware learning framework captures family-level traits shared across the base models and their derivatives, instead of memorizing individual characteristics. Experiments on data from the LLaMA, Gemma, and Mistral families show its superiority over 7 baselines and 3 industrial services, with F1 scores of over 96%.
