AIDBench: A benchmark for evaluating the authorship identification capability of large language models

Abstract

As large language models (LLMs) rapidly advance and integrate into daily life, the privacy risks they pose are attracting increasing attention. We focus on a specific privacy risk where LLMs may help identify the authorship of anonymous texts, which challenges the effectiveness of anonymity in real-world systems such as anonymous peer review systems. To investigate these risks, we present AIDBench, a new benchmark that incorporates several author identification datasets, including emails, blogs, reviews, articles, and research papers. AIDBench utilizes two evaluation methods: one-to-one authorship identification, which determines whether two texts are from the same author; and one-to-many authorship identification, which, given a query text and a list of candidate texts, identifies the candidate most likely written by the same author as the query text. We also introduce a Retrieval-Augmented Generation (RAG)-based method to enhance the large-scale authorship identification capabilities of LLMs, particularly when input lengths exceed the models' context windows, thereby establishing a new baseline for authorship identification using LLMs. Our experiments with AIDBench demonstrate that LLMs can correctly guess authorship at rates well above random chance, revealing new privacy risks posed by these powerful models. The source code and data will be made publicly available after acceptance.
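
To make the two evaluation settings concrete, the following is a minimal sketch, not the AIDBench implementation. It assumes a stand-in scorer based on character trigram cosine similarity in place of an LLM judgment. All function names, the trigram size, and the 0.5 threshold are hypothetical choices made for illustration. It also shows how a random-chance baseline for the one-to-many setting could be computed, which is the reference point for "well above random chance".

```python
from collections import Counter
from math import sqrt


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams (a crude stylistic fingerprint)."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def same_author_score(text_a: str, text_b: str) -> float:
    """Stand-in scorer (assumption): in the benchmark an LLM would make this call."""
    return cosine(char_ngrams(text_a), char_ngrams(text_b))


def one_to_one(text_a: str, text_b: str, threshold: float = 0.5) -> bool:
    """One-to-one: decide whether two texts share an author (threshold is hypothetical)."""
    return same_author_score(text_a, text_b) >= threshold


def one_to_many(query: str, candidates: list[str]) -> int:
    """One-to-many: return the index of the candidate most likely by the query's author."""
    scores = [same_author_score(query, c) for c in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)


def top1_accuracy(tasks: list[tuple[str, list[str], int]]) -> float:
    """Fraction of (query, candidates, true_index) tasks answered correctly."""
    hits = sum(one_to_many(q, cands) == idx for q, cands, idx in tasks)
    return hits / len(tasks)


def random_chance(tasks: list[tuple[str, list[str], int]]) -> float:
    """Expected top-1 accuracy of uniform random guessing over the same tasks."""
    return sum(1 / len(cands) for _, cands, _ in tasks) / len(tasks)
```

Comparing `top1_accuracy(tasks)` against `random_chance(tasks)` on the same task list gives the kind of above-chance comparison the abstract refers to. Replacing `same_author_score` with a call to an LLM would leave the rest of the evaluation loop unchanged.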
