CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation

Abstract

Evaluating the degree of reproduction of copyright-protected content by language models (LMs) is of significant interest to the AI and legal communities. Although both literal and non-literal similarities are considered by courts when assessing the degree of reproduction, prior research has focused only on literal similarities. To bridge this gap, we introduce CopyBench, a benchmark designed to measure both literal and non-literal copying in LM generations. Using copyrighted fiction books as text sources, we provide automatic evaluation protocols to assess literal and non-literal copying, balanced against the model utility in terms of the ability to recall facts from the copyrighted works and generate fluent completions. We find that, although literal copying is relatively rare, two types of non-literal copying (event copying and character copying) occur even in models as small as 7B parameters. Larger models demonstrate significantly more copying, with literal copying rates increasing from 0.2% to 10.5% and non-literal copying from 2.3% to 6.9% when comparing Llama3-8B and 70B models, respectively. We further evaluate the effectiveness of current strategies for mitigating copying and show that (1) training-time alignment can reduce literal copying but may increase non-literal copying, and (2) current inference-time mitigation methods primarily reduce literal but not non-literal copying.
