BiMarker: Enhancing Text Watermark Detection for Large Language Models with Bipolar Watermarks

Abstract

The rapid growth of Large Language Models (LLMs) raises concerns about distinguishing AI-generated text from human content. Existing watermarking techniques, like KGW, struggle with low watermark strength and stringent false-positive requirements. Our analysis reveals that current methods rely on coarse estimates of non-watermarked text, limiting watermark detectability. To address this, we propose Bipolar Watermark (BiMarker), which splits generated text into positive and negative poles, enhancing detection without requiring additional computational resources or knowledge of the prompt. Theoretical analysis and experimental results demonstrate BiMarker's effectiveness and compatibility with existing optimization techniques, providing a new optimization dimension for watermarking in LLM-generated content.
