SVIP: Towards Verifiable Inference of Open-source Large Language Models

Abstract

Open-source Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language understanding and generation, leading to widespread adoption across various domains. However, their increasing model sizes render local deployment impractical for individual users, pushing many to rely on computing service providers for inference through a black-box API. This reliance introduces a new risk: a computing provider may stealthily substitute the requested LLM with a smaller, less capable model without consent from users, thereby delivering inferior outputs while benefiting from cost savings. In this paper, we formalize the problem of verifiable inference for LLMs. Existing verifiable computing solutions based on cryptographic or game-theoretic techniques are either computationally uneconomical or rest on strong assumptions. We introduce SVIP, a secret-based verifiable LLM inference protocol that leverages intermediate outputs from the LLM as unique model identifiers. By training a proxy task on these outputs and requiring the computing provider to return both the generated text and the processed intermediate outputs, users can reliably verify whether the computing provider is acting honestly. In addition, the integration of a secret mechanism further enhances the security of our protocol. We thoroughly analyze our protocol under multiple strong and adaptive adversarial scenarios. Our extensive experiments demonstrate that SVIP is accurate, generalizable, computationally efficient, and resistant to various attacks. Notably, SVIP achieves false negative rates below 5% and false positive rates below 3%, while requiring less than 0.01 seconds per query for verification.
