NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals

  • 2025-04-01 17:04:53
  • Jaden Fiotto-Kaufman, Alexander R. Loftus, Eric Todd, Jannik Brinkmann, Koyena Pal, Dmitrii Troitskii, Michael Ripa, Adam Belfki, Can Rager, Caden Juang, Aaron Mueller, Samuel Marks, Arnab Sen Sharma, Francesca Lucchetti, Nikhil Prakash, Carla Brodley, Arjun Guha, Jonathan Bell, Byron C. Wallace, David Bau

Abstract

We introduce NNsight and NDIF, technologies that work in tandem to enable scientific study of the representations and computations learned by very large neural networks. NNsight is an open-source system that extends PyTorch to introduce deferred remote execution. The National Deep Inference Fabric (NDIF) is a scalable inference service that executes NNsight requests, allowing users to share GPU resources and pretrained models. These technologies are enabled by the Intervention Graph, an architecture developed to decouple experimental design from model runtime. Together, this framework provides transparent and efficient access to the internals of deep neural networks such as very large language models (LLMs) without imposing the cost or complexity of hosting customized models individually. We conduct a quantitative survey of the machine learning literature that reveals a growing gap in the study of the internals of large-scale AI. We demonstrate the design and use of our framework to address this gap by enabling a range of research methods on huge models. Finally, we conduct benchmarks to compare performance with previous approaches. Code, documentation, and tutorials are available at https://nnsight.net/.
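
The idea of decoupling experimental design from model runtime can be illustrated with a toy sketch in plain Python. This is not NNsight's API or its Intervention Graph implementation; the names `InterventionGraph`, `intervene`, and `run_model`, and the two-layer "model", are all hypothetical. The sketch only shows the general pattern: interventions are recorded up front as data, and a separate runtime applies them later while the model executes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class InterventionGraph:
    """Toy record of deferred edits (hypothetical; not NNsight's structure)."""

    # Each node pairs a layer name with a function applied to that layer's output.
    nodes: List[Tuple[str, Callable[[float], float]]] = field(default_factory=list)

    def intervene(self, layer: str, fn: Callable[[float], float]) -> "InterventionGraph":
        # Nothing runs here: the request is only recorded for later execution.
        self.nodes.append((layer, fn))
        return self


def run_model(
    layers: List[Tuple[str, Callable[[float], float]]],
    x: float,
    graph: Optional[InterventionGraph] = None,
) -> Tuple[float, Dict[str, float]]:
    """Run layers in order, applying recorded interventions after each layer."""
    activations: Dict[str, float] = {}
    for name, layer in layers:
        x = layer(x)
        if graph is not None:
            for target, fn in graph.nodes:
                if target == name:
                    x = fn(x)
        activations[name] = x  # value seen by the next layer
    return x, activations


# Hypothetical two-layer "model" standing in for a real network.
layers = [("double", lambda v: v * 2), ("add_one", lambda v: v + 1)]

# Experiment design: zero out the output of the first layer (an ablation).
graph = InterventionGraph().intervene("double", lambda v: 0.0)

# Runtime: the same model code runs with or without the recorded experiment.
baseline, _ = run_model(layers, 3.0)
patched, acts = run_model(layers, 3.0, graph)
```

Because the experiment is plain data built before any model code runs, a structure like this could in principle be handed to a different process or machine that hosts the model, which is the kind of separation the abstract attributes to the Intervention Graph.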

 
