Identifying Query-Relevant Neurons in Large Language Models for Long-Form Texts

Abstract

Large Language Models (LLMs) possess vast amounts of knowledge within their parameters, prompting research into methods for locating and editing this knowledge. Previous work has largely focused on locating entity-related (often single-token) facts in smaller models. However, several key questions remain unanswered: (1) How can we effectively locate query-relevant neurons in contemporary autoregressive LLMs, such as Llama and Mistral? (2) How can we address the challenge of long-form text generation? (3) Are there localized knowledge regions in LLMs? In this study, we introduce Query-Relevant Neuron Cluster Attribution (QRNCA), a novel architecture-agnostic framework capable of identifying query-relevant neurons in LLMs. QRNCA allows for the examination of long-form answers beyond triplet facts by employing the proxy task of multi-choice question answering. To evaluate the effectiveness of our detected neurons, we build two multi-choice QA datasets spanning diverse domains and languages. Empirical evaluations demonstrate that our method outperforms baseline methods significantly. Further, analysis of neuron distributions reveals the presence of visible localized regions, particularly within different domains. Finally, we show potential applications of our detected neurons in knowledge editing and neuron-based prediction.
