Explainable Mapper: Charting LLM Embedding Spaces Using Perturbation-Based Explanation and Verification Agents

Abstract

Large language models (LLMs) produce high-dimensional embeddings that capture rich semantic and syntactic relationships between words, sentences, and concepts. Investigating the topological structures of LLM embedding spaces via mapper graphs enables us to understand their underlying structures. Specifically, a mapper graph summarizes the topological structure of the embedding space, where each node represents a topological neighborhood (containing a cluster of embeddings), and an edge connects two nodes if their corresponding neighborhoods overlap. However, manually exploring these embedding spaces to uncover encoded linguistic properties requires considerable human effort. To address this challenge, we introduce a framework for semi-automatic annotation of these embedding properties. To organize the exploration process, we first define a taxonomy of explorable elements within a mapper graph such as nodes, edges, paths, components, and trajectories. The annotation of these elements is executed through two types of customizable LLM-based agents that employ perturbation techniques for scalable and automated analysis. These agents help to explore and explain the characteristics of mapper elements and verify the robustness of the generated explanations. We instantiate the framework within a visual analytics workspace and demonstrate its effectiveness through case studies. In particular, we replicate findings from prior research on BERT's embedding properties across various layers of its architecture and provide further observations into the linguistic properties of topological neighborhoods.
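
To make the mapper graph definition above concrete, the following is a minimal sketch of a generic mapper construction in plain Python. It is not the authors' implementation. The function names (`build_mapper_graph`, `_cluster`), the one-dimensional lens, the interval cover parameters (`n_intervals`, `overlap`), and the single-linkage clustering threshold (`eps`) are all assumptions chosen for illustration. Each node is a cluster of point indices, and two nodes are joined by an edge when their clusters share at least one point.

```python
import math
from itertools import combinations


def _cluster(points, members, eps):
    """Single-linkage clustering: connected components of the graph that
    joins two points whenever their Euclidean distance is at most eps."""
    unvisited = set(members)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        component, frontier = {seed}, [seed]
        while frontier:
            cur = frontier.pop()
            near = {j for j in unvisited
                    if math.dist(points[cur], points[j]) <= eps}
            unvisited -= near
            component |= near
            frontier.extend(near)
        clusters.append(frozenset(component))
    return clusters


def build_mapper_graph(points, lens, n_intervals=4, overlap=0.25, eps=1.0):
    """Toy mapper graph (illustrative assumptions throughout).

    1. Project every point to a scalar with `lens`.
    2. Cover the lens range with `n_intervals` overlapping intervals.
    3. Cluster the points that fall into each interval.
    4. Connect two clusters (nodes) if they share any point.
    """
    values = [lens(p) for p in points]
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals or 1.0  # avoid zero width for a constant lens
    pad = width * overlap                   # how far each interval is widened

    nodes = []  # each node is a frozenset of point indices
    for i in range(n_intervals):
        a = lo + i * width - pad
        b = lo + (i + 1) * width + pad
        members = [j for j, v in enumerate(values) if a <= v <= b]
        nodes.extend(_cluster(points, members, eps))

    edges = {(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]}
    return nodes, edges


if __name__ == "__main__":
    # Eight points on a line, with the x-coordinate as the lens.
    pts = [(float(x), 0.0) for x in range(8)]
    nodes, edges = build_mapper_graph(
        pts, lens=lambda p: p[0], n_intervals=2, overlap=0.25, eps=1.5
    )
    for idx, node in enumerate(nodes):
        print(idx, sorted(node))
    print(sorted(edges))
```

In practice the points would be LLM embeddings and the lens, cover, and clustering method would be chosen to suit the data; the sketch only shows how overlapping neighborhoods give rise to edges.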
