Large Language Models for Judicial Entity Extraction: A Comparative Study

Abstract

Domain-specific entity recognition is a fundamental task in legal contexts, supporting applications such as question-answering systems, text summarization, machine translation, sentiment analysis, and information retrieval over case law documents. Recent advances have highlighted the efficacy of Large Language Models in natural language processing tasks, demonstrating their capability to accurately detect and classify domain-specific facts (entities) in specialized texts such as clinical and financial documents. This research investigates the application of Large Language Models to identifying domain-specific entities (e.g., courts, petitioners, judges, lawyers, respondents, FIR numbers) within case law documents, with a specific focus on their aptitude for handling domain-specific language complexity and contextual variation. The study evaluates the performance of state-of-the-art Large Language Model architectures, including Large Language Model Meta AI 3 (LLaMA 3), Mistral, and Gemma, in extracting judicial facts from Indian judicial texts. Mistral and Gemma emerged as the top-performing models, showing the balanced precision and recall that accurate entity identification requires. These findings support the value of Large Language Models for judicial documents and demonstrate how they can facilitate and accelerate research by producing precise, organised data outputs suitable for in-depth examination.
