Modular Machine Learning: An Indispensable Path towards New-Generation Large Language Models

Abstract

Large language models (LLMs) have dramatically advanced machine learning research, including natural language processing, computer vision, and data mining, yet they still exhibit critical limitations in reasoning, factual consistency, and interpretability. In this paper, we introduce a novel learning paradigm, Modular Machine Learning (MML), as an essential approach toward new-generation LLMs. MML decomposes the complex structure of LLMs into three interdependent components: modular representation, modular model, and modular reasoning. It aims to enhance LLMs' capability for counterfactual reasoning, mitigate hallucinations, and promote fairness, safety, and transparency. Specifically, the proposed MML paradigm can: i) clarify the internal working mechanism of LLMs through the disentanglement of semantic components; ii) allow for flexible and task-adaptive model design; and iii) enable an interpretable and logic-driven decision-making process. We present a feasible implementation of MML-based LLMs by leveraging advanced techniques such as disentangled representation learning, neural architecture search, and neuro-symbolic learning. We critically identify key challenges, such as the integration of continuous neural and discrete symbolic processes, joint optimization, and computational scalability, and present promising future research directions that deserve further exploration. Ultimately, the integration of the MML paradigm with LLMs has the potential to bridge the gap between statistical (deep) learning and formal (logical) reasoning, thereby paving the way for robust, adaptable, and trustworthy AI systems across a wide range of real-world applications.
