Conversational Education at Scale: A Multi-LLM Agent Workflow for Procedural Learning and Pedagogic Quality Assessment

  • 2025-09-05 17:52:53
  • Jiahuan Pei, Fanghua Ye, Xin Sun, Wentao Deng, Koen Hindriks, Junxiao Wang

Abstract

Large language models (LLMs) have advanced virtual educators and learners, bridging NLP with AI4Education. Existing work often lacks scalability and fails to leverage diverse, large-scale course content, with limited frameworks for assessing pedagogic quality. To this end, we propose WikiHowAgent, a multi-agent workflow leveraging LLMs to simulate interactive teaching-learning conversations. It integrates teacher and learner agents, an interaction manager, and an evaluator to facilitate procedural learning and assess pedagogic quality. We introduce a dataset of 114,296 teacher-learner conversations grounded in 14,287 tutorials across 17 domains and 727 topics. Our evaluation protocol combines computational and rubric-based metrics with human judgment alignment. Results demonstrate the workflow's effectiveness in diverse setups, offering insights into LLM capabilities across domains. Our datasets and implementations are fully open-sourced.

 
