Abstract
Web agents have emerged as a promising direction to automate Web task completion based on user instructions, significantly enhancing user experience. Recently, Web agents have evolved from traditional agents to Large Language Model (LLM)-based Web agents. Despite their success, existing LLM-based Web agents overlook the importance of personalized data (e.g., user profiles and historical Web behaviors) in understanding users' personalized instructions and executing customized actions. To overcome this limitation, we first formulate the task of LLM-empowered personalized Web agents, which integrate personalized data and user instructions to personalize instruction comprehension and action execution. To address the absence of a comprehensive evaluation benchmark, we construct a Personalized Web Agent Benchmark (PersonalWAB), featuring user instructions, personalized user data, Web functions, and two evaluation paradigms across three personalized Web tasks. Moreover, we propose a Personalized User Memory-enhanced Alignment (PUMA) framework to adapt LLMs to the personalized Web agent task. PUMA utilizes a memory bank with a task-specific retrieval strategy to filter relevant historical Web behaviors. Based on these behaviors, PUMA then aligns LLMs for personalized action execution through fine-tuning and direct preference optimization. Extensive experiments validate the superiority of PUMA over existing Web agents on PersonalWAB.
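
To make the idea of a memory bank with task-specific retrieval more concrete, the following is a minimal illustrative sketch, not the PUMA implementation. It assumes a hypothetical `Behavior` record, hypothetical task labels such as `"search"` and `"review"`, and a simple word-overlap score standing in for whatever retrieval strategy the framework actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class Behavior:
    """One historical Web behavior (hypothetical schema, for illustration)."""
    task: str   # hypothetical task label, e.g. "search" or "review"
    text: str   # free-text description of what the user did


@dataclass
class MemoryBank:
    """Toy per-user memory bank; an assumption, not the paper's design."""
    behaviors: list = field(default_factory=list)

    def add(self, behavior: Behavior) -> None:
        self.behaviors.append(behavior)

    def retrieve(self, instruction: str, task: str, k: int = 2) -> list:
        """Return up to k behaviors of the given task, ranked by word overlap."""
        query = set(instruction.lower().split())
        # Task-specific step: only consider behaviors from the same task.
        candidates = [b for b in self.behaviors if b.task == task]
        scored = [(len(query & set(b.text.lower().split())), b) for b in candidates]
        # Drop behaviors that share no word with the instruction.
        scored = [(score, b) for score, b in scored if score > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [b for _, b in scored[:k]]


bank = MemoryBank()
bank.add(Behavior("search", "searched waterproof hiking boots"))
bank.add(Behavior("review", "reviewed wireless headphones"))
bank.add(Behavior("search", "searched kitchen blender"))

relevant = bank.retrieve("find winter hiking boots", task="search")
print([b.text for b in relevant])
```

In a full pipeline, the retrieved behaviors would presumably be added to the model's input before fine-tuning or preference optimization; that step is outside the scope of this sketch.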