Abstract
The efficacy of AI agents in healthcare research is hindered by their reliance on static, predefined strategies. This creates a critical limitation: agents can become better tool-users but cannot learn to become better strategic planners, a crucial skill for complex domains like healthcare. We introduce HealthFlow, a self-evolving AI agent that overcomes this limitation through a novel meta-level evolution mechanism. HealthFlow autonomously refines its own high-level problem-solving policies by distilling procedural successes and failures into a durable, strategic knowledge base. To anchor our research and facilitate reproducible evaluation, we introduce EHRFlowBench, a new benchmark featuring complex, realistic health data analysis tasks derived from peer-reviewed clinical research. Our comprehensive experiments demonstrate that HealthFlow's self-evolving approach significantly outperforms state-of-the-art agent frameworks. This work marks a necessary shift from building better tool-users to designing smarter, self-evolving task-managers, paving the way for more autonomous and effective AI for scientific discovery.