Abstract
Automated drug discovery offers significant potential for accelerating the development of novel therapeutics by substituting labor-intensive human workflows with machine-driven processes. However, a critical bottleneck persists: current automated frameworks cannot assess whether newly designed molecules infringe upon existing patents, which poses serious legal and financial risks. We introduce PatentFinder, a novel tool-enhanced, multi-agent framework that accurately and comprehensively evaluates small molecules for patent infringement. It incorporates both heuristic and model-based tools tailored for decomposed subtasks, featuring MarkushParser, which performs optical chemical structure recognition of molecular and Markush structures, and MarkushMatcher, which enhances large language models' ability to extract substituent groups from molecules accurately. On our benchmark dataset MolPatent-240, PatentFinder outperforms baseline approaches that rely solely on large language models, demonstrating a 13.8\% increase in F1-score and a 12\% rise in accuracy. Experimental results show that PatentFinder mitigates label bias to produce balanced predictions and autonomously generates detailed, interpretable patent infringement reports. This work not only addresses a pivotal challenge in automated drug discovery but also demonstrates the potential of decomposing complex scientific tasks into manageable subtasks for specialized, tool-augmented agents.