Advanced System Integration: Analyzing OpenAPI Chunking for Retrieval-Augmented Generation

  • 2025-07-28 16:00:01
  • Robin D. Pesl, Jerin G. Mathew, Massimo Mecella, Marco Aiello

Abstract

Integrating multiple (sub-)systems is essential to create advanced Information Systems (ISs). Difficulties mainly arise when integrating dynamic environments across the IS lifecycle. A traditional approach is a registry that provides the API documentation of the systems' endpoints. Large Language Models (LLMs) have been shown to be capable of automatically creating system integrations (e.g., as service composition) based on this documentation but require concise input due to input token limitations, especially regarding comprehensive API descriptions. Currently, it is unknown how best to preprocess these API descriptions. Within this work, we (i) analyze the usage of Retrieval-Augmented Generation (RAG) for endpoint discovery and the chunking, i.e., preprocessing, of OpenAPIs to reduce the input token length while preserving the most relevant information. To further reduce the input token length for the composition prompt and improve endpoint retrieval, we propose (ii) a Discovery Agent that only receives a summary of the most relevant endpoints and retrieves details on demand. We evaluate RAG for endpoint discovery using the RestBench benchmark, first, for the different chunking possibilities and parameters, measuring the endpoint retrieval recall, precision, and F1 score. Then, we assess the Discovery Agent using the same test set. With our prototype, we demonstrate how to successfully employ RAG for endpoint discovery to reduce the token count. While revealing high values for recall, precision, and F1, further research is necessary to retrieve all requisite endpoints. Our experiments show that for preprocessing, LLM-based and format-specific approaches outperform naïve chunking methods. Relying on an agent further enhances these results as the agent splits the tasks into multiple fine-granular subtasks, improving the overall RAG performance in the token count, precision, and F1 score.
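
To make the two ideas in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that "format-specific" chunking can be approximated by emitting one text chunk per OpenAPI operation, and it scores a retrieved endpoint set against a gold set with precision, recall, and F1. The function names `chunk_by_endpoint` and `retrieval_scores`, the chunk layout, and the toy specification are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set

# HTTP methods that may appear as operation keys inside an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def chunk_by_endpoint(spec: dict) -> List[Dict[str, str]]:
    """Return one text chunk per (method, path) operation of an OpenAPI dict.

    Assumption: a chunk is just the endpoint plus its summary and description.
    """
    chunks = []
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            endpoint = f"{method.upper()} {path}"
            parts = (
                endpoint,
                operation.get("summary", ""),
                operation.get("description", ""),
            )
            text = " ".join(part for part in parts if part)
            chunks.append({"endpoint": endpoint, "text": text})
    return chunks


def retrieval_scores(retrieved: Set[str], relevant: Set[str]) -> Dict[str, float]:
    """Compute set-based precision, recall, and F1 for retrieved endpoints."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy specification (hypothetical, not taken from RestBench).
toy_spec = {
    "paths": {
        "/movies/{id}": {
            "parameters": [],
            "get": {"summary": "Get movie details"},
        },
        "/search": {
            "get": {"summary": "Search movies", "description": "Full-text search."},
        },
    }
}

toy_chunks = chunk_by_endpoint(toy_spec)
toy_scores = retrieval_scores(
    retrieved={"GET /movies/{id}", "GET /search"},
    relevant={"GET /movies/{id}", "GET /people/{id}"},
)
```

In a full RAG pipeline, each chunk's `text` would be embedded and indexed, and the retrieved set would come from a similarity search over those embeddings. That step is omitted here because the abstract does not specify the embedding model or the retriever.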

 
