Matryoshka Pilot: Learning to Drive Black-Box LLMs with LLMs

Abstract

Despite the impressive generative abilities of black-box large language models (LLMs), their inherent opacity hinders further advancements in capabilities such as reasoning, planning, and personalization. Existing works aim to enhance LLM capabilities via domain-specific adaptation, which requires additional training on accessible model parameters, an infeasible option for black-box LLMs. To address this challenge, we introduce Matryoshka Pilot (M-Pilot), a lightweight white-box LLM controller that guides a large-scale black-box LLM generator by decomposing complex tasks into a series of intermediate outputs. Specifically, we consider the black-box LLM as an environment, with M-Pilot serving as a policy that provides intermediate guidance through prompts to drive the black-box LLM. M-Pilot is trained to pivot the outputs of the black-box LLM toward alignment with preferences during iterative interaction, which enables controllable multi-turn generation and self-improvement in optimizing intermediate guidance. Empirical evaluations on diverse tasks demonstrate that our method effectively enhances the capabilities of black-box LLMs in complex, long-horizon tasks. Our code is publicly available at: https://github.com/lichangh20/Matryoshka.
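
The controller/environment framing described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`drive_black_box`, `toy_controller`, `toy_black_box`), the turn limit, and the string-based stand-ins are all hypothetical, and the preference-based training of the controller is not shown. The sketch only illustrates the interaction loop, in which a small controller emits intermediate guidance and a black-box generator responds to it over multiple turns.

```python
from typing import Callable, List, Tuple

# One turn of interaction: (guidance from the controller, output from the black-box LLM).
Turn = Tuple[str, str]


def drive_black_box(
    task: str,
    controller: Callable[[str, List[Turn]], str],
    black_box: Callable[[str, str], str],
    max_turns: int = 3,
) -> List[Turn]:
    """Run a multi-turn loop in which the controller guides the black-box model.

    The controller plays the role of a policy; the black-box model is treated
    as an environment that is only reachable through prompts.
    """
    history: List[Turn] = []
    for _ in range(max_turns):
        guidance = controller(task, history)   # intermediate guidance (a prompt)
        output = black_box(task, guidance)     # black-box response to that guidance
        history.append((guidance, output))
    return history


# Hypothetical stand-ins so the sketch runs without any model or API access.
def toy_controller(task: str, history: List[Turn]) -> str:
    return f"step {len(history) + 1}: break down '{task}'"


def toy_black_box(task: str, guidance: str) -> str:
    return f"answer following [{guidance}]"


trace = drive_black_box("plan a trip", toy_controller, toy_black_box, max_turns=2)
```

In a real setting, `controller` would wrap a trainable white-box LLM and `black_box` would wrap an API call; the recorded `trace` is the kind of interaction data from which preference signals could be derived.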
