Abstract
Large language models (LLMs) have demonstrated a remarkable ability to serve as general-purpose tools for various language-based tasks. Recent works have demonstrated that the efficacy of such models can be improved through iterative dialog between multiple models. While these paradigms show promise in improving model efficacy, most works in this area treat collaboration as an emergent behavior, rather than a learned behavior. In doing so, current multi-agent frameworks rely on collaborative behaviors having been sufficiently trained into off-the-shelf models. To address this limitation, we propose ACC-Collab, an Actor-Critic based learning framework to produce a two-agent team (an actor-agent and a critic-agent) specialized in collaboration. We demonstrate that ACC-Collab outperforms state-of-the-art (SotA) multi-agent techniques on a wide array of benchmarks.