Bidirectional Decoding: Improving Action Chunking via Closed-Loop Resampling

Abstract

Predicting and executing a sequence of actions without intermediate replanning, known as action chunking, is increasingly used in robot learning from human demonstrations. Yet, its reported effects on the learned policy are inconsistent: some studies find it crucial for achieving strong results, while others observe decreased performance. In this paper, we first dissect how action chunking impacts the divergence between a learner and a demonstrator. We find that action chunking allows the learner to better capture the temporal dependencies in demonstrations but at the cost of reduced reactivity in stochastic environments. To address this tradeoff, we propose Bidirectional Decoding (BID), a test-time inference algorithm that bridges action chunking with closed-loop operations. BID samples multiple predictions at each time step and searches for the optimal one based on two criteria: (i) backward coherence, which favors samples that align with previous decisions; (ii) forward contrast, which seeks samples of high likelihood for future plans. By coupling decisions within and across action chunks, BID promotes consistency over time while maintaining reactivity to unexpected changes. Experimental results show that BID boosts the performance of two state-of-the-art generative policies across seven simulation benchmarks and two real-world tasks. Code and videos are available at https://bid-robot.github.io.
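
To make the sample-and-select idea concrete, the following is a minimal sketch of choosing one action chunk among several sampled candidates. The abstract does not specify how either criterion is computed, so everything below is an assumption for illustration: the function name `select_chunk`, the `decay` weight, the use of Euclidean distances, and the choice to approximate "high likelihood for future plans" by closeness to a set of `positives` and distance from a set of `negatives`. It should not be read as the released implementation.

```python
import numpy as np


def select_chunk(candidates, prev_chunk, positives, negatives, decay=0.5):
    """Pick one action chunk from `candidates` (hypothetical helper).

    candidates: (N, H, D) chunks sampled at the current time step.
    prev_chunk: (H, D) chunk chosen at the previous step, or None.
    positives:  (P, H, D) reference chunks treated as likely plans.
    negatives:  (Q, H, D) reference chunks treated as unlikely plans.
    Returns the index of the candidate with the lowest combined score.
    """
    n, horizon, _ = candidates.shape

    # Backward coherence (assumed form): candidate step t should match
    # step t + 1 of the previous chunk, with nearer steps weighted more.
    backward = np.zeros(n)
    if prev_chunk is not None and horizon > 1:
        overlap = horizon - 1
        weights = decay ** np.arange(overlap)
        diff = candidates[:, :overlap] - prev_chunk[1:]
        backward = (weights * np.linalg.norm(diff, axis=-1)).sum(axis=1)

    # Forward contrast (assumed form): stay close to the positive set
    # and far from the negative set.
    def mean_dist(refs):
        diff = candidates[:, None] - refs[None]  # shape (N, R, H, D)
        return np.linalg.norm(diff, axis=-1).sum(axis=-1).mean(axis=1)

    forward = mean_dist(positives) - mean_dist(negatives)

    return int(np.argmin(backward + forward))
```

In a closed-loop rollout under these assumptions, one would call `select_chunk` at every time step, execute only the first action of the chosen chunk, and pass that chunk as `prev_chunk` at the next step.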
