Abstract
Imitation learning advances robot capabilities by enabling the acquisition of diverse behaviors from human demonstrations. However, large-scale datasets used for policy training often introduce substantial variability in quality, which can negatively impact performance. As a result, automatically curating datasets by filtering out low-quality samples becomes essential. Existing robotic curation approaches rely on costly manual annotations and perform curation at a coarse granularity, such as the dataset or trajectory level, failing to account for the quality of individual state-action pairs. To address this, we introduce SCIZOR, a self-supervised data curation framework that filters out low-quality state-action pairs to improve the performance of imitation learning policies. SCIZOR targets two complementary sources of low-quality data: suboptimal data, which hinders learning with undesirable actions, and redundant data, which dilutes training with repetitive patterns. For suboptimal data, SCIZOR leverages a self-supervised task progress predictor to remove samples lacking task progression; for redundant data, it applies a deduplication module operating on a joint state-action representation. Empirically, we show that SCIZOR enables imitation learning policies to achieve higher performance with less data, yielding an average improvement of 15.4% across multiple benchmarks. More information is available at: https://ut-austin-rpl.github.io/SCIZOR/
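
As an illustration of the two filtering stages described above, the following is a minimal sketch and not the SCIZOR implementation. It assumes that a progress predictor has already produced one scalar score per state-action pair and that a joint state-action embedding is available for each pair. The function name `curate`, both thresholds, and the use of cosine similarity for deduplication are hypothetical choices made only for this example.

```python
import numpy as np


def curate(progress, embeddings, progress_threshold=0.0, sim_threshold=0.99):
    """Return indices of state-action pairs to keep (illustrative only).

    progress:   per-sample predicted task progress (assumed precomputed).
    embeddings: per-sample joint state-action features (assumed precomputed).
    Both thresholds are hypothetical values, not taken from the paper.
    """
    progress = np.asarray(progress, dtype=float)
    emb = np.asarray(embeddings, dtype=float)

    # Stage 1 (suboptimal data): keep only samples whose predicted
    # progress exceeds the threshold.
    candidates = np.flatnonzero(progress > progress_threshold)

    # Stage 2 (redundant data): greedy deduplication. A sample is dropped
    # if its cosine similarity to any already-kept sample is too high.
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.clip(norms, 1e-12, None)

    kept = []
    for i in candidates:
        if kept and np.max(unit[kept] @ unit[i]) >= sim_threshold:
            continue
        kept.append(int(i))
    return kept


if __name__ == "__main__":
    progress = [0.5, -0.1, 0.4, 0.3]
    embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.001], [0.0, 1.0]]
    print(curate(progress, embeddings))
```

In this toy input, the second sample is removed for lacking progress and the third is removed as a near-duplicate of the first. The greedy loop is quadratic in the number of retained samples, so a large dataset would likely need clustering or approximate nearest-neighbor search instead.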