Temporal Attentive Alignment for Large-Scale Video Domain Adaptation

Abstract

Although various image-based domain adaptation (DA) techniques have beenproposed in recent years, domain shift in videos is still not well-explored.Most previous works only evaluate performance on small-scale datasets which aresaturated. Therefore, we first propose two large-scale video DA datasets withmuch larger domain discrepancy: UCF-HMDB_full and Kinetics-Gameplay. Second, weinvestigate different DA integration methods for videos, and show thatsimultaneously aligning and learning temporal dynamics achieves effectivealignment even without sophisticated DA methods. Finally, we propose TemporalAttentive Adversarial Adaptation Network (TA3N), which explicitly attends tothe temporal dynamics using domain discrepancy for more effective domainalignment, achieving state-of-the-art performance on four video DA datasets(e.g. 7.9% accuracy gain over "Source only" from 73.9% to 81.8% on "HMDB -->UCF", and 10.3% gain on "Kinetics --> Gameplay"). The code and data arereleased at http://github.com/cmhungsteve/TA3N.

Quick Read (beta)

loading the full paper ...