Abstract
Equipping multi-fingered robots with tactile sensing is crucial for achieving the precise, contact-rich, and dexterous manipulation that humans excel at. However, relying solely on tactile sensing fails to provide adequate cues for reasoning about objects' spatial configurations, limiting the ability to correct errors and adapt to changing situations. In this paper, we present Tactile Adaptation from Visual Incentives (TAVI), a new framework that enhances tactile-based dexterity by optimizing dexterous policies using vision-based rewards. First, we use a contrastive-based objective to learn visual representations. Next, we construct a reward function using these visual representations through optimal-transport based matching on one human demonstration. Finally, we use online reinforcement learning on our robot to optimize tactile-based policies that maximize the visual reward. On six challenging tasks, such as peg pick-and-place, unstacking bowls, and flipping slender objects, TAVI achieves a success rate of 73% using our four-fingered Allegro robot hand. The increase in performance is 108% higher than policies using tactile and vision-based rewards and 135% higher than policies without tactile observational input. Robot videos are best viewed on our project website: https://see-to-touch.github.io/.
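To make the optimal-transport reward step concrete, the following is a minimal sketch of how per-timestep rewards could be derived by matching a robot rollout to a single demonstration in embedding space. It is an illustration under stated assumptions, not the implementation used in this work: the cosine cost, the entropy-regularized Sinkhorn solver, the uniform marginals, the value of `eps`, and all function names (`cosine_cost`, `sinkhorn_plan`, `ot_rewards`) are hypothetical choices, and the input arrays stand in for the outputs of a contrastively trained visual encoder.

```python
import numpy as np


def cosine_cost(a, b):
    """Pairwise cosine distance between rows of a (T x D) and b (T' x D)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return 1.0 - a @ b.T


def sinkhorn_plan(cost, eps=0.05, n_iters=200):
    """Entropy-regularized OT plan with uniform marginals (assumed setup)."""
    n, m = cost.shape
    mu = np.full(n, 1.0 / n)  # mass on each robot frame
    nu = np.full(m, 1.0 / m)  # mass on each demonstration frame
    kernel = np.exp(-cost / eps)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = mu / (kernel @ v)
        v = nu / (kernel.T @ u)
    return u[:, None] * kernel * v[None, :]


def ot_rewards(robot_emb, demo_emb, eps=0.05):
    """Reward per robot timestep: negative cost transported from that frame."""
    cost = cosine_cost(robot_emb, demo_emb)
    plan = sinkhorn_plan(cost, eps=eps)
    return -np.sum(plan * cost, axis=1)
```

Under these assumptions, a rollout whose visual embeddings track the demonstration closely receives rewards near zero, while a rollout that drifts away receives more negative rewards. An online reinforcement learning algorithm could then maximize the sum of these rewards while the policy itself acts on tactile observations.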