Towards General Purpose Vision Systems

  • 2021-04-01 19:35:21
  • Tanmay Gupta, Amita Kamath, Aniruddha Kembhavi, Derek Hoiem
  • 73

Abstract

A special purpose learning system assumes knowledge of admissible tasks at design time. Adapting such a system to unforeseen tasks requires architecture manipulation such as adding an output head for each new task or dataset. In this work, we propose a task-agnostic vision-language system that accepts an image and a natural language task description and outputs bounding boxes, confidences, and text. The system supports a wide range of vision tasks such as classification, localization, question answering, captioning, and more. We evaluate the system's ability to learn multiple skills simultaneously, to perform tasks with novel skill-concept combinations, and to learn new skills efficiently and without forgetting.
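
The task-agnostic interface described in the abstract (an image plus a natural language task description in; bounding boxes, confidences, and text out) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the paper: `TaskOutput`, `EchoSystem`, `run_tasks`, and the `(x_min, y_min, x_max, y_max)` box convention are assumptions made for illustration, and `EchoSystem` is a placeholder that performs no inference. The point is only that a single call signature can serve every task without a per-task output head.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple

# Assumed box convention (not from the paper): (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class TaskOutput:
    """One output format shared by every task: boxes, confidences, and text."""
    boxes: List[Box] = field(default_factory=list)
    confidences: List[float] = field(default_factory=list)
    text: str = ""


class EchoSystem:
    """Hypothetical placeholder for a trained model; it does no real inference."""

    def predict(self, image: Sequence[Sequence[int]], task: str) -> TaskOutput:
        # The image is assumed to be a row-major grid of pixel values.
        height = len(image)
        width = len(image[0]) if height else 0
        # Return a single full-image box with zero confidence and echo the task.
        return TaskOutput(
            boxes=[(0.0, 0.0, float(width), float(height))],
            confidences=[0.0],
            text=f"(no model) {task}",
        )


def run_tasks(system, image, tasks: Sequence[str]) -> Dict[str, TaskOutput]:
    """Run several task descriptions through the same interface, with no per-task head."""
    return {task: system.predict(image, task) for task in tasks}


if __name__ == "__main__":
    image = [[0] * 4 for _ in range(3)]  # dummy 3x4 image
    results = run_tasks(EchoSystem(), image, ["What color is the car?", "Locate the dog."])
    for task, output in results.items():
        print(task, output)
```

Under this sketch, supporting a new task means passing a new task description rather than changing the architecture, which is the contrast with special purpose systems that the abstract draws.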

 
