Generalized Object Search

Abstract

Future collaborative robots must be capable of finding objects. As such a fundamental skill, we expect object search to eventually become an off-the-shelf capability for any robot, similar to e.g., object detection, SLAM, and motion planning. However, existing approaches either make unrealistic compromises (e.g., reduce the problem from 3D to 2D), resort to ad-hoc, greedy search strategies, or attempt to learn end-to-end policies in simulation that are yet to generalize across real robots and environments. This thesis argues that through using Partially Observable Markov Decision Processes (POMDPs) to model object search while exploiting structures in the human world (e.g., octrees, correlations) and in human-robot interaction (e.g., spatial language), a practical and effective system for generalized object search can be achieved. In support of this argument, I develop methods and systems for (multi-)object search in 3D environments under uncertainty due to limited field of view, occlusion, noisy, unreliable detectors, spatial correlations between objects, and possibly ambiguous spatial language (e.g., "The red car is behind Chase Bank"). Besides evaluation in simulators such as PyGame, AirSim, and AI2-THOR, I design and implement a robot-independent, environment-agnostic system for generalized object search in 3D and deploy it on the Boston Dynamics Spot robot, the Kinova MOVO robot, and the Universal Robots UR5e robotic arm, to perform object search in different environments. The system enables, for example, a Spot robot to find a toy cat hidden underneath a couch in a kitchen area in under one minute. This thesis also broadly surveys the object search literature, proposing taxonomies in object search problem settings, methods and systems.
