We introduce and tackle the problem of zero-shot object detection (ZSD), which aims to detect object classes that are not observed during training. We work with a challenging set of object classes, not restricting ourselves to similar and/or fine-grained categories, in contrast to prior work on zero-shot classification. We follow a principled approach by first adapting visual-semantic embeddings for ZSD. We then discuss the problems associated with selecting a background class and motivate two background-aware approaches for learning robust detectors. One of these models uses a fixed background class and the other is based on iterative latent assignments. We also outline the challenge associated with using a limited number of training classes and propose a solution based on dense sampling of the semantic label space using auxiliary data with a large number of categories. We propose novel splits of two standard detection datasets, MSCOCO and VisualGenome, and discuss extensive empirical results to highlight the benefits of the proposed methods. We provide useful insights into the algorithm and conclude by posing some open questions to encourage further research.