Advances in computer vision and increasingly widespread video-based behavioral monitoring have great potential for transforming how we study animal cognition and behavior. However, there is still a fairly large gap between the exciting prospects and what can actually be achieved in practice today, especially in videos from the wild. With this perspective paper, we want to contribute towards closing this gap by guiding behavioral scientists in what can be expected from current methods and steering computer vision researchers towards problems that are relevant to advancing research in animal behavior. We start with a survey of state-of-the-art methods for computer vision problems that are directly relevant to the video-based study of animal behavior, including object detection, multi-individual tracking, (inter)action recognition, and individual identification. We then review methods for effort-efficient learning, which addresses one of the biggest challenges from a practical perspective. Finally, we close with an outlook on the future of the emerging field of computer vision for animal behavior, where we argue that the field should move quickly beyond the common frame-by-frame processing and treat video as a first-class citizen.