Multi-agent behavior modeling aims to understand the interactions that occur between agents. We present a multi-agent dataset from behavioral neuroscience, the Caltech Mouse Social Interactions (CalMS21) Dataset. Our dataset consists of trajectory data of social interactions, recorded from videos of freely behaving mice in a standard resident-intruder assay. The CalMS21 dataset is part of the Multi-Agent Behavior Challenge 2021, and as a next step we aim to incorporate datasets from other domains studying multi-agent behavior. To help accelerate behavioral studies, the CalMS21 dataset provides a benchmark to evaluate the performance of automated behavior classification methods in three settings: (1) training on large behavioral datasets all annotated by a single annotator, (2) style transfer to learn inter-annotator differences in behavior definitions, and (3) learning new behaviors of interest given limited training data. The dataset consists of 6 million frames of unlabelled tracked poses of interacting mice, as well as over 1 million frames with tracked poses and corresponding frame-level behavior annotations. The challenge of our dataset is to classify behaviors accurately using both labelled and unlabelled tracking data, and to generalize to new annotators and behaviors.