Vision science, particularly machine vision, has been revolutionized by introducing large-scale image datasets and statistical learning approaches. Yet, human neuroimaging studies of visual perception still rely on small numbers of images (around 100) due to time-constrained experimental procedures. To apply statistical learning approaches that integrate neuroscience, the number of images used in neuroimaging must be significantly increased. We present BOLD5000, a human functional MRI (fMRI) study that includes almost 5,000 distinct images depicting real-world scenes. Beyond dramatically increasing image dataset size relative to prior fMRI studies, BOLD5000 also accounts for image diversity, overlapping with standard computer vision datasets by incorporating images from the Scene UNderstanding (SUN), Common Objects in Context (COCO), and ImageNet datasets. The scale and diversity of these image datasets, combined with a slow event-related fMRI design, enable fine-grained exploration into the neural representation of a wide range of visual features, categories, and semantics. Concurrently, BOLD5000 brings us closer to realizing Marr's dream of a singular vision science - the intertwined study of biological and computer vision.