EEV: A Large-Scale Dataset for Studying Evoked Expressions from Video

Abstract

Videos can evoke a range of affective responses in viewers. The ability to predict evoked affect from a video, before viewers watch the video, can help in content creation and video recommendation. We introduce the Evoked Expressions from Videos (EEV) dataset, a large-scale dataset for studying viewer responses to videos. Each video is annotated at 6 Hz with 15 continuous evoked expression labels, corresponding to the facial expression of viewers who reacted to the video. We use an expression recognition model within our data collection framework to achieve scalability. In total, there are 36.7 million annotations of viewer facial reactions to 23,574 videos (1,700 hours). We use a publicly available video corpus to obtain a diverse set of video content. We establish baseline performance on the EEV dataset using an existing multimodal recurrent model. Transfer learning experiments show an improvement in performance on the LIRIS-ACCEDE video dataset when pre-trained on EEV. We hope that the size and diversity of the EEV dataset will encourage further explorations in video understanding and affective computing. A subset of EEV is released at https://github.com/google-research-datasets/eev.
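
As a rough consistency check on the figures above, the following sketch recomputes the annotation count from the stated duration and sampling rate. It assumes that one "annotation" is one 6 Hz timestep carrying all 15 expression scores, which the abstract does not state explicitly; the function and variable names are illustrative and are not part of the EEV release.

```python
# Figures quoted in the abstract (the hour count is rounded).
HOURS_OF_VIDEO = 1_700
ANNOTATION_RATE_HZ = 6
NUM_VIDEOS = 23_574


def annotated_timesteps(hours: float, rate_hz: float) -> int:
    """Number of annotated timesteps for a given duration and sampling rate."""
    return int(hours * 3600 * rate_hz)


# Assumption: one annotation = one timestep with a 15-dimensional label vector.
timesteps = annotated_timesteps(HOURS_OF_VIDEO, ANNOTATION_RATE_HZ)
mean_video_minutes = HOURS_OF_VIDEO * 60 / NUM_VIDEOS

print(timesteps)                     # 36720000, i.e. about 36.7 million
print(round(mean_video_minutes, 1))  # 4.3
```

Under that assumption the stated totals agree with each other, and the average video is a little over four minutes long.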
