Annotation Artifacts in Natural Language Inference Data

Abstract

Large-scale datasets for natural language inference are created by presenting crowd workers with a sentence (premise), and asking them to generate three new sentences (hypotheses) that it entails, contradicts, or is logically neutral with respect to. We show that, in a significant portion of such data, this protocol leaves clues that make it possible to identify the label by looking only at the hypothesis, without observing the premise. Specifically, we show that a simple text categorization model can correctly classify the hypothesis alone in about 67% of SNLI (Bowman et al., 2015) and 53% of MultiNLI (Williams et al., 2017). Our analysis reveals that specific linguistic phenomena such as negation and vagueness are highly correlated with certain inference classes. Our findings suggest that the success of natural language inference models to date has been overestimated, and that the task remains a hard open problem.
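
To make the hypothesis-only setup concrete, the following is a minimal sketch of a classifier that never sees the premise. It is not the model used in the paper: it swaps in a small multinomial naive Bayes over whitespace tokens, and the class name HypothesisOnlyNB, the tokenizer, and the nine training sentences are all invented for illustration. They are not drawn from SNLI or MultiNLI and say nothing about the reported accuracies.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for a toy example.
    return text.lower().split()


class HypothesisOnlyNB:
    """Hypothetical multinomial naive Bayes that sees hypotheses only."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                        # add-alpha smoothing constant
        self.label_counts = Counter()             # number of examples per label
        self.token_counts = defaultdict(Counter)  # label -> token -> count
        self.vocab = set()

    def fit(self, hypotheses, labels):
        for hyp, label in zip(hypotheses, labels):
            self.label_counts[label] += 1
            for tok in tokenize(hyp):
                self.token_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, hypothesis):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(count / total)
            denom = (sum(self.token_counts[label].values())
                     + self.alpha * len(self.vocab))
            for tok in tokenize(hypothesis):
                if tok in self.vocab:  # tokens unseen in training are skipped
                    num = self.token_counts[label][tok] + self.alpha
                    score += math.log(num / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy hypotheses (NOT real SNLI/MultiNLI data); premises are omitted
# on purpose, since the point is that the label is guessed without them.
TRAIN = [
    ("nobody is outside", "contradiction"),
    ("the man is not sleeping", "contradiction"),
    ("no one is eating", "contradiction"),
    ("a person is outdoors", "entailment"),
    ("someone is moving", "entailment"),
    ("an animal is outside", "entailment"),
    ("the man is waiting for his friend", "neutral"),
    ("the woman is going to win the race", "neutral"),
    ("the dog is looking for its owner", "neutral"),
]

model = HypothesisOnlyNB().fit([h for h, _ in TRAIN], [y for _, y in TRAIN])
prediction = model.predict("nobody is sleeping")
print(prediction)
```

On this toy data the unseen hypothesis "nobody is sleeping" is assigned to the contradiction class, purely because negation words co-occur with that label in the training hypotheses. This is the kind of cue the abstract describes, reproduced here only at toy scale.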
