Abstract
Large-scale pre-trained language models achieve high performance on standard datasets for natural language inference (NLI). Unfortunately, these evaluations can be misleading: although the models perform well on in-distribution data, they perform poorly on out-of-distribution test sets such as contrast sets. Contrast sets consist of perturbed instances in which very minor but meaningful changes to the input alter the gold label, revealing that models can learn superficial patterns in the training data rather than more sophisticated language nuances. For example, the ELECTRA-small language model achieves nearly 90% accuracy on the SNLI dataset but drops to 75% when tested on an out-of-distribution contrast set. This study explores how the robustness of a language model can be improved by exposing it to small amounts of more complex contrast-set data during training, helping it better learn language patterns. With this approach, the model recovers its performance and achieves nearly 90% accuracy on contrast sets, highlighting the importance of diverse and challenging training data.