Demystifying Neural Language Models' Insensitivity to Word-Order

Abstract

Recent research analyzing the sensitivity of natural language understanding models to word-order perturbations has shown that state-of-the-art models in several language tasks may have a unique way of understanding text that can seldom be explained with conventional syntax and semantics. In this paper, we investigate the insensitivity of natural language models to word order by quantifying perturbations and analysing their effect on neural models' performance on language understanding tasks in the GLUE benchmark. Towards that end, we propose two metrics - the Direct Neighbour Displacement (DND) and the Index Displacement Count (IDC) - that score the local and global ordering of tokens in the perturbed texts, and we observe that perturbation functions found in prior literature affect only the global ordering while the local ordering remains relatively unperturbed. We propose perturbations at the granularity of sub-words and characters to study the correlation between DND, IDC and the performance of neural language models on natural language tasks. We find that neural language models - pretrained and non-pretrained Transformers, LSTMs, and Convolutional architectures - require local ordering more so than the global ordering of tokens. The proposed metrics and the suite of perturbations allow a systematic way to study the (in)sensitivity of neural language understanding models to varying degrees of perturbation.
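
As a rough illustration of the local versus global distinction drawn above, the sketch below scores a permutation of token positions in two ways. The functions `local_disruption` and `global_displacement` are hypothetical stand-ins introduced here for illustration only; they are not the paper's definitions of DND and IDC, which the abstract does not spell out. The two example permutations are likewise assumptions chosen to make the contrast visible.

```python
from typing import Sequence


def local_disruption(perm: Sequence[int]) -> float:
    """Illustrative local score (NOT the paper's DND).

    perm[k] is the original index of the token now sitting at position k.
    Returns the fraction of originally adjacent pairs (i, i + 1) that are
    no longer adjacent in that order after the perturbation.
    """
    n = len(perm)
    if n < 2:
        return 0.0
    kept = sum(1 for k in range(n - 1) if perm[k + 1] == perm[k] + 1)
    return 1.0 - kept / (n - 1)


def global_displacement(perm: Sequence[int]) -> float:
    """Illustrative global score (NOT the paper's IDC).

    Returns the total absolute shift in position over all tokens,
    divided by n * (n - 1) so that longer inputs remain comparable.
    """
    n = len(perm)
    if n < 2:
        return 0.0
    total_shift = sum(abs(k - original) for k, original in enumerate(perm))
    return total_shift / (n * (n - 1))


# Hypothetical perturbations of an 8-token input.
block_rotation = [4, 5, 6, 7, 0, 1, 2, 3]  # two halves exchanged
adjacent_swaps = [1, 0, 3, 2, 5, 4, 7, 6]  # every neighbouring pair swapped

for name, perm in [("block rotation", block_rotation),
                   ("adjacent swaps", adjacent_swaps)]:
    print(name, local_disruption(perm), global_displacement(perm))
```

Under these assumed scores, the block rotation moves every token far from its original position yet keeps almost every neighbour pair intact, while the adjacent swaps move each token by a single position yet break every neighbour pair. This mirrors the kind of contrast between global and local ordering that the abstract describes.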
