Interpreting and improving natural-language processing (in machines) with natural language-processing (in the brain)

Abstract

Neural network models for NLP are typically implemented without the explicit encoding of language rules, and yet they are able to break one performance record after another. This has generated a lot of research interest in interpreting the representations learned by these networks. We propose here a novel interpretation approach that relies on the only processing system we have that does understand language: the human brain. We use brain imaging recordings of subjects reading complex natural text to interpret word and sequence embeddings from four recent NLP models: ELMo, USE, BERT, and Transformer-XL. We study how their representations differ across layer depth, context length, and attention type. Our results reveal differences in the context-related representations across these models. Further, in the transformer models, we find an interaction between layer depth and context length, and between layer depth and attention type. We finally hypothesize that altering BERT to better align with brain recordings would enable it to also better understand language. Probing the altered BERT using syntactic NLP tasks reveals that the model with increased brain-alignment outperforms the original model. Cognitive neuroscientists have already begun using NLP networks to study the brain, and this work closes the loop to allow the interaction between NLP and cognitive neuroscience to be a true cross-pollination.
