Natural Language Understanding with the Quora Question Pairs Dataset

  • 2019-07-01 19:48:34
  • Lakshay Sharma, Laura Graesser, Nikita Nangia, Utku Evci

Abstract

This paper explores the task of Natural Language Understanding (NLU) by looking at duplicate question detection in the Quora dataset. We conducted extensive exploration of the dataset and used various machine learning models, including linear and tree-based models. We found that a simple Continuous Bag of Words neural network model had the best performance, outperforming more complicated recurrent and attention-based models. We also conducted error analysis and found some subjectivity in the labeling of the dataset.
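To make the Continuous Bag of Words idea concrete, the following is a minimal sketch, not the authors' model: each question is encoded as the average of its word vectors, so word order is discarded, and two questions are compared by cosine similarity. The function names, the embedding size, and the pseudo-random "embeddings" are all assumptions for illustration; a trained model would use learned embeddings and feed the pair representation to a classifier.

```python
import math
import random

DIM = 8  # toy embedding size (assumption, chosen only for illustration)


def toy_embedding(word, dim=DIM):
    """Deterministic pseudo-random vector standing in for a learned embedding."""
    rng = random.Random(word)  # seeded by the word, so the same word maps to the same vector
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def cbow_encode(question, dim=DIM):
    """Encode a question as the mean of its word vectors (word order is ignored)."""
    tokens = question.lower().split()
    if not tokens:
        return [0.0] * dim
    vectors = [toy_embedding(t, dim) for t in tokens]
    return [sum(column) / len(tokens) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


if __name__ == "__main__":
    q1 = cbow_encode("how do i learn python")
    q2 = cbow_encode("python learn i do how")  # same words, different order
    print(cosine(q1, q2))
```

Because the encoding is an average, reordering the words of a question leaves its representation essentially unchanged, which is the main simplification relative to the recurrent and attention-based models mentioned in the abstract.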

 
