Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study

Abstract

Neural sequence-to-sequence (seq2seq) approaches have proven to be successful in grammatical error correction (GEC). Based on the seq2seq framework, we propose a novel fluency boost learning and inference mechanism. Fluency boosting learning generates diverse error-corrected sentence pairs during training, enabling the error correction model to learn how to improve a sentence's fluency from more instances, while fluency boosting inference allows the model to correct a sentence incrementally with multiple inference steps. Combining fluency boost learning and inference with convolutional seq2seq models, our approach achieves the state-of-the-art performance: 75.72 (F_{0.5}) on CoNLL-2014 10 annotation dataset and 62.42 (GLEU) on JFLEG test set respectively, becoming the first GEC system that reaches human-level performance (72.58 for CoNLL and 62.37 for JFLEG) on both of the benchmarks.
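
The incremental inference idea can be illustrated with a minimal sketch. It is not the paper's implementation: `correct` and `fluency` are hypothetical callables standing in for a seq2seq corrector and a fluency scorer, and the stopping rule (stop once another pass no longer raises the fluency score) is an assumption, since the abstract does not specify one. The toy corrector and scorer exist only to make the example runnable.

```python
from typing import Callable, List


def fluency_boost_inference(
    sentence: str,
    correct: Callable[[str], str],
    fluency: Callable[[str], float],
    max_rounds: int = 5,
) -> List[str]:
    """Re-correct a sentence round by round while its fluency score rises.

    Returns every accepted sentence, starting with the original input.
    """
    history = [sentence]
    for _ in range(max_rounds):
        candidate = correct(history[-1])
        # Assumed stopping rule: quit when a pass brings no fluency gain.
        if fluency(candidate) <= fluency(history[-1]):
            break
        history.append(candidate)
    return history


# Toy stand-ins (hypothetical): each pass repairs at most one known error.
FIXES = [("have went", "have gone"), ("a apple", "an apple")]


def toy_correct(s: str) -> str:
    for wrong, right in FIXES:
        if wrong in s:
            return s.replace(wrong, right, 1)
    return s


def toy_fluency(s: str) -> float:
    # Fewer remaining known errors gives a higher score.
    return 1.0 / (1 + sum(wrong in s for wrong, _ in FIXES))


steps = fluency_boost_inference(
    "I have went to buy a apple", toy_correct, toy_fluency
)
for step in steps:
    print(step)
```

In this toy run a single pass leaves one error behind, and the second pass removes it, which is the situation multi-step inference is meant to handle.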
