Generating Wikipedia by Summarizing Long Sequences

Abstract

We show that generating English Wikipedia articles can be approached as a multi-document summarization of source documents. We use extractive summarization to coarsely identify salient information and a neural abstractive model to generate the article. For the abstractive model, we introduce a decoder-only architecture that can scalably attend to very long sequences, much longer than typical encoder-decoder architectures used in sequence transduction. We show that this model can generate fluent, coherent multi-sentence paragraphs and even whole Wikipedia articles. When given reference documents, we show it can extract relevant factual information as reflected in perplexity, ROUGE scores and human evaluations.
