GoSum: Extractive Summarization of Long Documents by Reinforcement Learning and Graph Organized discourse state

  • 2023-01-20 06:13:46
  • Junyi Bian, Xiaodi Huang, Hong Zhou, Shanfeng Zhu


Extracting summaries from long documents can be regarded as sentence classification using the structural information of the documents. How to use such structural information to summarize a document is challenging. In this paper, we propose GoSum, a novel graph and reinforcement learning based extractive model for long-paper summarization. In particular, GoSum encodes sentence states in reinforcement learning by building a heterogeneous graph for each input document at different discourse levels. An edge in the graph reflects the discourse hierarchy of a document, restraining semantic drift across section boundaries. We evaluate GoSum on two datasets of scientific article summarization: PubMed and arXiv. The experimental results demonstrate that GoSum achieves state-of-the-art results compared with strong baselines of both extractive and abstractive models. The ablation studies further validate that the performance of GoSum benefits from the use of discourse information.
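
To make the idea of a discourse-level heterogeneous graph concrete, the following is a minimal sketch and not the authors' implementation. It assumes only two node types (sections and sentences) and a single edge type linking each sentence to its own section; the function name `build_discourse_graph` and the tuple-based node encoding are hypothetical choices made for illustration. Since the abstract does not specify the node or edge types GoSum actually uses, this shows only how hierarchy edges can keep sentences tied to their own section.

```python
def build_discourse_graph(sections):
    """Build a toy two-level graph from a document.

    `sections` is a list of sections, each a list of sentence strings.
    Node and edge types here are illustrative assumptions, not GoSum's.
    Returns (nodes, edges), where every edge is (section_node, sentence_node).
    """
    nodes = []
    edges = []
    sent_id = 0  # sentence ids run over the whole document
    for sec_id, sentences in enumerate(sections):
        sec_node = ("section", sec_id)
        nodes.append(sec_node)
        for _ in sentences:
            sent_node = ("sentence", sent_id)
            nodes.append(sent_node)
            # A sentence is linked only to the section that contains it,
            # so no edge crosses a section boundary.
            edges.append((sec_node, sent_node))
            sent_id += 1
    return nodes, edges


# Hypothetical two-section document used only to exercise the function.
doc = [
    ["We study summarization.", "Long papers are hard."],
    ["We evaluate on PubMed and arXiv."],
]
nodes, edges = build_discourse_graph(doc)
```

In a full model, each node would carry a learned embedding and a graph neural network would pass messages along these edges; that part is omitted here because the abstract gives no details about it.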


