RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval

  • 2024-01-31 18:30:21
  • Parth Sarthi, Salman Abdullah, Aditi Tuli, Shubh Khanna, Anna Goldie, Christopher D. Manning

Abstract

Retrieval-augmented language models can better adapt to changes in world state and incorporate long-tail knowledge. However, most existing methods retrieve only short contiguous chunks from a retrieval corpus, limiting holistic understanding of the overall document context. We introduce the novel approach of recursively embedding, clustering, and summarizing chunks of text, constructing a tree with differing levels of summarization from the bottom up. At inference time, our RAPTOR model retrieves from this tree, integrating information across lengthy documents at different levels of abstraction. Controlled experiments show that retrieval with recursive summaries offers significant improvements over traditional retrieval-augmented LMs on several tasks. On question-answering tasks that involve complex, multi-step reasoning, we show state-of-the-art results; for example, by coupling RAPTOR retrieval with the use of GPT-4, we can improve the best performance on the QuALITY benchmark by 20% in absolute accuracy.
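
The embed, cluster, summarize loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every component is a hypothetical stand-in chosen to keep the example self-contained: `embed` is a bag-of-words count vector rather than a learned encoder, `cluster` is a greedy similarity grouping, `summarize` simply truncates text where a real system would call an abstractive language model, and `retrieve` scores all nodes at all levels together, which is only one plausible way to query such a tree.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    text: str
    level: int
    children: list = field(default_factory=list)


def embed(text):
    """Toy embedding (assumption): bag-of-words counts, not a learned encoder."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster(nodes, size):
    """Greedy stand-in for clustering (assumption): seed with the first
    unassigned node, then add its most similar unassigned neighbours."""
    remaining = list(nodes)
    groups = []
    while remaining:
        seed = remaining.pop(0)
        seed_vec = embed(seed.text)
        remaining.sort(key=lambda n: cosine(seed_vec, embed(n.text)), reverse=True)
        groups.append([seed] + remaining[: size - 1])
        remaining = remaining[size - 1 :]
    return groups


def summarize(texts, max_words=12):
    """Placeholder summarizer (assumption): keep the first few words.
    A real system would call an abstractive language model here."""
    words = " ".join(texts).split()
    return " ".join(words[:max_words])


def build_tree(chunks, cluster_size=2):
    """Build the tree bottom-up and return its levels, leaves first."""
    if cluster_size < 2:
        raise ValueError("cluster_size must be at least 2 so each level shrinks")
    level = [Node(c, 0) for c in chunks]
    levels = [level]
    while len(level) > 1:
        next_level = level[0].level + 1
        level = [
            Node(summarize([n.text for n in group]), next_level, group)
            for group in cluster(level, cluster_size)
        ]
        levels.append(level)
    return levels


def retrieve(levels, query, k=2):
    """Score every node at every level against the query; return the top k."""
    q = embed(query)
    all_nodes = [n for lvl in levels for n in lvl]
    return sorted(all_nodes, key=lambda n: cosine(q, embed(n.text)), reverse=True)[:k]
```

Calling `build_tree` on a list of text chunks returns the levels from leaves to root, and `retrieve(levels, "some question")` may return a leaf chunk, an intermediate summary, or the root, depending on which node best matches the query. That mix of granularities is the property the abstract attributes to retrieving from the tree.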

 
