Question-Based Retrieval using Atomic Units for Enterprise RAG

Abstract

Enterprise retrieval-augmented generation (RAG) offers a highly flexible framework for combining powerful large language models (LLMs) with internal, possibly temporally changing, documents. In RAG, documents are first chunked. Relevant chunks are then retrieved for a user query and passed as context to a synthesizer LLM to generate the query response. However, the retrieval step can limit performance, as incorrect chunks can lead the synthesizer LLM to generate a false response. This work applies a zero-shot adaptation of standard dense retrieval steps for more accurate chunk recall. Specifically, a chunk is first decomposed into atomic statements. A set of synthetic questions is then generated on these atoms (with the chunk as the context). Dense retrieval involves finding the closest set of synthetic questions, and associated chunks, to the user query. It is found that retrieval with the atoms leads to higher recall than retrieval with chunks. Further performance gain is observed with retrieval using the synthetic questions generated over the atoms. Higher recall at the retrieval step enables higher performance of the enterprise LLM using the RAG pipeline.
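
The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for a real dense encoder, the synthetic questions are hand-written here rather than generated by an LLM from atomic statements, and all names (`SYNTHETIC_QUESTIONS`, `build_index`, `retrieve`, the chunk ids) are hypothetical. The sketch only shows the indexing idea: each synthetic question is embedded and keeps a pointer to its source chunk, and a user query is matched against the questions rather than against the chunk text.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a dense encoder: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Assumption: in a real pipeline an LLM would decompose each chunk into
# atomic statements and generate these questions; here they are hand-written.
SYNTHETIC_QUESTIONS = {
    "chunk_leave": [
        "How many days of annual leave do employees get?",
        "Who approves leave requests?",
    ],
    "chunk_expenses": [
        "What is the deadline for submitting expense reports?",
        "Which receipts are required for travel expenses?",
    ],
}


def build_index(questions_by_chunk):
    """Embed every synthetic question and keep a pointer to its chunk."""
    return [
        (embed(question), question, chunk_id)
        for chunk_id, questions in questions_by_chunk.items()
        for question in questions
    ]


def retrieve(query, index, top_k=1):
    """Rank synthetic questions by similarity to the query and return
    the ids of the first top_k distinct chunks they point to."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda entry: cosine(query_vec, entry[0]), reverse=True)
    chunk_ids = []
    for _, _, chunk_id in ranked:
        if chunk_id not in chunk_ids:
            chunk_ids.append(chunk_id)
        if len(chunk_ids) == top_k:
            break
    return chunk_ids


INDEX = build_index(SYNTHETIC_QUESTIONS)
print(retrieve("How many annual leave days do I get?", INDEX))  # ['chunk_leave']
```

Deduplicating by chunk id matters because several synthetic questions may point to the same chunk; without it, the top-k results could all refer to a single chunk and reduce the effective context passed to the synthesizer LLM.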
