A Comprehensive Survey on Vector Database: Storage and Retrieval Technique, Challenge

  • 2023-10-18 05:31:06
  • Yikun Han, Chunjiang Liu, Pengfei Wang

Abstract

A vector database is used to store high-dimensional data that cannot be characterized by a traditional DBMS. Although few articles describe existing vector database architectures or introduce new ones, the approximate nearest neighbor search (ANNS) problem behind vector databases has been studied for a long time, and a considerable number of related algorithmic articles can be found in the literature. This article attempts to comprehensively review the relevant algorithms to provide a general understanding of this booming research area. Our framework categorises these studies by their approach to solving the ANNS problem: hash-based, tree-based, graph-based, and quantization-based. We then present an overview of existing challenges for vector databases. Lastly, we sketch how vector databases can be combined with large language models to provide new possibilities.
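
To make the ANNS problem concrete, the following is a minimal sketch of one hash-based approach, random-hyperplane hashing, set against an exact linear scan. It is an illustration under stated assumptions, not an algorithm taken from the survey: the function names, the 16-dimensional Gaussian data, the 4-bit signature, and the use of squared Euclidean distance for reranking are all hypothetical choices made for the example.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def make_hyperplanes(dim, n_bits, rng):
    # One random Gaussian normal vector per signature bit (assumed scheme).
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, planes):
    # Each bit records which side of a hyperplane the vector falls on.
    return tuple(dot(vec, p) >= 0.0 for p in planes)


def build_index(vectors, planes):
    # Group vector ids by signature; vectors pointing in similar
    # directions tend to share a bucket.
    buckets = {}
    for i, v in enumerate(vectors):
        buckets.setdefault(signature(v, planes), []).append(i)
    return buckets


def exact_nearest(query, vectors):
    # Baseline: scan every stored vector.
    return min(range(len(vectors)),
               key=lambda i: squared_distance(query, vectors[i]))


def approximate_nearest(query, vectors, planes, buckets):
    # Only scan the query's own bucket; may miss the true neighbor.
    candidates = buckets.get(signature(query, planes), [])
    if not candidates:
        return None
    return min(candidates,
               key=lambda i: squared_distance(query, vectors[i]))


rng = random.Random(0)
dim, n_bits = 16, 4  # hypothetical sizes chosen for the example
vectors = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(1000)]
planes = make_hyperplanes(dim, n_bits, rng)
buckets = build_index(vectors, planes)

query = vectors[42]
exact_id = exact_nearest(query, vectors)
approx_id = approximate_nearest(query, vectors, planes, buckets)
```

The approximate search inspects only the vectors in one bucket rather than all 1000, which is the trade the ANNS literature studies: less work per query in exchange for the possibility of missing the true nearest neighbor. Practical systems usually reduce that risk with multiple hash tables or by probing neighboring buckets.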

 
