Abstract
We introduce string2string, an open-source library that offers a comprehensive suite of efficient algorithms for a broad range of string-to-string problems. It includes traditional algorithmic solutions as well as recent neural approaches to problems in string alignment, distance measurement, lexical and semantic search, and similarity analysis. It also provides visualization tools and metrics that facilitate the interpretation and analysis of these methods. Notable algorithms featured in the library include the Smith-Waterman algorithm for pairwise local alignment, the Hirschberg algorithm for global alignment, the Wagner-Fischer algorithm for edit distance, BARTScore and BERTScore for similarity analysis, the Knuth-Morris-Pratt algorithm for lexical search, and Faiss for semantic search. In addition, it wraps existing efficient and widely used implementations of certain frameworks and metrics, such as sacreBLEU and ROUGE, where appropriate. Overall, the library aims to provide broader coverage and greater flexibility than existing string libraries. It can be used for many downstream applications, tasks, and problems in natural-language processing, bioinformatics, and computational social sciences. It is implemented in Python, installable via pip, and accessible through a simple API. Source code, documentation, and tutorials are all available on our GitHub page: https://github.com/stanfordnlp/string2string.
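As a concrete illustration of one of the classical methods named above, the following is a minimal sketch of the Wagner-Fischer dynamic program for edit distance, assuming unit costs for insertion, deletion, and substitution. It is written in plain Python and does not use or reproduce the string2string API; the function name `edit_distance` is hypothetical and chosen only for this example.

```python
def edit_distance(source: str, target: str) -> int:
    """Levenshtein distance via the Wagner-Fischer dynamic program.

    Illustrative sketch only (unit costs assumed); this is not the
    library's implementation.
    """
    # previous[j] holds the distance between the processed prefix of
    # `source` and the first j characters of `target`.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        # Turning the first i characters of `source` into "" takes i deletions.
        current = [i]
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

Keeping only two rows of the table reduces memory to O(min-row length) while the running time stays proportional to the product of the two string lengths.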