Some of Entity Resolution

  • 2021-10-19 16:23:31
  • Olivier Binette, Rebecca C. Steorts

Abstract

Whether the goal is to estimate the number of people that live in a congressional district, to estimate the number of individuals that have died in an armed conflict, or to disambiguate individual authors using bibliographic data, all these applications have a common theme -- integrating information from multiple sources. Before such questions can be answered, databases must be cleaned and integrated in a systematic and accurate way, commonly known as structured entity resolution (record linkage or de-duplication). In this article, we review motivational applications and seminal papers that have led to the growth of this area. We review modern probabilistic and Bayesian methods in statistics, computer science, machine learning, database management, economics, political science, and other disciplines that are used throughout industry and academia in applications such as human rights, official statistics, medicine, citation networks, among others. Finally, we discuss current research topics of practical importance.

 
