A Survey of Data Quality Measurement and Monitoring Tools

  • 2019-07-18 16:19:52
  • Lisa Ehrlinger, Elisa Rusz, Wolfram Wöß

Abstract

High-quality data is key to interpretable and trustworthy data analytics and the basis for meaningful data-driven decisions. In practical scenarios, data quality is typically associated with data preprocessing, profiling, and cleansing for subsequent tasks like data integration or data analytics. However, from a scientific perspective, a lot of research has been published about the measurement (i.e., the detection) of data quality issues and different generally applicable data quality dimensions and metrics have been discussed. In this work, we close the gap between research into data quality measurement and practical implementations by investigating the functional scope of current data quality tools. With a systematic search, we identified 667 software tools dedicated to "data quality", from which we evaluated 13 tools with respect to three functionality areas: (1) data profiling, (2) data quality measurement in terms of metrics, and (3) continuous data quality monitoring. We selected the evaluated tools with regard to pre-defined exclusion criteria to ensure that they are domain-independent, provide the investigated functions, and are evaluable freely or as trial. This survey aims at a comprehensive overview on state-of-the-art data quality tools and reveals potential for their functional enhancement. Additionally, the results allow a critical discussion on concepts, which are widely accepted in research, but hardly implemented in any tool observed, for example, generally applicable data quality metrics.

 
