The Query Translation Landscape: a Survey

Abstract

Whereas the availability of data has seen a manyfold increase in past years, its value can only be shown if the data variety is effectively tackled, which is one of the prominent Big Data challenges. The lack of data interoperability limits the potential of its collective use for novel applications. Achieving interoperability through the full transformation and integration of diverse data structures remains an ideal that is hard, if not impossible, to achieve. Instead, methods that can simultaneously interpret different types of data available in different data structures and formats have been explored. On the other hand, many query languages have been designed to enable users to interact with the data, from relational, to object-oriented, to hierarchical, to the multitude of emerging NoSQL languages. Therefore, the interoperability issue could be solved not by enforcing physical data transformation, but by looking at techniques that are able to query heterogeneous sources using one uniform language. Both industry and research communities have been keen to develop such techniques, which require the translation of a chosen 'universal' query language to the various data-model-specific query languages that make the underlying data accessible. In this article, we survey more than forty query translation methods and tools for popular query languages, and classify them according to eight criteria. In particular, we study which query language is the most suitable candidate for that 'universal' query language. Further, the results enable us to discover the weakly addressed and unexplored translation paths, to discover gaps and to learn lessons that can benefit future research in the area.
