Scalable Query Optimization for Efficient Data Processing Using MapReduce
Document Type
Conference Proceeding
Publication Date
8-17-2015
Abstract
MapReduce is widely acknowledged by both industry and academia as an effective programming model for query processing on big data. It is crucial to design an optimizer that finds the most efficient way to execute an SQL query using MapReduce. However, existing work in parallel query processing either falls short of optimizing an SQL query using MapReduce or relies on an optimizer whose time complexity is exponential. Also, industry solutions such as HIVE and YSmart do not optimize the join sequence of an SQL query and cannot guarantee an optimal execution plan. In this paper, we propose a scalable optimizer for SQL queries using MapReduce, named SOSQL. Experiments performed on Google Cloud Platform confirmed the scalability and efficiency of SOSQL over existing work.
Identifier
84959543413 (Scopus)
ISBN
9781467372787
Publication Title
Proceedings - 2015 IEEE International Congress on Big Data, BigData Congress 2015
External Full Text Location
https://doi.org/10.1109/BigDataCongress.2015.100
First Page
649
Last Page
652
Recommended Citation
Shan, Yi and Chen, Yi, "Scalable Query Optimization for Efficient Data Processing Using MapReduce" (2015). Faculty Publications. 6838.
https://digitalcommons.njit.edu/fac_pubs/6838
