Shared Execution of Recurring Workloads in MapReduce
Chuan Lei, Zhongfang Zhuang, Elke A. Rundensteiner, and Mohamed Eltabakh
In this work, we propose Helix, the first scalable multi-query sharing engine tailored for recurring workloads in the MapReduce infrastructure. Helix deploys new sliced window-alignment techniques to create sharing opportunities among recurring queries without introducing additional I/O overhead or unnecessary data scans.
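As a rough illustration of the general idea behind window slicing, the following minimal sketch cuts the time axis at every window boundary of every recurring query, so that each window is an exact union of slices and a slice covered by several windows needs to be scanned only once. This is not Helix's actual algorithm: the representation of a query as a (window, slide) pair, the fixed horizon, and all function names are assumptions made for illustration.

```python
from typing import List, Set, Tuple

Slice = Tuple[int, int]  # half-open interval [start, end)


def window_edges(win: int, slide: int, horizon: int) -> Set[int]:
    """Start and end points of all full windows of one recurring query.

    Hypothetical model: window k covers [k * slide, k * slide + win).
    Assumes slide <= win, so consecutive windows leave no gaps.
    """
    edges: Set[int] = set()
    start = 0
    while start + win <= horizon:
        edges.add(start)
        edges.add(start + win)
        start += slide
    return edges


def shared_slices(queries: List[Tuple[int, int]], horizon: int) -> List[Slice]:
    """Cut the time axis at every window edge of every query."""
    edges: Set[int] = set()
    for win, slide in queries:
        edges |= window_edges(win, slide, horizon)
    ordered = sorted(edges)
    return list(zip(ordered, ordered[1:]))


def slices_for_window(slices: List[Slice], start: int, end: int) -> List[Slice]:
    """Slices whose union is exactly the window [start, end)."""
    return [s for s in slices if s[0] >= start and s[1] <= end]


# Two hypothetical recurring queries given as (window, slide) in time units.
queries = [(6, 2), (4, 4)]
slices = shared_slices(queries, horizon=12)
window_a = slices_for_window(slices, 0, 6)  # first window of query 1
window_b = slices_for_window(slices, 4, 8)  # second window of query 2
```

In this example the slice [4, 6) belongs to both `window_a` and `window_b`, so its data would be read once and reused by both queries rather than scanned separately for each.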