Sampling with Incremental MapReduce
Marc Schafer, Johannes Schildgen, Stefan Deloch
The goal of this paper is to speed up MapReduce jobs by trading accuracy of the result for computation time. Often, timely processing is more important than a precise result. Hadoop has no built-in functionality for this kind of approximation, so users have to implement sampling techniques manually. We introduce an automatic system for computing arithmetic approximations.