Elastic Data Processing Flows on the Cloud (PhD Proposal)
University: University of Athens
The PhD candidate Herald Kllapi will publicly present the proposal of his thesis in the presence of his three-member committee: Yannis Ioannidis (Thesis Supervisor), Dimitris Achlioptas, Alex Delis.
Elasticity is, in principle, a good idea because it allows better use of resources: customers save money and the service provider achieves higher utilization. Finding tradeoffs between completion time and monetary cost is essential in a Cloud environment, and Cloud-enabled data processing platforms should offer the ability to select the best tradeoff. The obvious questions are: (1) Does elasticity exist? (2) Can it be had at an overhead that makes it worthwhile?

Our first contribution is to demonstrate that significant elasticity exists in a number of common tasks, even when the cloud computation is modeled at a very high level of abstraction, such as MapReduce. Moreover, we show that elasticity can be discovered in practice using highly scalable and efficient algorithms, and that there appear to be certain simple “rules of thumb” for when elasticity is present.

It is natural to expect that more refined models of the cloud computation would allow further optimization and the extraction of more elasticity. At the same time, it is reasonable to be concerned about whether the added complexity of a refined model would still allow these optimizations to be performed. Our second contribution is to demonstrate that there exists a very fertile middle ground in terms of abstraction, which enables the extraction of much more elasticity than is possible under MapReduce while remaining computationally tractable.
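To make the time/money tradeoff concrete, the following is a minimal sketch in Python, not the method developed in the thesis. It assumes a handful of hypothetical execution plans, each with an invented completion time and monetary cost, and keeps only the plans that no other plan beats on both criteria (the Pareto frontier). The names `Plan`, `pareto_frontier`, and `PLANS`, and all the numbers, are illustrative assumptions.

```python
from typing import List, NamedTuple


class Plan(NamedTuple):
    """A hypothetical execution plan; all values below are made up."""
    name: str
    time_s: float    # completion time in seconds
    cost_usd: float  # monetary cost in dollars


def pareto_frontier(plans: List[Plan]) -> List[Plan]:
    """Return the plans not dominated in both time and cost.

    Plans are scanned from fastest to slowest; a plan is kept only if it
    is strictly cheaper than every faster plan seen so far.
    """
    ordered = sorted(plans, key=lambda p: (p.time_s, p.cost_usd))
    frontier: List[Plan] = []
    best_cost = float("inf")
    for plan in ordered:
        if plan.cost_usd < best_cost:
            frontier.append(plan)
            best_cost = plan.cost_usd
    return frontier


# Illustrative inputs only: not measurements from any real system.
PLANS = [
    Plan("1 VM", 3600.0, 1.0),
    Plan("4 VMs", 1000.0, 1.2),
    Plan("8 VMs", 900.0, 2.5),
    Plan("8 VMs, poor schedule", 1100.0, 3.0),
]

if __name__ == "__main__":
    for plan in pareto_frontier(PLANS):
        print(f"{plan.name}: {plan.time_s:.0f} s, ${plan.cost_usd:.2f}")
```

In this toy setting, a workload is elastic when the frontier holds several plans spread over a wide range of times and costs, so that a user can meaningfully trade one for the other. A frontier with a single plan would offer no tradeoff to select.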