TRIREME: Sailing through Flows of Big Data
Herald Kllapi
Lefteris Stamatogiannakis
Manolis Tsangaris
Yannis Ioannidis
Published In: HDMS, Athens, Greece, July 2014
Conference Article

In the era of Big Data, datasets are growing much faster than the processing power of a single machine. In the long run, distributed computing appears to be the only viable way to store and process such vast amounts of data. Clouds have become a very attractive platform for large-scale data processing because of their elasticity: additional resources can be leased for larger datasets or more complex processing. In a cloud environment, the monetary cost of using those resources is an important consideration.

In this paper, we introduce Trireme, a system for large-scale elastic data processing on the cloud. The system offers a declarative language based on SQL that is extended with user-defined functions (UDFs) and an inverted syntax to easily and declaratively express complex computation. Users can extend the functionality of the system by writing new UDFs using a clear and simple interface. Trireme is designed to take advantage of the elasticity of clouds by offering tradeoffs between the running time and monetary cost of using the resources. We present the system design along with its main components, the language abstractions, and the optimization techniques that we use. Finally, we present the results of several large-scale experiments that show the effectiveness of the system.
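
To make the idea of SQL extended with user-defined functions concrete, here is a minimal sketch using Python's standard sqlite3 module. It is not Trireme's language or UDF interface, neither of which this abstract specifies; the function `word_count`, the table `docs`, and the sample rows are all hypothetical and chosen only for illustration.

```python
import sqlite3


def word_count(text):
    """Hypothetical UDF: count whitespace-separated tokens in a string."""
    # SQLite passes SQL NULL as None; treat it as an empty document.
    return len(text.split()) if text is not None else 0


def run_query():
    conn = sqlite3.connect(":memory:")
    # Register the Python function so that SQL queries can call it by name.
    conn.create_function("word_count", 1, word_count)
    conn.execute("CREATE TABLE docs (id INTEGER, body TEXT)")
    conn.executemany(
        "INSERT INTO docs VALUES (?, ?)",
        [(1, "sailing through flows"), (2, "big data"), (3, None)],
    )
    # The UDF is used like any built-in scalar function.
    rows = conn.execute(
        "SELECT id, word_count(body) FROM docs ORDER BY id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run_query())
```

The query itself stays declarative: the user states what to compute per row, and the custom logic lives in a small function registered with the engine.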

