Thesis defense: Dimitris Theodosakis
The public thesis defense of M.Sc. student Dimitris Theodosakis will take place in classroom Α56 on Thursday 9 July at 16:45.
The title and summary of Mr. Theodosakis's thesis are given below.
Query Optimization Techniques in Cloud Computing
In database systems, query optimization is a vital process: it aims to find execution plans that satisfy specific requirements within a search space that is usually very large. It can be likened to a game of chess, since some decisions are costly in the short term but later prove particularly apt as part of a good overall strategy.
Because of the ever-increasing volume of data that must be processed by a wide variety of applications, distributed database systems running on cloud computing infrastructure, and therefore query optimizers, are necessary.
In a cloud computing environment, apart from query execution speed, we are equally interested in the cost of the rented resources (virtual machines) being used. Thus, the chosen plan must meet requirements not only on execution time but also on the money paid per time quantum of resource use.
This thesis presents a query optimizer for the Exareme system. The optimizer models the system's behavior at a high level, takes as input the plan of an algebrized SQL query, and finds a satisfying execution plan, which is subsequently passed to the underlying system for execution.
As part of this work, a series of optimization techniques was implemented for the various query scenarios that Exareme is able to answer. More specifically, for a given query we focus on approximating the Pareto skyline that arises in the two-dimensional (time, money) search space. Moreover, a statistical method is presented for finding the most efficient degree of parallelism for executing a given user-defined function (UDF). Finally, a technique is presented that builds indices in advance for future use during query execution, effectively leveraging the potential idle time of virtual machines.
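To illustrate what a Pareto skyline in the time-money space looks like, here is a minimal Python sketch. It is not taken from the thesis or from Exareme: the function name `pareto_skyline`, the tuple layout, and the sample plans are hypothetical. It assumes each candidate plan has already been assigned an estimated execution time and monetary cost, and keeps only the plans that no other plan beats on both dimensions.

```python
from typing import List, Tuple

# Hypothetical plan representation: (name, estimated_time, estimated_money).
Plan = Tuple[str, float, float]


def pareto_skyline(plans: List[Plan]) -> List[Plan]:
    """Return the plans that are not dominated in (time, money).

    A plan is dominated if another plan is no worse in both time and
    money and strictly better in at least one of them.
    """
    # Sort by time, breaking ties by money, so every earlier plan
    # is at least as fast as the current one.
    ordered = sorted(plans, key=lambda p: (p[1], p[2]))
    skyline: List[Plan] = []
    best_money = float("inf")
    for plan in ordered:
        # Keep a plan only if it is strictly cheaper than every
        # plan that is at least as fast.
        if plan[2] < best_money:
            skyline.append(plan)
            best_money = plan[2]
    return skyline


if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    candidates = [
        ("A", 10.0, 5.0),
        ("B", 20.0, 3.0),
        ("C", 15.0, 6.0),
        ("D", 30.0, 3.0),
        ("E", 5.0, 9.0),
    ]
    for name, time, money in pareto_skyline(candidates):
        print(name, time, money)
```

In this example, plan C is slower and more expensive than A, and plan D is slower than B at the same price, so neither belongs to the skyline. The remaining plans represent genuine trade-offs between time and money, and choosing among them depends on the user's requirements.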