The Ginga Self-Adaptive Query Processing
System is a distributed software system that supports adaptive query processing.
When executing a query in a highly unstable runtime environment, the query
processor needs to be highly adaptive in order to cope with unpredictable
runtime situations (e.g., unexpected network delays, latency fluctuations,
memory shortage). Ginga is a Brazilian Portuguese
word typically used to describe a quality that a person needs to have when
dancing samba. Like the quick rhythm and
movements of samba, a query processing system equipped with Ginga
can quickly and efficiently change the execution
of a query plan in order to keep up with the rhythm imposed by runtime
variations in the environment.
Project Goal
Our goal is to keep query execution efficient throughout its run,
taking into account significant changes in the runtime environment.
We are particularly interested in distributed queries with timing constraints
(deadlines), as well as queries that are long running or need to be executed
more than once (e.g., continual queries).
Main Technical Challenges
Processing and optimizing ad-hoc and continual queries
in an open environment pose several technical challenges. First, it is well
known that optimized query execution plans constructed at compile time make
some assumptions about the environment (e.g., network speed, data sources'
availability). When such assumptions no longer hold at runtime, how can we guarantee
that the query execution will still be performed efficiently and meet its deadline?
Second, it is widely recognized that runtime adaptation is a complex and
difficult task in terms of cost and benefit. How can we develop an adaptation
methodology that makes the runtime adaptation beneficial with affordable
cost? Last but not least, are there any viable performance metrics and
performance evaluation techniques for measuring the cost and validating the
benefits of runtime adaptation methods?
Ginga Approach
The Ginga adaptation engine combines a proactive engagement phase before query execution with a reactive control phase during
the execution. Before the query starts, Ginga builds an initial optimized
plan and some alternative execution plans that may be needed due to runtime
variations in the environment. Since there are many potential alternative
plans, we organize them into an adaptation space. During query execution,
Ginga, now in its reactive control phase, monitors system resource availability
(e.g., network connection and bandwidth for distributed queries) by observing
execution progress. It determines when to change the query
plan and adapts by choosing an alternative created in
the proactive phase.
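
The two phases can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not Ginga's actual implementation: the names Plan, AdaptationSpace, choose and run_with_adaptation, the plan names, and the bandwidth thresholds are all hypothetical. As a further simplification, the sketch reads bandwidth samples directly, whereas Ginga infers resource availability from execution progress.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Plan:
    """A query plan tagged with the assumption it was optimized under (hypothetical)."""
    name: str
    min_bandwidth_mbps: float  # lowest bandwidth at which this plan is assumed to work well


@dataclass
class AdaptationSpace:
    """Proactive phase output: the initial plan plus prepared alternatives."""
    initial: Plan
    alternatives: List[Plan] = field(default_factory=list)

    def choose(self, observed_bandwidth_mbps: float) -> Plan:
        plans = [self.initial, *self.alternatives]
        # Keep only the plans whose bandwidth assumption still holds.
        usable = [p for p in plans if p.min_bandwidth_mbps <= observed_bandwidth_mbps]
        if not usable:
            # No assumption holds: fall back to the least demanding plan.
            return min(plans, key=lambda p: p.min_bandwidth_mbps)
        # Assumed policy: the most demanding usable plan is the most efficient one.
        return max(usable, key=lambda p: p.min_bandwidth_mbps)


def run_with_adaptation(space: AdaptationSpace, bandwidth_samples: List[float]) -> List[str]:
    """Reactive control phase: re-check the active plan at each monitoring step."""
    current = space.initial
    history = []
    for observed in bandwidth_samples:
        best = space.choose(observed)
        if best is not current:
            current = best  # switch to an alternative prepared in the proactive phase
        history.append(current.name)
    return history


# Hypothetical adaptation space with one initial plan and two alternatives.
space = AdaptationSpace(
    initial=Plan("remote_hash_join", 50.0),
    alternatives=[Plan("semi_join", 10.0), Plan("cached_local", 0.0)],
)
history = run_with_adaptation(space, [80.0, 60.0, 20.0, 5.0, 70.0])
```

With these sample values, the run starts on the initial plan, moves to the cheaper alternatives as bandwidth drops, and returns to the initial plan once bandwidth recovers. A real engine would also need to weigh the cost of switching plans against the expected benefit, which this sketch ignores.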