Many of today's most popular Internet applications, such as Dropbox, Netflix and Instagram, use the cloud computing paradigm to offer online services to users worldwide. Cloud computing relies on data centers equipped with thousands of server machines that collectively handle time-varying application requests from millions of users. In particular, cloud computing offers unprecedented scalability: an application can, for example, elastically scale out to use additional machines as user demand increases. At the same time, cloud computing enables applications to be deployed on an on-demand basis, since application developers use cloud resources only when they need them. One such service model is Infrastructure as a Service (IaaS), which supplies customers with one or more virtual machines, each running on a physical server that can host many virtual machines at once. Examples of IaaS clouds include Google Compute Engine and Amazon Elastic Compute Cloud (EC2).
The enabling technology for cloud computing is virtualization (see, for example, Xen and VMware ESX), where a physical machine is partitioned into one or more virtual machines (VMs), each capable of operating independently as a standalone physical server. Furthermore, VM live migration enables data center managers to move VMs across physical servers with only minimal application disruption. Overall, virtualization enables (a) application consolidation, where multiple virtualized applications run on a single shared server; and (b) an application to span multiple physical resources across the data center. As a result, server loads can be aggregated onto fewer servers, and energy can be saved by switching off the idle ones. Consolidation thus makes more efficient use of server resources and reduces the total number of servers needed to host a given set of applications, since dedicated servers typically provide more resources than each application actually requires.
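To make the consolidation idea concrete, the following minimal sketch packs VM loads onto as few servers as a simple first-fit-decreasing heuristic manages. The heuristic, the function name `consolidate`, the single CPU dimension, and the example loads are all illustrative assumptions; the text above does not prescribe any particular placement algorithm.

```python
from typing import Dict, List


def consolidate(vm_loads: Dict[str, int], capacity: int = 100) -> List[List[str]]:
    """Assign VMs to servers with first-fit decreasing (illustrative only).

    vm_loads maps a hypothetical VM name to its CPU demand, expressed as a
    percentage of one server's capacity. Returns one list of VM names per
    active server; servers that never appear can stay switched off.
    """
    servers: List[List[str]] = []  # VM names placed on each active server
    free: List[int] = []           # remaining capacity of each active server

    # Consider the most demanding VMs first.
    ordered = sorted(vm_loads.items(), key=lambda item: item[1], reverse=True)
    for vm, load in ordered:
        if load > capacity:
            raise ValueError(f"{vm} does not fit on a single server")
        for i, room in enumerate(free):
            if load <= room:
                # First active server with enough room takes the VM.
                servers[i].append(vm)
                free[i] -= load
                break
        else:
            # No active server has room, so power on another one.
            servers.append([vm])
            free.append(capacity - load)
    return servers


# Hypothetical loads: five applications that would otherwise use five servers.
loads = {"web": 50, "db": 40, "cache": 30, "batch": 20, "mail": 10}
placement = consolidate(loads)
print(len(placement), "servers in use:", placement)
```

In this example five lightly loaded applications fit on two servers, so the remaining three could be switched off. A real data center would also have to account for memory, disk and network demands, and for the cost of each live migration.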
Consolidated applications share server resources, such as CPU time, memory and disk space. To operate efficiently, each application must be allocated enough of the hosting server's resources to meet its performance requirements, as measured, for example, by the mean response time (mRT) of its requests. However, adjusting the resource shares of consolidated applications as their demands change over time is challenging.
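The dependence of mRT on the allocated share can be illustrated with a deliberately simple model. The sketch below assumes an M/M/1 queue whose service rate scales linearly with the CPU share; this model, the function names, and the numbers are assumptions made for illustration only and are not taken from the text above.

```python
def mean_response_time(arrival_rate: float, full_service_rate: float,
                       cpu_share: float) -> float:
    """mRT of an M/M/1 queue served at cpu_share * full_service_rate.

    Rates are in requests per second; cpu_share is a fraction in (0, 1].
    Returns infinity when the allocated share cannot keep up with demand.
    """
    effective_rate = cpu_share * full_service_rate
    if arrival_rate >= effective_rate:
        return float("inf")
    return 1.0 / (effective_rate - arrival_rate)


def min_share(arrival_rate: float, full_service_rate: float,
              target_mrt: float) -> float:
    """Smallest CPU share that meets target_mrt under the same M/M/1 assumption."""
    return (arrival_rate + 1.0 / target_mrt) / full_service_rate


# Hypothetical application: 100 req/s at a full CPU, mRT target of 0.1 s.
share_low = min_share(30.0, 100.0, 0.1)   # share needed at 30 req/s
share_high = min_share(60.0, 100.0, 0.1)  # share needed at 60 req/s
print(share_low, share_high)

# Keeping the low-demand share while demand doubles overloads the application.
print(mean_response_time(60.0, 100.0, share_low))
```

Even in this toy model, the share needed to hold the mRT target moves with the arrival rate, and a share that was adequate at low demand leads to an unbounded response time once demand doubles. With several consolidated applications competing for the same server, these per-application requirements must be met jointly.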