My dear reader, how are you? Peace be upon you.
“Success isn’t always about greatness. It’s about consistency. Consistent hard work leads to success. Greatness will come.” – Dwayne Johnson
In this post, I explain the resource management problems and common resource optimization techniques applied in modern cloud-based data centers.
Cloud Computing and Resource Allocations
Cloud computing is the on-demand availability of computer system resources, especially data storage and computing power, without direct active management by the user. Cloud providers typically use a “pay-as-you-go” model, which can lead to unexpected operating expenses if administrators are not familiar with cloud pricing models. The availability of high-capacity networks, low-cost computers, and storage devices, as well as the widespread adoption of hardware virtualization, service-oriented architecture, and autonomic and utility computing, has led to growth in cloud computing [wikipedia].
The goal of cloud computing is to allow users to benefit from all of these technologies without needing deep knowledge of or expertise with each one of them. The cloud aims to cut costs and helps users focus on their core business instead of being impeded by IT obstacles. The main enabling technology for cloud computing is virtualization. Virtualization software separates a physical computing device into one or more “virtual” devices, each of which can be easily used and managed to perform computing tasks.
Virtualization provides the agility required to speed up IT operations and reduces cost by increasing infrastructure utilization. Therefore, the two main objectives for cloud providers are allocating adequate resources to users to ensure quality of service, and keeping all physical resources efficiently utilized.
Resource allocation approaches
There are three main approaches to resource allocation in cloud environments:
- Static
- Dynamic
- Hybrid
In static approaches, scheduling and resource allocation decisions are made in advance, based on the description of the application's dependency graph and of the execution platform. This includes the execution times of all types of tasks on all types of resources, the communication times between any two resources, and the congestion arising from any set of concurrent communications.
In dynamic approaches, scheduling and resource allocation decisions are made at runtime, based on the state of the platform (which computing and communication resources are available), the set of ready tasks (those whose dependencies have all been resolved), and the current location of input data. Dynamic strategies may be either task-driven (allocate a task to a resource as soon as the task becomes ready) or resource-driven (allocate a task to a resource as soon as the resource becomes idle). In both cases, sophisticated mechanisms are added to overlap communications with computations and to determine the relative priorities among ready tasks.
Finally, hybrid approaches combine an initial static allocation with a dynamic strategy that adapts to changes in timing predictions, which may come from bad predictions, concurrent applications running on the platform, or even resource failures. In short, hybrid strategies couple an initial static mapping with a policy to cope with the nondeterminism of both processing and communication times.
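To make the resource-driven dynamic strategy concrete, here is a minimal toy scheduler in Python. All task names, durations, and priorities are invented for illustration; it simply assigns the highest-priority ready task to whichever resource becomes idle first.

```python
import heapq

def resource_driven_schedule(tasks, num_resources):
    """tasks: list of (priority, name, duration); lower priority value = more urgent.
    Returns a list of (name, resource_id, start_time, end_time)."""
    ready = sorted(tasks)  # ready queue ordered by priority
    # min-heap of (time_when_idle, resource_id): earliest-idle resource pops first
    resources = [(0.0, r) for r in range(num_resources)]
    heapq.heapify(resources)
    assignment = []
    for priority, name, duration in ready:
        idle_at, rid = heapq.heappop(resources)      # next resource to become idle
        assignment.append((name, rid, idle_at, idle_at + duration))
        heapq.heappush(resources, (idle_at + duration, rid))
    return assignment

schedule = resource_driven_schedule(
    [(1, "decode", 3.0), (2, "transform", 2.0), (3, "encode", 4.0)], 2)
for name, rid, start, end in schedule:
    print(f"{name} -> resource {rid}: {start}..{end}")
```

A task-driven variant would instead trigger on task readiness; a hybrid strategy would start from a precomputed static mapping and fall back to logic like this when predictions go wrong.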
To summarize, virtualization technology makes the difference between a traditional Data Center (DC) and a cloud one. In fact, hypervisors such as Xen, VMware, and Hyper-V allow physical resources to be shared and provisioned as virtual machines. Thus, in cloud computing, resource provisioning is the process of allocating resources (CPU, memory, disk, network) to a set of applications running on Virtual Machines (VMs).
The problems in resource allocation arise from some inherent complexities of cloud-based systems:
- Dynamism: The workload is not static, so hosts can drift into overload or underload conditions.
- Resource and workload heterogeneity: An application's requirements may not match the hardware of the machine it runs on, which increases the risk of performance degradation.
The different ways to solve resource management problems are correlated to the cloud models. Usually, cloud computing offers three models of service: Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS), and Software-as-a-Service (SaaS). In this post, we are interested in IaaS, where the cloud provider (CP) provides a virtual infrastructure and manages the utilization of Physical Machines (PMs) in order to satisfy certain objectives (energy consumption, load balancing) while avoiding SLA violations.
The techniques to achieve optimal resource allocation include the following:
- Server Consolidation: consolidates running VMs onto a smaller number of PMs.
- VM Placement: places a VM on a proper PM that has the resources it requires.
- Live Migration: the process of moving a running virtual machine or application between different physical machines without disconnecting the client or application. The memory, storage, and network connectivity of the virtual machine are transferred from the original host machine to the destination.
All these approaches are complementary to each other and hence are used together.
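The live-migration step is usually implemented with iterative pre-copy: memory pages are copied to the destination while the VM keeps running, and each round re-copies only the pages dirtied during the previous round, until the remaining set is small enough for a brief stop-and-copy. Here is a toy Python model of that convergence; the page counts and rates are invented, and real hypervisors track dirty pages in hardware rather than with a formula.

```python
def precopy_rounds(total_pages, bandwidth, dirty_rate, stop_threshold):
    """Simulate pre-copy convergence.
    bandwidth and dirty_rate are in pages/second; the loop converges
    only when dirty_rate < bandwidth, so we also cap the round count."""
    rounds, to_copy = [], total_pages
    while to_copy > stop_threshold and len(rounds) < 30:
        rounds.append(to_copy)
        # pages dirtied while this round was being copied
        to_copy = int(dirty_rate * (to_copy / bandwidth))
    return rounds, to_copy  # final to_copy is sent during a brief stop-and-copy

rounds, final = precopy_rounds(total_pages=10_000, bandwidth=1_000,
                               dirty_rate=300, stop_threshold=100)
print(rounds, final)  # each round shrinks as long as dirty_rate < bandwidth
```

The geometric shrinkage is why live migration keeps downtime to milliseconds: only the small final page set is copied while the VM is paused.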
The resource optimization problem can be broken into three sub-problems:
- Detecting over or under-load nodes/hosts/PMs.
- Selecting VMs to migrate.
- Placing VMs on appropriate hosts.
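The three sub-problems above compose into a simple control loop. The sketch below is a hypothetical illustration, using threshold-based overload detection, smallest-VM selection, and first-fit placement; the thresholds, host names, and CPU numbers are all invented, and production systems use far more sophisticated policies.

```python
OVER, UNDER = 0.80, 0.20  # assumed CPU-utilization thresholds

def detect(hosts):
    """hosts: {name: {"cap": cpu_capacity, "vms": {vm_name: cpu_demand}}}
    Returns (overloaded, underloaded) host-name lists."""
    over, under = [], []
    for name, h in hosts.items():
        util = sum(h["vms"].values()) / h["cap"]
        if util > OVER:
            over.append(name)
        elif util < UNDER:
            under.append(name)
    return over, under

def select_vm(host):
    # policy choice: migrate the smallest VM (cheapest to move)
    return min(host["vms"], key=host["vms"].get)

def place(vm_demand, hosts, exclude):
    # first-fit: first host that stays under the overload threshold
    for name, h in hosts.items():
        if name not in exclude and \
           (sum(h["vms"].values()) + vm_demand) / h["cap"] <= OVER:
            return name
    return None

hosts = {
    "pm1": {"cap": 10.0, "vms": {"vm-a": 5.0, "vm-b": 4.0}},  # 90% utilized
    "pm2": {"cap": 10.0, "vms": {"vm-c": 3.0}},               # 30% utilized
}
over, under = detect(hosts)
for src in over:
    vm = select_vm(hosts[src])
    dst = place(hosts[src]["vms"][vm], hosts, exclude={src})
    print(f"migrate {vm} from {src} to {dst}")
```

Replacing first-fit with a bin-packing heuristic such as best-fit decreasing in `place` turns the same loop into a server-consolidation pass, which shows why the three sub-problems are usually solved together.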
Industry insights for resource optimization
Let us now survey some of the existing approaches used by three mainstream cloud providers and consumers: 1) Amazon, 2) Google, and 3) SAP SE.
All of these tech-giants use dynamic resource allocation for resource optimization. Following is a summary of their methodologies:
Amazon
- The application and workload are not known in advance.
- Exploits runtime metrics such as CPU usage, frequency, latency, bandwidth (memory, disk, I/O, network), and GPU usage.
- Workloads are grouped into sets of similar applications by a resource optimization manager.
- The details are in DirectMe
Google Borg (resource management for Google's distributed clusters)
- The application and workload are known in advance.
- Uses average utilization as a metric to allocate resources to workloads.
- The details are in DirectMe
SAP SE
- Optimizes resources to achieve a load-balanced solution.
- Uses a round-robin algorithm for resource allocation.
- The details are in DirectMe
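A round-robin allocator like the one mentioned above can be sketched in a couple of lines. This is a minimal, hypothetical illustration (the VM and PM names are invented, and the actual implementation is not detailed in this post): each incoming VM request is assigned to the next host in a fixed cycle, which spreads load evenly when requests are similar in size.

```python
import itertools

def round_robin_assign(requests, hosts):
    """Assign each VM request to the next host in a repeating cycle."""
    cycle = itertools.cycle(hosts)
    return {vm: next(cycle) for vm in requests}

placement = round_robin_assign(["vm1", "vm2", "vm3", "vm4"],
                               ["pmA", "pmB", "pmC"])
print(placement)  # vm4 wraps around to the first host
```

The simplicity is the point: round-robin needs no runtime metrics at all, in contrast with the metric-driven approaches attributed to Amazon and Google above, but it balances load well only when workloads are homogeneous.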
I hope you find this post useful. If you find any errors or feel any need for improvement, let me know in your comments below.
Signing off for today. Stay tuned and I will see you in my next post! Happy learning.