Date of Award

Fall 2014

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computing Sciences - (Ph.D.)

Department

Computer Science

First Advisor

Andrew Sohn

Second Advisor

Alexandros V. Gerbessiotis

Third Advisor

Cristian Borcea

Fourth Advisor

Durgamadhab Misra

Fifth Advisor

James A. McHugh

Abstract

Cloud Computing is a ubiquitous technology that offers various services for individual users, small businesses, as well as large scale organizations. Data-center owners maintain clusters of thousands of machines and lease out resources like CPU, memory, network bandwidth, and storage to clients. For organizations, cloud computing provides the means to offload server infrastructure and obtain resources on demand, which reduces setup costs as well as maintenance overheads. For individuals, cloud computing offers platforms, resources and services that would otherwise be unavailable to them.

At the core of cloud computing are various virtualization technologies and the resulting Virtual Machines (VMs). Virtualization enables cloud providers to host multiple VMs on a single Physical Machine (PM). The hallmark of VMs is the inability of the end-user to distinguish them from actual PMs. VMs allow cloud owners such essential features as live migration, which is the process of moving a VM from one PM to another while the VM is running, for various reasons.

Features of the cloud such as fault tolerance, geographical server placement, energy management, resource management, big data processing, parallel computing, etc. depend heavily on virtualization technologies. Improvements and breakthroughs in these technologies directly lead to introduction of new possibilities in the cloud. This thesis identifies and proposes innovations for such underlying VM technologies and tests their performance on a cluster of 16 machines with real world benchmarks. Specifically the issues of server load prediction, VM consolidation, live migration, and memory sharing are attempted.

First, a unique VM resource load prediction mechanism based on Chaos Theory is introduced that predicts server workloads with high accuracy. Based on these predictions, VMs are dynamically and autonomously relocated to different PMs in the cluster in an attempt to conserve energy. Experimental evaluations with a prototype on real world data- center load traces show that up to 80% of the unused PMs can be freed up and repurposed, with Service Level Objective (SLO) violations as little as 3%.

Second, issues in live migration of VMs are analyzed, based on which a new distributed approach is presented that allows network-efficient live migration of VMs. The approach amortizes the transfer of memory pages over the life of the VM, thus reducing network traffic during critical live migration. The prototype reduces network usage by up to 45% and lowers required time by up to 40% for live migration on various real-world loads.

Finally, a memory sharing and management approach called ACE-M is demonstrated that enables VMs to share and utilize all the memory available in the cluster remotely. Along with predictions on network and memory, this approach allows VMs to run applications with memory requirements much higher than physically available locally. It is experimentally shown that ACE-M reduces the memory performance degradation by about 75% and achieves a 40% lower network response time for memory intensive VMs.

A combination of these innovations to the virtualization technologies can minimize performance degradation of various VM attributes, which will ultimately lead to a better end-user experience.

Share

COinS