COMP5349 – Cloud Computing
Week 2: Virtualization Technology
Dr. Ying Zhou School of Computer Science
Outline
Virtualization Technology Overview Definition
Early Examples
Technology areas and subareas
Server Virtualization
Resource managements
Brief Intro to AWS EC2 and Related Concepts
COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-2
Last Week
Cloud computing allows users to rent various resources from providers on hourly(secondly), daily, monthly, yearly base
A few general models, e.g. XaaS, are used to describe the spectrum of cloud computing
IaaS and some PaaS use virtualization technology to present a virtual machine to the user
Virtualization technology is relatively standard and mature
A related technology, container, is gaining popularity in
cloud computing
COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-3
What is virtualization?
Narrow and very common perception of virtualization in cloud era
“Virtualization lets you run multiple virtual machines on a single physical machine, with each virtual machine sharing the resources of that one physical computer across multiple environments.”
Broadly, the term is used in many other context, some are closely related with each other
E.g. virtual memory, storage, network, virtual reality
COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-4
Early virtualization examples I
Virtual Memory
Provide more accessible memories to CPU than those physically
presented on the board
An important component of modern operating system Address translation
Data transfer
Data placement strategy for swap in/out
Figure 1.4 in Data Center Virtualization Fundamentals
COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-5
Early virtualization examples II
Mainframe Virtualization
IBM first released mainframe virtualization solution in 1972 The overall architecture looks similar to modern system VM Solve the problem of OS migration triggered by hardware release It was initialized by the time-sharing technologies in 1960s
Figure 1.5 in Data Center Virtualization Fundamentals
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-6
Early virtualization examples III
Hot Standby Router Protocol
Solve the problem of single point of failure One host can only define a default gateway
Use virtual IP and preconfigured priority to decide which physical router would act as the default gateway
Other router with the same virtual IP can take over if the default one fails
Figure 1.6 in Data Center Virtualization Fundamentals
COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-7
Common features of virtualization
Emulation
A preexisiting IT resources was emulated (memory, mainframe, IP
address, etc) Transparency
The consumers of the resources (CPU, mainframe users, network hosts) cannot distinguish between real and emulated resources
Benefits
Compared with directly using actual resources, virtualization brings various benefits (memory expansion, resource optimization, high availability, etc)
A broader definition of virtualization
“Virtualization is the transparent emulation of an IT resource producing to its consumers benefits that were unavailable in its physical form.”
COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-8
Virtualization Technology Areas
• • •
Data plane
Control plane Management plane
1962, University of Manchester
Figure 1.9 in Data Center Virtualization Fundamentals
1972, IBM
1957, Stanford
1999, VMware
• Hardware
• Operating System • Application
• Storage device • Host
• Interconnect
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-9
Subareas
Figure 1.10 in Data Center Virtualization Fundamentals
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-10
Outline
Virtualization Technology Overview
Server Virtualization Motivation
Hypervisor
Types and Roles
Example Hypervisors
Virtual Machine
VM vs Native OS
Managing Critical Instruction
Resource managements
Brief Intro to AWS EC2 and Related Concepts
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-11
Well-Known Utilization Problem: Server Sprawl
Server Sprawl
(large number of underutilized servers) a major problem in many IT departments
Main causes:
Requirement by vendors to run their applications in isolation
Operating system heterogeneity
Mail server may require Windows server;
a database may best run on Linux or Solaris. Mergers and other integration projects
“81 percent of CIOs were using virtualization technologies to drive consolidation, according to a recent survey by CIO” (2008 survey)
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-12
Increasing Utilization is Hard!
The problem only magnifies on a data center scale: Average CPU utilization of 5000+ servers at Google during a six- month period
Beyond Server Consolidation. Werner Vogels. ACM Queue, Jan/Feb 2008. COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-13
100% Utilization is not the Goal
Workload in the enterprise are heterogeneous
Demand is uncertain and often occurs in spikes
Operating System starts to behave unpredictable under high CPU and IO load
for pure CPU-bound environments, 70 percent seems to be achievable for highly tuned applications; for environments with mixed workloads, 40 percent is a major success, and 50 percent has become the Holy Grail.
Real world estimates of server utilization in datacenters range from 5% to 20%!
30% is seen as a very good value already… COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-14
Server Consolidation
Solution: Pool heterogeneous services together
To do so, run several virtual machines on the same shared
hardware, coordinated by a hypervisor
(also known as: virtual machine manager/monitor (VMM)) Hypervisor abstracts / hides physical computing platform
Allows to share commodity hardware without sacrificing security and performance
COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-15
Server Architectures Evolution
Mainframes (1940s ~1970s)
RISC Server (1970s ~ present)
Reduced Instruction Set Computing
E.g. Sun Microsystem’s Solaris workstation, IBM AIX workstation, etc
X86 Servers(1981~ present)
Computers that inherited the basic architectures from Intel 8086
processors and its descendants (80286, 80386, 80486, etc) This is the start of Personal Computer era
COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-16
Main components of x86 server
Figure 13.1 in Data Center Virtualization Fundamentals
COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-17
Hypervisor
It is a piece of software responsible for the creation of the VMs, resource sharing, device management, virtual storage management, and other traditional operating system tasks.
COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-18
Two types of Hypervisor
Physical Server and VMware Solutions in 2001
Figure 13.6 in Data Center Virtualization Fundamentals
Type 2 Type 1
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-19
Two Types of Hypervisor
Type I: Bare-metal (native) VM
Type II: Hosted VM
App.
Guest OS VMM
Host OS Computer Hardware
Application Guest OS
Application
Guest OS
Appli- Appli-
Hypervisor / VMM Computer Hardware
cation cations
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-20
The role of a Hypervisor
Provide an environment identical to the physical environment.
Provide that environment with minimal performance cost. Retain complete control of the system resources
Figure 2.6 in Virtualization Essentials
COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-21
Managing Resources
Each VM is presented with a fraction of the resources of the physical host
E.g. A host may have 256 GB of physical memory installed in its frame, but a guest VM may believe that it has 4 GB
Hypervisor abstracts the hardware resources and controls access to it to ensure each VM get its allocated resources
All access to hardware resources goes through hypervisor Access to memory, storage, network
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-22
Example Hypervisor: VMWare ESXi
VMkernel is a POSIX-like operating system which provides certain functionality similar to that found in other operating systems, such as process creation and control, signals, file system, and process threads.
It is designed specifically to support running multiple virtual machines and provides such core functionality as: Resource scheduling, I/O stacks, Device drivers
https://microage.com/wp-content/uploads/2016/02/ESXi_architecture.pdf
COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-23
Example Hypervisors: Xen
https://wiki.xen.org/wiki/Xen_Project_Software_Overview
COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-24
Xen Hypervisor: Component
The Xen Hypervisor
A thin software layer runs directly on the hardware and is Responsible for managing CPU, memory, and interrupts
The Control Domain (or Domain 0) is
A specialized Virtual Machine for handling I/O and for interacting
with the other Virtual Machines.
It also exposes a control interface to the outside world, through which the system is controlled.
Guest Domains/Virtual Machines The VM allocated to users
Toolstack and Console:
Admin interface for creating, destroying and configuring guest
domains
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-25
Example Hypervisor: Hyper-V
https://docs.microsoft.com/en-us/biztalk/technical-guides/appendix-b-hyper-v-architecture-and-feature-overview
COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-26
Hyper-V Components
Hypervisor
A layer of software that sits between the hardware
Its primary job is to provide isolated execution environments called partitions. The hypervisor controls and arbitrates access to the underlying hardware.
Partition
Basically the virtual machine running on hypervisor
Root Partition
Similar to Xen’s domain 0, which manages I/O and communicate
with other partitions
Child Partition All other guest VMs
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-27
Example Hypervisor: KVM
Hypervisor performs lots of traditional operating system tasks
Kernel-based Virtual Machine (KVM) development was started mid 2006 at Qumranet, which was acquired by Red Hat in 2008.
It turns Linux kernel into a hypervisor with the help of a few modules: KVM, QEMU and libvert
The development started soon after major chip manufacturers introduced hardware support for virtualization at processor level
Intel introduced VT-x in November 2005 AMD introduced AMD-V in May 2006
Both Google Cloud Platform and AWS use KMV based
virtualization
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-28
KVM architecture
https://developer.ibm.com/tutorials/l-hypervisor/
KVM turns a Linux kernel into a hypervisor
Several hardware-aware versions
When a new operating system is booted on KVM, it becomes a process of the host operating system and therefore scheduleable like any other process. But it is in executed in different mode to regular OS
QEMU (Quick Emulator)
A generic emulator that virtualizes I/O devices as well
Libvert provides a common management layer to manage virtual machines running on hypervisors
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-29
Virtual Machine
Virtual machine can be viewed as a software computer running an operating system and associated applications.
The software computer runs on top of a hypervisor on a physical server
A virtual machine is described by a set of files Emulated hardware definition file
Virtual disk data
VM BIOS
The design to describe a VM through a set of files makes it very easy to save, copy and clone VMs across physical servers.
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-30
Hardware Management in Native OS
Application’s request for hardware (disk, network, cpu, memory) need to go through OS for safety and security reason
Those measures are built into x86 architecture by having protection ring mode at processor level
Ring 3 Ring 2
Ring 1 Ring 0
User Applications
OS Kernel Computer Hardware
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-31
Challenges for Hypervisor Implementation
Each VM runs its OS (guest OS) which assumes full control of the all hardware resources
But that is not true, guest OS has only a portion of the hardware to use in virtualization
OS is designed to be the most privileged software system on a machine, now it is not
Do we need to modify OS implementation?
The extra layer may introduce performance overhead
Before: OS + App
Now: VMM + OS + APP
Relationship between guest OS and VMM should be carefully managed.
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-32
Type of Instructions
Privileged Instruction
One that traps when it is in user mode and does not trap when it is in
system mode E.g. set CPU timer
Sensitive Instruction
Instructions that interact with hardware E.g. memory address translation
Others
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-33
Efficient Full Virtualization
When all sensitive instructions are also privileged
E.g. they trap when are not executing in system mode
Full virtualization can be achieved by running guest OS in user mode and VMM in system mode
When guest OS attempts to run a sensitive instruction in user mode, it will be trapped to VMM, who is responsible for managing the hardware globally
When some sensitive instructions are not privileged
They don’t trap automatically
They are referred to as “critical instructions”
The existence of such critical instruction makes virtualizing x86 based system very hard in early days
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-34
Full virtualization with traditional ring
VMWare products
No Guest OS modification
It relies on techniques, such as binary translation to trap and virtualize the execution of certain instructions.
These instructions are discovered (statically or dynamically at runtime) and replaced with traps into the hypervisor(VMM) that are to be emulated in software.
A binary translation can incur a
large performance overhead
Mastering KVM Virtualization
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-35
Example: Xen hypervisor
The hypervisor runs in ring 0,
where the kernel normally runs.
The guest kernel is moved to ring 1.
User apps. runs in ring 3 – just like it does today – no user space changes necessary.
Ring 3 Ring 2
Ring 1 Ring 0
User Applications
Guest OS(modified) Hypervisor
Paravirtualization
Guest OS is modified to call through Xen (‘hypercalls’) for privileged instructions; Guest OS is aware that it is running on a VM to some extent
Xen validates first whether allowed then executes on behalf of the guest OS
hypercalls
Computer Hardware
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-36
Hardware-Assisted Full Virtualization
Intel and AMD independently created new processor extensions of the x86 architecture, called Intel VT- x and AMD-V respectively.
Hypervisor runs in the new mode (e.g. root mode in VT-x)
OS requests are trapped to hypervisor without binary translation or paravirtualization
OS modification is not required since certain instructions will be trapped automatically to hypervisor
KVM is a hardware assisted full virtualization
Root Mode
https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/ techpaper/VMware_paravirtualization.pdf
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-37
Outline
Virtualization Technology Overview
Server Virtualization
Resource managements CPU
Memory I/O
GPU
Brief Intro to AWS EC2 and Related Concepts
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-38
Managing CPUs for a VM
Physical CPU cores are time shared by virtual CPUs(vCPU) in server virtualization
The dynamic scheduling of vCPUs on physical CPU is managed by hypervisor following customized scheduler algorithm
E.g. credit scheduler of Xen
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-39
Multiprocessors, cores and hyperthreading
A host server may be configured with multiple processor chips, each with multiple core
The core may support hyperthreading to have two or more logic cores for one physical core
Figure 13.2 in Data Center Virtualization Fundamentals
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-40
VM with multiple vCPUs
vCPU count of VM
Means the maximum number of threads that the VM can run at any
given moment.
vCPU and physical CPU(pCPU)
A VM can run on any and all of the host CPUs over a period of time Some hypervisor may support pinning
For a virtual machine with multiple vCPUs, the hypervisor will need to schedule each vCPU on a different pCPU (logic core also count)
The maximum number of vCPU you can assign to a single VM is bound by the hardware capacity e.g. maximum number of cores
A VM with more vCPUs may have longer waiting time to get scheduled Most applications are not designed to run on multiple threads
Configure a VM with too many vCPUs may not have any benefit
VM with 2 vCPU is a very typical setting COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-41
Virtualizing Memory Management
Memory Management
OS is designed to manage the mapping of virtual memory and
physical memory through TLB and page tables
Guest OS usually gets a dedicated portion of the physical memory to ensure strong isolation
Guest OS does not know other part of the memory and should not interfere other VM’s memory
It should not even know which part of the physical memory it is using Only VMM has access to the physical memory and can manage the
mapping
There is a level of indirection here Two level mapping/translation
TLB (translation lookaside buffer) is used to speed up the translation Could be managed jointly by hardware and OS
COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-42
Memory Mapping in Guest OS
COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-43
Virtualizing I/O
I/O refers to all input/output devices Disk, network, printer, monitor, mouse
On the one hand there are many devices with different ways of controlling them
On the other hand, operation systems have efficient ways to deal with I/O by providing standard interface and by loading various drivers.
COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-44
Disk and Network
Disk is partitioned device
Each VM can get a partition of it exposed as virtual disk External disk drivers can be mounted easily
Network adapter is shared device
The physical adapter can be time shared by each VM A virtual network adapter is presented to the VM
The request are passed to and managed by VMM
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-45
I/O pathway
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
Figure 9.1 in Virtualization Essentials
02-46
I/O pathway in Xen
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
Figure 9.2 in Virtualization Essentials
02-47
Within host communication
Hypervisor can provide two types of virtual switches External virtual switch is connected to network card Internal virtual switch only connects VMs on the same host
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
Figure 10.2 in Virtualization Essentials
02-48
Configure Network Segmentation
Figure 10.3 in Virtualization Essentials
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-49
GPU vs. CPU
CPU
Central Processing Unit
Several cores
Low latency
GPU
Graphics Processing Unit
Many cores
High throughput
Good for serial Good for parallel processing processing
Can do a handful of Can do thousands of operations at once operations at once
https://blogs.nvidia.com/blog/2009/12/16/whats-the-difference-between-a-cpu-and-a-gpu/
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-50
GPU Virtualization
Dedicated GPU per VM using Hypervisor pass through
An entire physical GPU is directly assigned to a virtual machine. GPU is accessed by corresponding drivers running in the VM. It is not shared among VMs
Share physical GPU among multiple VMs
E.g. using special vGPU management software to slice the GPU
resources among VMs.
https://docs.nvidia.com/grid/5.0/grid-vgpu-user-guide/index.html#architecture-grid-vgpu
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-51
Outline
Virtualization Technology Overview Server Virtualization
Resource managements
Brief Intro to AWS EC2 and Related Concepts
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-52
AWS EC2
Just virtual machines (instances) running on Amazon’s data centers
Related concepts
Instance Type
Amazon Machine Image (AMI) EC2 Key Pair
Security Group
Elastic IP
Elastic Block Storage (EBS) EC2 CloudWatch
EC2 Elastic Load Balancer EC2 Auto Scale
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-53
vCPU*
Instance Types
Model
t3.micro 2
2
t3.medium 2
CPU Credits/hour Mem (GiB)
Storage
EBS-Only EBS-Only EBS-Only
Network Performance (Gbps)
t3.nano 2 6 0.5
EBS-Only Up to 5
t3.small
12
24
24
1
2
4
Up to 5 Up to 5 Up to 5
T3 instances accumulate CPU credits when a workload is operating below baseline threshold. Each earned CPU credit provides the T3 instance the opportunity to burst with the performance of a full CPU core for one minute when needed.
https://aws.amazon.com/ec2/instance-types/
t3.nano t3.small
vCPU
2 2
ECU
Variable Variable
Memory (GiB)
0.5 GiB 2 GiB
Instance Storage (GB)
EBS Only EBS Only
Linux/UNIX Usage
$0.0052 per Hour $0.0208 per Hour
t3.micro 2 Variable
1 GiB EBS Only
$0.0104 per Hour
t3.medium 2 Variable
4 GiB EBS Only
$0.0416 per Hour
https://aws.amazon.com/ec2/pricing/on-demand/
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-54
AWS dedicated GPU instance
COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-55
AWS shared GPU instance
COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-56
Amazon Machine Image
Very similar to the root drive of a computer
Contains OS and additional software customized for various purpose
Windows server
Windows server various MS products Linux server
LAMP Web Starter
Deep Learning AMI
Container AMI
Etc.
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-57
EC2 Key Pair
Key based authentication is a more secure option than standard password authentication
Involves a pair of public and private keys
The server stores the public keys of all legitimate users, who supply
respective private keys to login to their accounts
When an EC2 instance is created, it only allows key based authentication
User needs to specify an existing key pair or create a new key pair
AWS stores only the public key and associate it with the initial account of that instance
User will be prompted to download the private key after the pair is created
The initial account (ec2-user, ubuntu, … ) is given sudo privilege
Root level access
User can enable password based authentication on their own instance
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-58
Security Group
EC2 instances are protected by a firewall, which blocks all internal and external connectivity by default. The only port opened is 22
Security group is an easy way of configuring the firewall rules
Instances within a single security group can communicate freely with each other
Group Name Rules
web_access allow HTTP access (port 8080) from anywhere allow SSH access (port 22) from anywhere
db_access allow access to MySQL(port 3306) from anywhere allow SSH access (port 22) from anywhere
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-59
Elastic IP and Elastic Block Store
Each instance by default is assigned a public IP address. it is gone when the instance is terminated.
Elastic IP is a stable IP that can be assigned to any instances associated with an AWS account
EBS is an addressable disk volume that can be attached to any running instance and used it as local disk
Similar to NFS mount command
Many instance types only supports EBS storage
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-60
Persistent and Ephemeral Resources
Persistent resources Elastic IP Address
Elastic Block Storage Volumes Elastic Load Balancers Security Groups
AMI stored in Amazon S3 Key pairs (public key)
Ephemeral resources
EC2 instance (IP, memory, and its local storage!)
Instances may be stopped or terminated
A stopped instance has its image and EBS saved in S3 and can be restarted later; a new IP address will be assigned; storage cost is incurred
A terminated instance is gone forever COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-61
Next Week
The topic for next week is “Container Technology” Readings for next week:
Dirk Merkel, Docker: Lightweight Linux Containers for Consistent Development and Deployment, Linux Journal, May 19,
2014 https://www.linuxjournal.com/content/docker-lightweight-linux- containers-consistent-development-and-deployment
Container’s Anatomy, https://www.slideshare.net/jpetazzo/anatomy- of-a-container-namespaces-cgroups-some-filesystem-magic- linuxcon
COMP5349 “Cloud Computing” – 2020 (Y. Zhou) 02-62
References
Gustavo A., A. Santana: Data Center Virtualization Fundamentals: Understanding Techniques and Designs for Highly Efficient Data Centers with Cisco Nexus, UCS, MDS, and Beyond
Chapter 1, chapter 13
Virtualization Essentials, Second Edition
Matthew Portnoy
Jim Smith (Author), Ravi Nair , Virtual Machines: Versatile Platforms for
Systems and Processes, Morgan Kaufmann; 1 edition (June 17, 2005) Chapter 1
Chapter 8 System Virtual Machines
Xen and the Art of Virtualization. Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, and Andrew Warfield. In Proceedings of the Nineteenth ACM Symposium on Operating systems principles (SOSP ’03), 2003.
COMP5349 “Cloud Computing” – 2020 (Y. Zhou)
02-63