Hwanju Kim, Hyeontaek Lim, Jinkyu Jeong, Heeseung Jo, and Joonwon Lee, ¡°Task-aware Virtual Machine Scheduling for I/O Performance¡±, ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE), Washington, DC, USA, Mar. 2009.
1 of 74
More Related Content
Task-aware Virtual Machine Scheduling for I/O Performance
1. TASK-AWARE VIRTUAL MACHINE
SCHEDULING FOR I/O PERFORMANCE
Hwanju Kim, Hyeontaek Lim, Jinkyu Jeong, Heeseung Jo
(Korea Advanced Institute of Science and Technology)
Joonwon Lee (Sungkyunkwan Univ.)
VEE 2009 March 13
2. Virtual Machine
Consolidation
Centralized various computing environments
Virtual desktop infrastructure
VMware, Sun, HP, MS
Cloud computing
Amazon EC2
3. Virtual Machine
Consolidation
Centralized various computing environments
Virtual desktop infrastructure
VMware, Sun, HP, MS
Cloud computing
Amazon EC2
Unpredictable workloads
due to the diversity
5. Virtual Machine
Consolidation
Performance enhancement
Paravirtualization
Hardware-assisted techniques
Intel VT, AMD SVM
Optimization
High degree of consolidation
6. Virtual Machine
Consolidation
Performance enhancement
Paravirtualization
Hardware-assisted techniques
Intel VT, AMD SVM
Optimization Unpredictable workloads
Intelligent CPU
High degree of consolidation management can improve
the performance
7. Background
A semantic gap between the VMM and a guest OS
VMM¡¯s lack of knowledge of VM internal
No tracking characteristics of guest-level tasks
Internal workload-agnostic scheduling
Poor decision about ¡°when¡± to schedule a VM
Simple design of the VMM
OS awareness
Low overheads
E?cient resource management
Low TCB
8. Background
Task-unawareness leading to poor responsiveness
Head Run queue sorted based on CPU fairness Tail
I/O-
I/O-
bound
VM VM task
bound
task
(VCPU) (VCPU) CPU-
bound mixed
task task
VMM
9. Background
Task-unawareness leading to poor responsiveness
Head Run queue sorted based on CPU fairness Tail
I/O-
I/O-
bound
VM VM task
bound
task
(VCPU) (VCPU) CPU-
bound mixed
task task
VMM
I/O event
10. Background
Task-unawareness leading to poor responsiveness
That event
is mine!
Head Run queue sorted based on CPU fairness Tail
I/O-
I/O-
bound
VM VM task
bound
task
(VCPU) (VCPU) CPU-
bound mixed
task task
VMM
I/O event
11. Background
Task-unawareness leading to poor responsiveness
That event
is mine!
Head Run queue sorted based on CPU fairness Tail
I/O-
I/O-
I have no idea this incoming bound
VM VM event is yours task
bound
task
(VCPU) (VCPU) and CPU-
bound mixed
your VM has low priority now. task task
Sorry not to schedule you..
VMM
Responsiveness VS. Fairness
I/O event
12. Background
Task-unawareness leading to poor responsiveness
That event
is mine!
Head Run queue sorted based on CPU fairness Tail
I/O-
I/O-
I have no idea this incoming bound
VM VM event is yours task
bound
task
(VCPU) (VCPU) and CPU-
bound mixed
your VM has low priority now. task task
Sorry not to schedule you..
VMM
Responsiveness VS. Fairness
I/O event
13. Background
The worst case example for 6 domains consolidated
1 Workloads
Native Linux(I/O+CPU) 1 dom: Server & CPU-bound task
0.8 XenoLinux(I/O) 5doms: CPU-bound task
XenoLinux(I/O+CPU)
0.6
CDF
0.4
0.2
0
0 2 4 6 8 10
Response time (ms)
14. Background
The worst case example for 6 domains consolidated
1 Workloads
Native Linux(I/O+CPU) 1 dom: Server & CPU-bound task
0.8 XenoLinux(I/O) 5doms: CPU-bound task
XenoLinux(I/O+CPU)
0.6
CDF
0.4
0.2
0
0 2 4 6 8 10
Response time (ms)
15. Background
The worst case example for 6 domains consolidated
1 Workloads
Native Linux(I/O+CPU) 1 dom: Server & CPU-bound task
0.8 XenoLinux(I/O) 5doms: CPU-bound task
XenoLinux(I/O+CPU)
0.6 By boosting mechanism of
CDF
Xen Credit scheduler
0.4
0.2 Boosting mechanism realizes
I/O boundness with only VCPU-level
0
0 2 4 6 8 10
Response time (ms)
16. Main Goals
Improve responsiveness of an I/O-bound task
Priority boosting with task-level granularity
¡°Partial boosting¡±
CPU fairness guarantee
Transparency
Low management overheads
17. Issues
How to identify an I/O-bound task
How to know an incoming event is for the I/O-
bound task
18. Approach
Non-intrusive approach
No guest OS modi?cation
No explicit interface to inform I/O-bound task and event
data
Pros.
No additional engineering cost for di?erent OSes
Strong trustworthiness
Cons.
False decision
20. Tracking I/O-bound Tasks
Observable information at the VMM
Task switching
Monitoring address space changes (Antfarm USENIX¡¯06)
CPU time usage
Running time of a task
Example (x86) event
CR3 update CR3 update
) ()
c( ts
c Time
d ts rd
r
Task running time
21. Tracking I/O-bound Tasks
Inference based on common gray-box knowledge
Kernel policy to improve responsiveness of I/O-
bound tasks
An I/O-bound task is preemptively scheduled in
response to its incoming event
Characteristic of I/O-bound tasks
Short running time
Threshold to decide a short running time: IOthreashold
22. Tracking I/O-bound Tasks
Three disjoint observation
Observation
classes based on two gray-box
criteria Ambiguity
Positive evidence
Positive Negative
supports I/O-boundness evidence evidence
Negative evidence
supports non-I/O-boundness
Preemptively scheduling in response to an event
Ambiguity &
No evidence Short running time( < IOthreshold )
27. Tracking I/O-bound Tasks
Example
IOthreshold
Event pending
to VCPU1
T1 T2 T3
Time
Scheduling
VCPU1
Positive Negative Ambiguity
T1
28. Tracking I/O-bound Tasks
Example
IOthreshold
Event pending
to VCPU1
T1 T2 T3
Time
Scheduling
VCPU1
Positive Negative Ambiguity
T2 T1
29. Tracking I/O-bound Tasks
Example
IOthreshold
Event pending
to VCPU1
T1 T2 T3 T4
Time
Scheduling
VCPU1
Positive Negative Ambiguity
T2 T1
30. Tracking I/O-bound Tasks
Example
IOthreshold
Event pending
to VCPU1
T1 T2 T3 T4
Time
Scheduling
VCPU1
Positive Negative Ambiguity
T2 T3 T1
31. Tracking I/O-bound Tasks
Example
IOthreshold
Event pending
to VCPU1
T1 T2 T3 T4 T5
Time
Scheduling
VCPU1
Positive Negative Ambiguity
T2 T3 T1
32. Tracking I/O-bound Tasks
Example
IOthreshold
Event pending
to VCPU1
T1 T2 T3 T4 T5
Time
Scheduling
VCPU1
Positive Negative Ambiguity
T2 T3 T4 T1
33. Tracking I/O-bound Tasks
Example
IOthreshold
Event pending
to VCPU1
T1 T2 T3 T4 T5 T6
Time
Scheduling
VCPU1
Positive Negative Ambiguity
T2 T3 T4 T1
34. Tracking I/O-bound Tasks
Example
IOthreshold
Event pending
to VCPU1
T1 T2 T3 T4 T5 T6
Time
Scheduling
VCPU1
Positive Negative Ambiguity
T2 T3 T4 T1 T5
35. Tracking I/O-bound Tasks
Example
IOthreshold
Event pending
to VCPU1
T1 T2 T3 T4 T5 T6
Time
Scheduling Descheduling
VCPU1 VCPU1
Positive Negative Ambiguity
T2 T3 T4 T1 T5
36. Tracking I/O-bound Tasks
Example
IOthreshold
Event pending
to VCPU1
T1 T2 T3 T4 T5 T6
Time
Scheduling Descheduling
VCPU1 VCPU1
Positive Negative Ambiguity
T2 T3 T4 T6 T1 T5
37. Tracking I/O-bound Tasks
Weighted evidence accumulation
The degree of belief to reinforce the inference
Weight of positive evidence < Weight of negative evidence
More penalize for negative evidence
BelThreshold
The degree
of At this time, this task is believed as an I/O-bound task
belief
# of sequential observations
38. HOW TO KNOW AN INCOMING
EVENT IS FOR AN I/O-BOUND TASK
40. Correlation Mechanism
- Block I/O -
Request/response style
If T1 requests for reading B1 and T1 is I/O-bound
Completion event for B1 is for I/O-bound task
How to decide ¡°T1 read B1¡± at the VMM
41. Correlation Mechanism
- Block I/O -
Request/response style
If T1 requests for reading B1 and T1 is I/O-bound
Completion event for B1 is for I/O-bound task
How to decide ¡°T1 read B1¡± at the VMM
When the VMM observes a read event, it checks
whether the current task is I/O-bound
42. Correlation Mechanism
- Block I/O -
Request/response style
If T1 requests for reading B1 and T1 is I/O-bound
Completion event for B1 is for I/O-bound task
How to decide ¡°T1 read B1¡± at the VMM
When the VMM observes a read event, it checks
whether the current task is I/O-bound
But, how about ¡°delayed read event¡± ?
Guest OS dependent (e.g. block I/O scheduler)
43. Correlation Mechanism
- Block I/O -
Inspection window
inspection window
user T1 T2 T3 T4
kernel read
VMM actual
read request
If an I/O-bound task in inspection window
The actual read request is for I/O-bound task
False positive VS. False negative
44. Correlation Mechanism
- Network I/O -
Event identi?cation
Socket-like information is too heavy for the VMM
Destination port number for TCP/IP communication
Most speci?c to a recipient task
Asynchronous packet reception
No prior information about incoming packets
History-based prediction mechanism
45. Correlation Mechanism
- Network I/O -
History-based prediction mechanism
Inference
¡°If an incoming packet is for I/O-bound task, this
packet makes the I/O-bound task to be
preemptively scheduled¡±
Monitoring the ?rst woken task in response to an
incoming packet
46. Correlation Mechanism
- Network I/O -
History-based prediction mechanism (cont¡¯)
Portmap
An entry for each destination port number
Each entry is an N-bit saturating counter
Example (2-bit counter)
00 01
portmap non- weak
I/O-bound I/O-bound
If the ?rst woken task is I/O-bound
Otherwise
10 11
strong
I/O-bound
I/O-bound
47. Correlation Mechanism
- Network I/O -
History-based prediction mechanism (cont¡¯)
Portmap
An entry for each destination port number
Each entry is an N-bit saturating counter
Example (2-bit counter)
00 01
portmap non- weak
I/O-bound I/O-bound
If the ?rst woken task is I/O-bound
Otherwise
10 11
strong If portmap counter¡¯s MSB is set, this packet
I/O-bound
I/O-bound
is for I/O-bound
49. Partial Boosting
Priority boosting with task-level granularity
Priority boosting lasts during the run of an I/O-bound task
Why?
To prevent CPU-bound tasks in a boosted VCPU from
compromising CPU fairness
50. Partial Boosting
Procedure
Head Run queue sorted based on CPU fairness Tail
VM1 VM2 VM3
(VCPU) (VCPU) I/O- CPU- CPU-
bound bound bound
task task task
VMM
51. Partial Boosting
Procedure
Head Run queue sorted based on CPU fairness Tail
VM1 VM2 VM3
(VCPU) (VCPU) I/O- CPU- CPU-
bound bound bound
task task task
VMM
If this event is inferred for an I/O-bound task in VM3,
I/O event do partial boosting for VM3
52. Partial Boosting
Procedure
Head Run queue sorted based on CPU fairness Tail
VM3 VM1 VM2
I/O- CPU- CPU- (VCPU) (VCPU)
bound bound bound
task task task
VMM
If this event is inferred for an I/O-bound task in VM3,
I/O event do partial boosting for VM3
53. Partial Boosting
Procedure
Head Run queue sorted based on CPU fairness Tail
VM3 VM1 VM2
I/O- CPU- CPU- (VCPU) (VCPU)
bound bound bound
task task task
VMM
If this event is inferred for an I/O-bound task in VM3,
I/O event do partial boosting for VM3
54. Partial Boosting
Procedure
Head Run queue sorted based on CPU fairness Tail
VM3 VM1 VM2
I/O- CPU- CPU- (VCPU) (VCPU)
bound bound bound
task task task
VMM
If this event is inferred for an I/O-bound task in VM3,
I/O event do partial boosting for VM3
55. Partial Boosting
Procedure
Head Run queue sorted based on CPU fairness Tail
VM1 VM2 VM3
(VCPU) (VCPU) I/O- CPU- CPU-
bound bound bound
task task task
VMM
If this event is inferred for an I/O-bound task in VM3,
I/O event do partial boosting for VM3
57. Implementation
Based on Credit scheduler in Xen 3.2.1
Task information maintained by hash
Limited number of tasks maintained
Remove of a task with infrequent I/O
Correlation
Block I/O : using grant table in Xen
Network I/O : supported by network backend driver
No consideration of multiple VCPUs
58. Evaluation
Interactive workload Worst case scenario
1 dom: Server & CPU-bound task
Packet request-response 5doms: CPU-bound task
Think time: 100 ~ 1000 ms
IOthreshold = 0.5 ms
59. Evaluation
Interactive workload Worst case scenario
1 dom: Server & CPU-bound task
Packet request-response 5doms: CPU-bound task
Think time: 100 ~ 1000 ms
IOthreshold = 0.5 ms
60. Evaluation
Correlation evaluation
Partial boosting hit ratio (PBHR)
h
PBHR (%) = ¡Á 100
T he number of partial boostings
where
1 , if an I/O-bound task awakes during partial boosting.
h=
0 , otherwise.
de?ned as true positive ratio
False positive ratio = (100 - PBHR) %
63. Evaluation
Correlation evaluation: Network I/O
100 90 93 Workloads
1 dom: 8 tasks
75 64 1 task: I/O-bound task
PBHR (%)
7 tasks: I/O+CPU task
50 (CPU usage 1~300ms between IOs)
5doms: CPU-bound task
25
10
0 2-bit is reasonable
1 2 4 NC in terms of space overheads
Bit-width of portmap counter
64. Evaluation
Response time for text editing during CPU-intensive
workload
1 1
0.9 Normal 0.9 Normal
TAVS TAVS
0.8 0.8
0.7 0.7
0.6 0.6
CDF
CDF
0.5 0.5
0.4 0.4
0.3 0.3
0.2 Text editing 0.2 Text editing
0.1 (Xen compilation) 0.1 (Web browsing with Flash animations)
0 0
0 0.05 0.1 0.15 0.2 0.25 0.3 0 0.05 0.1 0.15 0.2 0.25 0.3
Response time (sec) Response time (sec)
65. Evaluation
Execution time of I/O-bound tasks with CPU-intensive
workloads
Normal TAVS
Normalized execution time
1.00
0.75
0.50
PBHR = 99%
0.25
0
grep find wget smbcp smbcp: copy ?les from
remote samba server
Background workload compilation compression
66. Evaluation
Overheads
Task tracking overheads: 0.06%
No overhead for inspecting incoming packets
Increased network throughput : decreased CPU throughput
= 48 : 1
Space overhead of N-bit portmap
N * 8KB for each VM
e.g. 2-bit portmaps for TCP and UDP: 32KB for each VM
67. Conclusions
Task-aware VM scheduling
Bridging the semantic gap in CPU management
Transparency by VMM-level inference
Gray-box technique
Low overheads
68. Future Work
Extension on multicore system
Simulation-based analysis for more intelligent
scheduling
Evaluation for more various workloads
71. Implementation
Block correlation
Shared
IDD Dom1 grant table Dom2
1
Active Dom1's Dom2's
VMM grant table recent task list recent task list
T1 T3 T2 T3 T1
1
Hardware
Machine memory