We propose an iterative (IR) and simulated-annealing (SA) based methodology for leakage power minimization by means of gate sizing and threshold voltage assignment.
Leakage Power Minimization using SA-Based Gate Sizing and Threshold Voltage Assignment
2. Outline
• Introduction
• Related Work
• Problem Formulation
• Proposed Methodology
• Experimental Results
• Conclusion and Future Work
3. Introduction
• Low Power and High Performance
  • Mobile devices
• Leakage Power Rise
  • ITRS Roadmap 2009 [33]
  • Technology scales down
4. Leakage Power Minimization Methods
• Gate Sizing
  • Gate size ∝ leakage power ∝ driving strength
• Threshold Voltage Assignment
  • Vth ∝ 1 / leakage power
  • Vth ∝ delay time
  • Low Vth on critical path
  • High Vth on non-critical path
6. Related Work
Continuous methods
• Linear Programming (LP)
• Geometric Programming (GP)
Discrete methods
• Sensitivity-based Approach
• Slack and Delay Budgeting
• Dynamic Programming (DP)
• Lagrangian Relaxation (LR)
• Linear Programming (LP)
• Simulated Annealing (SA)
7. Continuous Methods
• Linear Programming (LP)
  • Linear delay model
  • The selection of gates is defined as a linear function
• Geometric Programming (GP)
  • Polynomial delay model
8. Discrete Methods
• Sensitivity-based approach
  • Score and rank gates according to a defined sensitivity
  • Iteratively select the best gate for optimization until no improvement can be made
• Slack and delay budgeting
  • Allocate a slack budget to each gate
  • Use the slack budget to trade delay for power at each gate
• Dynamic Programming (DP)
  • Use decision stages and a cost-to-go function
9. Discrete Methods (cont.)
• Lagrangian Relaxation (LR)
  • Convert the constrained problem into an unconstrained one
  • Lagrange multiplier
• Linear Programming (LP)
  • The selection of gates is implemented by assigning a value to a binary variable: 1 if chosen and 0 otherwise
• Simulated Annealing (SA)
  • Probabilistic method for finding a good approximation to the global optimum
10. Related Work Comparison
Methodology            Pros                              Cons
Continuous sizing
  LP, GP               Fast                              Modeling error; mapping issue
Discrete sizing
  Sensitivity                                            Local optimal
  Slack & Delay, LP                                      Ignore delay interaction
  DP                                                     Solution space explosion
  LR                   Large scale                       Solution oscillation
  SA                   Global optimal approximation;
                       fast solution space exploration
12. Motivational Example
Solution     u1    u2    u3    Timing Violation   Total Leakage Power
Solution 1   s10   s06   s04   -2.32              26
Solution 2   s10   s06   f04   0                  86
Solution 3   s10   s06   m04   0                  38

[Figure: circuit with three oa cells (u1, u2, u3), nets n1 to n4, and a 50ps label]
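
The example can be read as a small selection problem: discard the candidates that violate timing, then keep the one with the lowest total leakage. A minimal Python sketch using the three rows of the table; the dictionary keys and the function name `best_feasible` are illustrative and do not come from the slides.

```python
# Candidate solutions copied from the motivational example table.
solutions = [
    {"name": "Solution 1", "cells": ("s10", "s06", "s04"), "violation": -2.32, "leakage": 26},
    {"name": "Solution 2", "cells": ("s10", "s06", "f04"), "violation": 0.0, "leakage": 86},
    {"name": "Solution 3", "cells": ("s10", "s06", "m04"), "violation": 0.0, "leakage": 38},
]


def best_feasible(candidates):
    """Return the lowest-leakage candidate without a timing violation, or None."""
    # A negative value in the "violation" column marks a timing failure.
    feasible = [c for c in candidates if c["violation"] >= 0]
    if not feasible:
        return None
    return min(feasible, key=lambda c: c["leakage"])


best = best_feasible(solutions)
print(best["name"], best["cells"], best["leakage"])
```

Solution 1 has the lowest leakage but fails timing, so the choice falls between Solutions 2 and 3, which differ only in the threshold voltage type of u3.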
13. Problem Formulation
• Inputs:
  • Standard Cell Library
  • Gate-level Netlist
  • Timing Constraints
  • Interconnect Parasitics
• Outputs:
  • The selection of each cell's size and threshold voltage
• Objective:
  • Satisfy all performance constraints
  • Minimize total leakage power
14. Performance constraints
• Slack violation:
  • Negative slack exists at a PO or DFF input.
• Slew (transition time) violation:
  • At a PO or cell input pin, the transition time exceeds the maximum allowed transition time.
• Max-load violation:
  • At a cell output pin, the total fan-out load exceeds the cell's maximum capacitance.
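
The three checks can be sketched as simple comparisons over timing data. This is only an illustration: the function name, the dictionary layout, and the sample pin names and numbers below are assumptions, not part of the slides or of any real timing engine.

```python
def check_constraints(endpoint_slack, pin_slew, max_slew, output_load, max_cap):
    """Return (slack, slew, load) violation lists for one timing snapshot.

    endpoint_slack: {endpoint: slack} at POs and DFF inputs
    pin_slew:       {pin: transition time} at POs and cell input pins
    max_slew:       maximum allowed transition time
    output_load:    {cell output pin: summed fan-out load}
    max_cap:        {cell output pin: max capacitance of the driving cell}
    """
    slack_viol = [p for p, s in endpoint_slack.items() if s < 0]
    slew_viol = [p for p, t in pin_slew.items() if t > max_slew]
    load_viol = [p for p, c in output_load.items() if c > max_cap[p]]
    return slack_viol, slew_viol, load_viol


# Hypothetical sample data, chosen only to trigger one violation of each kind.
slack_v, slew_v, load_v = check_constraints(
    endpoint_slack={"po1": 3.0, "dff1/d": -2.32},
    pin_slew={"u1/a": 40.0, "u2/a": 310.0},
    max_slew=300.0,
    output_load={"u1/o": 20.0, "u2/o": 70.0},
    max_cap={"u1/o": 64.0, "u2/o": 64.0},
)
print(slack_v, slew_v, load_v)
```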
15. Problem Assumptions
• Interconnect parasitics are modeled as lumped capacitance.
• Sequential sizing is not allowed.
  • Only one option is available for sequential cells.
• Ideal clock network
  • No clock buffers, zero skew, and zero lumped capacitance on the clock net.
17. Proposed Methodology
• Phase I: Iterative Algorithm for Initial Solution
  • Initial solution that satisfies the timing requirement
• Phase II: Simulated-Annealing-Based Algorithm
  • Leakage power minimization
18. Phase I: Pseudo Code
Iterative Algorithm: upsize cells for feasible solution
Inputs: netlist, cell library, timing constraints, and interconnect parasitics
Outputs: each cell's size and threshold voltage assignment
Step 1: Count how many times each cell is visited when tracing negative-slack paths
Step 2: Sort cells by counter
Step 3: Iterative upsizing in the above-defined order
19. Phase I: Pseudo Code (Step 1)
Step 1: Count how many times each cell is visited when tracing negative-slack paths
  Run timing engine to calculate each cell's slack;
  Initialize each cell's counter to zero;
  Initialize each cell to its smallest type-size;
  foreach (negative-slack path)
    foreach (cell in the selected path)
      if (selected cell has negative slack)
        Increase selected cell's counter;
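
Step 1 can be sketched in Python as a counting pass over the negative-slack paths. The inputs here (`paths`, `path_slack`, `cell_slack`) stand in for the output of a timing engine and are hypothetical; the slides do not specify a data layout.

```python
def count_negative_slack_visits(paths, path_slack, cell_slack):
    """Count, per cell, how many negative-slack paths pass through it.

    paths:      {path id: list of cell names along the path}
    path_slack: {path id: slack of the path}
    cell_slack: {cell name: slack of the cell}
    """
    counter = {cell: 0 for cell in cell_slack}  # every counter starts at zero
    for path_id, cells in paths.items():
        if path_slack[path_id] >= 0:
            continue  # only negative-slack paths are traced
        for cell in cells:
            if cell_slack[cell] < 0:
                counter[cell] += 1
    return counter


# Hypothetical three-cell example: u2 lies on both failing paths.
counts = count_negative_slack_visits(
    paths={"p1": ["u1", "u2"], "p2": ["u2", "u3"], "p3": ["u4"]},
    path_slack={"p1": -5.0, "p2": -1.0, "p3": 2.0},
    cell_slack={"u1": -5.0, "u2": -5.0, "u3": -1.0, "u4": 2.0},
)
print(counts)
```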
20. Phase I: Pseudo Code (Step 2 & 3)
Step 2: Sort cells by counter
  Sort cells by counter, from largest to smallest;
Step 3: Iterative upsizing in the above-defined order
  do
    foreach (cell in the above-defined order)
      if (selected cell has negative slack)
        while (selected cell has a larger type-size)
          if (new Pleakage < old Pleakage)
            Update type-size;
  until (no negative slack)
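
A simplified Python sketch of Steps 2 and 3 follows. It is not the authors' implementation: it upsizes one size step at a time, omits the leakage comparison from the pseudo code, and relies on a hypothetical `timer` callback that returns each cell's slack for a given size assignment.

```python
def iterative_upsize(counters, sizes, max_size, timer):
    """Upsize cells in descending counter order until no slack is negative.

    counters: {cell: visit count from Step 1}
    sizes:    {cell: current size index}, modified in place
    max_size: largest size index available (assumption: same for all cells)
    timer:    hypothetical callback, sizes -> {cell: slack}
    Returns True if a feasible assignment was reached, else False.
    """
    order = sorted(counters, key=counters.get, reverse=True)  # Step 2
    while True:                                               # Step 3
        slack = timer(sizes)
        if all(s >= 0 for s in slack.values()):
            return True
        changed = False
        for cell in order:
            if slack[cell] < 0 and sizes[cell] < max_size:
                sizes[cell] += 1
                changed = True
                slack = timer(sizes)  # re-time after every change
        if not changed:
            return False  # nothing left to upsize, still infeasible


# Toy timer (assumption): a cell meets timing once its size index reaches `need`.
need = {"u1": 2, "u2": 0, "u3": 1}
sizes = {"u1": 0, "u2": 0, "u3": 0}
feasible = iterative_upsize(
    counters={"u1": 3, "u2": 0, "u3": 1},
    sizes=sizes,
    max_size=3,
    timer=lambda s: {c: s[c] - need[c] for c in s},
)
print(feasible, sizes)
```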
21. Phase II: Simulated-Annealing-Based
1. Solution representation:
  • The set of size and type of each cell.
2. Solution perturbation:
  • Randomly pick a cell and change its size and threshold voltage assignment.
3. Cost function:
  • Total leakage power.
4. Annealing schedule: (next slide)
22. Phase II: SA - Temperature check
IF T > ε
  THEN NEXT_ITER
  ELSE FINISHED
[Flowchart: START, initialization, "T > ε" check, Find new solution, "accept?" check, Update current solution, "update T?" check, Update temperature (T), FINISHED]
23. Phase II: SA - New solution
1. Randomly pick a cell
2. Randomly pick a new type and size
3. Call the timer and recalculate the cost
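
The perturbation step can be sketched as follows, assuming a solution is a dictionary from cell name to a (Vt type, size index) pair. The 3 Vt types and 10 sizes match the library described in the deck; the names `perturb`, `VT_TYPES`, and `NUM_SIZES` are illustrative, and the timer call of step 3 is left to the caller.

```python
import random

VT_TYPES = ("s", "m", "f")  # three threshold-voltage types
NUM_SIZES = 10              # ten gate sizes, addressed here by index 0..9


def perturb(solution, rng):
    """Return (cell, new_solution) with one random cell given a different option."""
    cell = rng.choice(sorted(solution))  # 1. randomly pick a cell
    options = [
        (vt, size)
        for vt in VT_TYPES
        for size in range(NUM_SIZES)
        if (vt, size) != solution[cell]
    ]
    new_solution = dict(solution)        # leave the current solution untouched
    new_solution[cell] = rng.choice(options)  # 2. randomly pick a new type and size
    return cell, new_solution


rng = random.Random(0)
current = {"u1": ("s", 0), "u2": ("s", 0), "u3": ("s", 0)}
cell, candidate = perturb(current, rng)
print(cell, candidate[cell])
```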
24. Phase II: SA - Solution acceptance
IF Cnew < Clast
  IF Cnew < Cbest
    THEN state = UPD
    ELSE state = NEW
ELSE IF A.Prob. > Random
  THEN state = ACP
  ELSE state = REJ

$\text{Accept. Prob.} = \exp\left(-\dfrac{\Delta C}{K \cdot T}\right) \in [0, 1]$
$\Delta C = \dfrac{C_{new} - C_{old}}{C_{old}}$
$\text{Random} \in [0, 1]$
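
The acceptance rule can be written as a small Python function. This sketch assumes that C_old in the formula is the cost of the last accepted solution (Clast in the pseudo code) and that costs are positive; the function name `classify` and the argument names are illustrative.

```python
import math


def classify(c_new, c_last, c_best, temperature, k, rand):
    """Return the state UPD, NEW, ACP, or REJ for a candidate cost.

    rand is a random number drawn from [0, 1] by the caller.
    """
    if c_new < c_last:
        # Improvement over the last solution: always accepted.
        return "UPD" if c_new < c_best else "NEW"
    delta = (c_new - c_last) / c_last           # relative cost increase, >= 0
    prob = math.exp(-delta / (k * temperature))  # acceptance probability
    return "ACP" if prob > rand else "REJ"


print(classify(90, 100, 95, temperature=1.0, k=1.0, rand=0.5))
print(classify(110, 100, 95, temperature=0.01, k=1.0, rand=0.5))
```

At a high temperature a 10% cost increase is accepted with probability close to 1, while at a low temperature the same move is almost always rejected.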
25. Phase II: SA - Solution update
IF state = UPD or NEW or ACP
  THEN Slast = Snew
  ELSE Slast = Slast
26. Phase II: SA - Temperature update
IF γ > φ
  THEN DROP_TEMP
  ELSE NEXT_ITER
γ is the counter of successive "Reject" states
φ is a constant
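
Putting the four flowchart steps together, the loop might look like the Python sketch below. Several details are assumptions not stated in the slides: the geometric cooling factor `alpha`, resetting γ after a temperature drop, the `max_iters` safety cap, and the toy cost and perturbation used in the demonstration. Costs are assumed positive.

```python
import math
import random


def anneal(initial, cost, perturb, t0=1.0, eps=1e-3, alpha=0.9, k=1.0,
           phi=5, max_iters=100_000, seed=0):
    """Simulated annealing that drops T after more than phi successive rejects."""
    rng = random.Random(seed)
    last = best = initial
    c_last = c_best = cost(initial)
    t, gamma, iters = t0, 0, 0
    while t > eps and iters < max_iters:      # temperature check
        iters += 1
        new = perturb(last, rng)              # find new solution
        c_new = cost(new)
        if c_new < c_last:
            accepted = True
        else:
            delta = (c_new - c_last) / c_last
            accepted = math.exp(-delta / (k * t)) > rng.random()
        if accepted:                          # update current solution
            last, c_last, gamma = new, c_new, 0
            if c_new < c_best:
                best, c_best = new, c_new
        else:
            gamma += 1                        # one more successive reject
        if gamma > phi:                       # temperature update
            t *= alpha                        # DROP_TEMP (assumed geometric)
            gamma = 0
    return best, c_best


# Toy problem (assumption): minimize the sum of three size indices in 0..9.
def toy_cost(sol):
    return sum(sol) + 1  # +1 keeps the cost positive


def toy_perturb(sol, rng):
    i = rng.randrange(len(sol))
    new = list(sol)
    new[i] = min(9, max(0, new[i] + rng.choice((-1, 1))))
    return tuple(new)


best, c_best = anneal((5, 5, 5), toy_cost, toy_perturb, t0=0.05, alpha=0.5)
print(best, c_best)
```

In the actual flow the cost would be total leakage power reported after a timer call, and the perturbation would change one cell's size and threshold voltage.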
28. Experimental Results
• Experimental Setting
  • Standard Library
  • Timing Engine
  • Acceptance Probability
  • Benchmark
• The Trend of Leakage Power Minimization
• Cost Comparison
29. Standard Library
• Cell library in Synopsys Liberty format
• Combinational cells:
  • 11 footprints:
    • in01, na02, na03, na04, no02, no03, no04, ao12, ao22, oa12, and oa22
  • Each cell has 30 options
    • 3 threshold voltage types and 10 gate sizes
• Sequential cells:
  • 1 footprint: ms08
30. Power, Capacitance, & Delay LUTs
Footprint: in01

            Leakage Power (uW)    Capacitance (fF)        Delay Time (ps)
Vt Type     s     m     f         s       m       f       s      m      f
Gate Size
1           1     4     16        12.8    14.4    16      11.7   10.7   9.1
3           3     12    48        38.4    43.2    48      8.2    7.2    6.5
4           4     16    64        51.2    57.6    64      6.5    5.7    5.2
6           6     24    96        76.8    86.4    96      6.5    5.7    5.2
8           8     32    128       102.4   115.2   128     6.5    5.7    5.2
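
To show how such a table might be queried during optimization, here is a minimal Python sketch that stores the in01 rows shown on the slide and looks up one (size, Vt type) option. The dictionary layout and the name `lookup` are illustrative; only the five listed sizes are included.

```python
VT_INDEX = {"s": 0, "m": 1, "f": 2}

# in01 values copied from the slide: size -> (s, m, f) triples per quantity.
IN01 = {
    1: {"leakage_uW": (1, 4, 16), "cap_fF": (12.8, 14.4, 16), "delay_ps": (11.7, 10.7, 9.1)},
    3: {"leakage_uW": (3, 12, 48), "cap_fF": (38.4, 43.2, 48), "delay_ps": (8.2, 7.2, 6.5)},
    4: {"leakage_uW": (4, 16, 64), "cap_fF": (51.2, 57.6, 64), "delay_ps": (6.5, 5.7, 5.2)},
    6: {"leakage_uW": (6, 24, 96), "cap_fF": (76.8, 86.4, 96), "delay_ps": (6.5, 5.7, 5.2)},
    8: {"leakage_uW": (8, 32, 128), "cap_fF": (102.4, 115.2, 128), "delay_ps": (6.5, 5.7, 5.2)},
}


def lookup(size, vt):
    """Return (leakage in uW, capacitance in fF, delay in ps) for one option."""
    row = IN01[size]
    i = VT_INDEX[vt]
    return row["leakage_uW"][i], row["cap_fF"][i], row["delay_ps"][i]


print(lookup(3, "m"))
print(lookup(1, "f")[0] / lookup(1, "s")[0])  # leakage ratio between f and s
```

In this table the f option leaks 16 times more than the s option at every size, while its delay is only slightly shorter, which is why the faster Vt type is worth reserving for cells that need it.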
37. Cost Comparison
$\text{Runtime} = 15\,\text{h} \times \text{Roundup}\left(\dfrac{\#\text{gates}}{35\text{K}}\right)$
Total Leakage Power (μWatt)

Method         DMA        pci_bridge32   des_perf   vga_lcd
IR+SA          3.71E+05   3.51E+05       1.54E+06   4.00E+05
IR             1.54E+06   1.71E+06       4.15E+06   1.47E+06
NTUgs          2.05E+05   2.03E+05       6.74E+05   4.15E+05
UFRGS-BRAZIL   1.58E+05   1.15E+05       8.84E+05   3.78E+05
PowerValve     1.47E+05   1.16E+05       6.97E+05   3.91E+05
Goldilocks     2.15E+05   6.96E+05       9.47E+05   4.63E+05
eOPT           4.51E+05   2.26E+05       2.28E+06   6.44E+05
CUsizer        3.68E+05   2.88E+05       1.13E+06   7.53E+05

vga_lcd: about 73% leakage reduction from IR to IR+SA
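
The 73% figure can be checked from the vga_lcd values, taking the first two bars as IR+SA (4.00E+05) and IR (1.47E+06). A short Python sketch of the arithmetic; the function name is illustrative.

```python
def reduction_percent(initial, final):
    """Percentage reduction from the initial to the final leakage power."""
    return 100.0 * (initial - final) / initial


# vga_lcd: IR (initial solution) versus IR+SA (final solution), in μWatt.
vga_lcd = reduction_percent(initial=1.47e6, final=4.00e5)
print(round(vga_lcd, 1))
```

The result is roughly 72.8%, which rounds to the 73% annotated on the chart.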
38. Cost Comparison (cont.)
Total Leakage Power (μWatt)

Method         b19        netcard    leon3mp
IR+SA          7.32E+05   3.90E+06   2.28E+06
IR             1.34E+06   4.78E+06   5.40E+06
NTUgs          6.27E+05   1.77E+06   1.42E+06
UFRGS-BRAZIL   6.14E+05   1.97E+06   1.79E+06
PowerValve     7.36E+05   1.94E+06   2.96E+06
Goldilocks     7.58E+05   1.81E+06   1.47E+06
eOPT           8.62E+05   2.10E+06   1.88E+06
CUsizer        5.02E+06   2.00E+06   1.92E+06
40. Conclusion
• The iterative algorithm is necessary for initialization. Without it, the SA approach may not converge within the fixed runtime.
• Our approach reaches feasible solutions with leakage power of the same order of magnitude as related work on all benchmarks.
• In some cases, our approach yields a better solution than previous work and reduces leakage power by more than 70% from the initial solution in a short time.
41. Future Work
• A more realistic RC network model
• Leakage power minimization for sequential circuits
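#7: Research on gate sizing and threshold voltage assignment began in the 1990s and has gradually gained attention. The methods proposed for this goal can be divided mainly into the continuous methods on the left and the discrete methods on the right.
Continuous methods mainly comprise two approaches: linear programming and geometric programming.
Discrete methods comprise six approaches, including the sensitivity-based approach and slack and delay budgeting.
Below I briefly introduce each method and give a short summary at the end of this part.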
#8: Among the continuous methods, linear programming defines the power model and gate selection as linear functions,
while geometric programming goes a step further and defines the power model as a polynomial function.
Modeling Error: misleads optimization due to the inaccuracy of delay and power models.
Mapping Issue: makes no guarantee on mapping a continuous solution to a discrete one.
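#10: LR converts the constrained problem into an unconstrained problem and solves for the Lagrangian multiplier.
LP differs from the continuous method: the choice of size and threshold voltage is expressed as a binary variable to avoid rounding.
SA conditionally accepts worse solutions during the annealing process in order to approach the optimal solution, and it is suited to a discrete solution space (discrete, larger search space).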