This document discusses aging effects in processors and proposes an aging-aware compiler to help address non-uniform aging in GPU architectures. It first provides background on aging effects like NBTI and how they can cause non-uniform degradation in VLIW slots. It then proposes a dynamic binary optimizer compiler that can idle fatigued processing elements, reassign instructions, and predict performance degradation to equalize lifetime without architectural changes. Evaluation shows this approach reduces threshold voltage increases by up to 49% with no throughput penalty.
Aging-Aware Compiler-Directed VLIW Assignment for GPGPU Architectures
2. DOES YOUR PROCESSOR AGE?
- DOES AGING AFFECT PERFORMANCE?
- ARE THERE ANY EFFECTS OF AGING?
- DOES HARDWARE AGE UNIFORMLY?
3. A FEW TERMS
- VLIW (VERY LONG INSTRUCTION WORD)
- GPGPU (GENERAL-PURPOSE GRAPHICS PROCESSING UNIT)
- OCAS (ON-CHIP AGING SENSOR)
- BTI (BIAS TEMPERATURE INSTABILITY)
4. NBTI (NEGATIVE BIAS TEMPERATURE INSTABILITY)
- OCCURS WHEN A PMOS IS NEGATIVELY BIASED
- MANIFESTS AS AN INCREASE OF THRESHOLD VOLTAGE
- ACCELERATED BY TEMPERATURE
- LOGARITHMIC FUNCTION OF STRESS TIME
- DISTRIBUTED NON-UNIFORMLY
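
The properties listed above (growth with stress time, acceleration by temperature) can be illustrated with a toy model. This is a minimal sketch, not the model used in the referenced paper: the function name `delta_vth`, the Arrhenius-style temperature factor, the duty-cycle scaling, and the constants `a`, `ea_ev`, and `tau` are all assumptions chosen only to show the qualitative trends.

```python
import math

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K


def delta_vth(stress_seconds, temp_kelvin, duty_cycle,
              a=0.05, ea_ev=0.1, tau=1.0):
    """Toy NBTI threshold-voltage shift in volts.

    All constants are illustrative assumptions, not fitted device data.
    duty_cycle is the fraction of time the PMOS is negatively biased.
    """
    if stress_seconds <= 0 or duty_cycle <= 0:
        return 0.0
    # Higher temperature gives a larger factor, so aging is accelerated.
    thermal = math.exp(-ea_ev / (BOLTZMANN_EV * temp_kelvin))
    # Only the time spent under stress contributes to the shift.
    effective_stress = duty_cycle * stress_seconds
    # Logarithmic dependence on stress time.
    return a * thermal * math.log1p(effective_stress / tau)


if __name__ == "__main__":
    one_year = 365 * 24 * 3600
    for duty in (0.1, 0.5, 0.9):
        shift = delta_vth(one_year, 350.0, duty)
        print(f"duty cycle {duty:.1f}: shift of {shift * 1000:.2f} mV")
```

Because the duty cycle differs from one unit to another, two otherwise identical devices end up with different shifts, which is one way to picture the non-uniform distribution noted on this slide.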
5. NBTI EXPLAINED
- DISSOCIATION OF SI-H BONDS ALONG SILICON OXIDE INTERFACE
- GENERATION OF INTERFACE TRAPS
- THRESHOLD VOLTAGE INCREASES AS MORE TRAPS FORM
- REDUCES THE DRIVE CURRENT
- AGING CAN BE RECOVERED PARTIALLY
- INTERFACE TRAPS CAN BE REDUCED BY ANNEALING
6. EFFECTS
- AGING OVER TIME
- VARIABILITY ACROSS MANUFACTURED PARTS
- IMPACTS LIFETIME UNCERTAINTY
- UNBALANCING
- SENSITIVE IN ANALOG BLOCKS E.G. DFT
7. EFFECTS: REAL OR NOT?
- ESTIMATED THRESHOLD VOLTAGE INCREASE OF 5-15% PER YEAR
- IMPACT ON CIRCUIT DELAY IS ABOUT 15 PERCENT ON A 65NM NODE
- WORSE IN SUB-65NM NODES
- DELAY DEGRADATION FOLLOWS THE SAME TREND
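
To make the link between threshold-voltage increase and circuit delay concrete, one common first-order approximation is the alpha-power law, in which gate delay scales as Vdd / (Vdd - Vth)^alpha. The slides do not say which delay model was used, so the sketch below is an assumption for illustration only; the function name, the supply and threshold voltages, and `alpha = 1.3` are hypothetical values.

```python
def relative_delay_increase(vdd, vth0, delta_vth, alpha=1.3):
    """Fractional gate-delay increase under the alpha-power law.

    delay is proportional to vdd / (vdd - vth) ** alpha, so the supply
    term cancels in the ratio of aged to fresh delay.
    All voltages are in volts; alpha is technology dependent (assumed here).
    """
    if vdd <= vth0 + delta_vth:
        raise ValueError("supply voltage must exceed the aged threshold voltage")
    fresh = vdd / (vdd - vth0) ** alpha
    aged = vdd / (vdd - vth0 - delta_vth) ** alpha
    return aged / fresh - 1.0


if __name__ == "__main__":
    # Hypothetical device: 1.0 V supply, 0.3 V fresh threshold voltage.
    for shift_percent in (5, 10, 15):
        shift = 0.3 * shift_percent / 100
        slowdown = relative_delay_increase(1.0, 0.3, shift)
        print(f"threshold +{shift_percent}%: delay +{slowdown * 100:.1f}%")
```

The closer the threshold voltage gets to the supply voltage, the steeper the delay penalty for the same shift, which is consistent with the slide's remark that the effect is worse at smaller nodes with lower supply voltages.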
8. OUR PROBLEM
- GPGPU WORKLOADS ARE HIGHLY CORRELATED
- FREQUENT AND NON-UNIFORM EXECUTION OF VLIW SLOTS
- CAUSES NON-UNIFORM AGING AND EXHAUSTS HEAVILY USED SLOTS
- SHORTENS THEIR LIFETIME
- NO PROCESSING ELEMENT (PE) IS IMMUNE TO UNBALANCED UTILIZATION
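
The imbalance described above can be quantified by counting how often each VLIW slot actually issues an instruction. The following is a minimal sketch under stated assumptions: a kernel is represented as a list of bundles, each bundle being the set of occupied slot indices. That representation and the helper name `slot_utilization` are hypothetical and do not correspond to any real GPU toolchain format.

```python
from collections import Counter


def slot_utilization(bundles, num_slots):
    """Fraction of bundles in which each VLIW slot issues an instruction.

    bundles: iterable of sets of occupied slot indices (assumed format).
    Returns a list with one utilization value per slot.
    """
    counts = Counter()
    total = 0
    for occupied in bundles:
        total += 1
        counts.update(occupied)  # each occupied slot is counted once per bundle
    if total == 0:
        return [0.0] * num_slots
    return [counts[slot] / total for slot in range(num_slots)]


if __name__ == "__main__":
    # Hypothetical 4-slot kernel: slot 0 is always busy, slot 3 is never used.
    kernel = [{0, 1}, {0}, {0, 2}, {0, 1}]
    print(slot_utilization(kernel, 4))  # [1.0, 0.5, 0.25, 0.0]
```

In this made-up kernel, slot 0 is stressed in every bundle while slot 3 is always idle, so slot 0 would age fastest if the assignment never changed.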
11. POSSIBLE SOLUTIONS
- SELECTIVE FREQUENCY AND SPEED SCALING
- WEAR-OUT AND DISCARD FAULTY CORES
- ONLINE ADAPTIVE VLIW REALLOCATION STRATEGY
- SELECTIVE SHUTDOWN: DISABLING SLOW CORES
- RUNNING EACH CORE AT ITS MAXIMUM FREQUENCY INDEPENDENTLY
12. AGING AWARE COMPILER
- DYNAMIC BINARY OPTIMIZER
- IDLING A FATIGUED PE
- REASSIGNING INSTRUCTIONS
- PREDICTING PERFORMANCE DEGRADATION AND AGING
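
The reassignment idea above can be sketched as a simple permutation: the busiest logical slot of a kernel is mapped onto the least-aged physical slot, so the most fatigued slot receives the lightest load or stays idle. This is only an illustration of the principle, not the algorithm from the referenced paper. It assumes all slots are functionally interchangeable (real VLIW slots may not be), and the function name `reassign_slots` and the input values are hypothetical.

```python
def reassign_slots(logical_utilization, physical_aging):
    """Map logical VLIW slots to physical slots to balance wear.

    logical_utilization: predicted utilization of each logical slot.
    physical_aging: measured or predicted aging of each physical slot,
        for example a threshold-voltage shift in volts (assumed input).
    Returns a dict {logical_slot: physical_slot}.
    """
    if len(logical_utilization) != len(physical_aging):
        raise ValueError("both inputs must describe the same number of slots")
    slots = range(len(logical_utilization))
    # Busiest logical slot first.
    by_load = sorted(slots, key=lambda s: logical_utilization[s], reverse=True)
    # Least-aged physical slot first.
    by_age = sorted(slots, key=lambda s: physical_aging[s])
    return dict(zip(by_load, by_age))


if __name__ == "__main__":
    utilization = [0.9, 0.5, 0.1, 0.0]     # hypothetical per-slot load
    aging = [0.040, 0.020, 0.010, 0.030]   # hypothetical shifts in volts
    print(reassign_slots(utilization, aging))  # {0: 2, 1: 1, 2: 3, 3: 0}
```

In this example the unused logical slot 3 lands on physical slot 0, the most aged one, which therefore idles and gets a chance to recover, while the heaviest load moves to physical slot 2, the freshest one.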
13. RESULTS AND CONCLUSION
- RUNS FULLY IN PARALLEL WITH THE GPGPU, ON A HOST CPU
- IMPOSES 0% THROUGHPUT PENALTY
- REDUCES ΔVTH BY UP TO 49% (11%) AND ON AVERAGE BY 34% (6%)
- TOTAL EXECUTION TIME OF THE ADAPTATION PROCESS IS 13 MS
- EQUALIZES EXPECTED LIFETIME OF PEs WITHOUT ARCHITECTURAL MODIFICATION
17. REFERENCES
- ON-CHIP AGING SENSOR TO MONITOR NBTI EFFECT IN NANO-SCALE SRAM, BY A. CERATTI, T. COPETTI, L. BOLZANI, F. VARGAS
- AGING-AWARE COMPILER-DIRECTED VLIW ASSIGNMENT FOR GPGPU ARCHITECTURES, BY ABBAS RAHIMI, LUCA BENINI (DEIS), RAJESH K. GUPTA (CSE)