Computer Power Management
Ø Jim Kardach, retired chief power architect, Intel
http://www.youtube.com/watch?v=cz6akewb0ps
HW1
Ø Has been posted on the online schedule.
Ø Due on March 3rd, 1pm.
Ø Submit in class.
Ø Hard deadline: no homework accepted after the deadline.
Ø No collaboration is allowed.
Chenyang Lu CSE 467S
The Power Problem
Ø Processors improve performance at the cost of power.
  Performance/watt remains low.
Ø Solution
  Hardware offers mechanisms for saving power.
  Software executes power management policies.
Power vs. Energy
Ø Power: energy consumed per unit time
  1 watt = 1 joule/second
Ø Power → heat
Ø Energy → battery life
Why worry about energy? Intel vs. Duracell
Figure: improvement over six years, relative to year 0 — processor speed (MIPS) grows roughly 16x, hard disk and memory capacity grow more slowly, and battery energy stored barely improves.
Ø No Moore's Law in batteries: 2-3%/year growth.
Trend in Power Density
Figure: power density (Watts/cm²) of Intel processors from the i386 through the Pentium 4 across process generations (1.5µ down to 0.07µ), climbing past hot-plate levels toward those of a nuclear reactor, a rocket nozzle, and the Sun's surface.
Source: "New Microarchitecture Challenges in the Coming Generations of CMOS Process Technologies," Fred Pollack, Intel Corp., Micro 1999.
Trend in Cooling Solution
Power
Ø Hardware support
Ø Power management policy
Ø Power manager
Ø Holistic approach
CMOS Power Consumption
Ø Voltage drops: power consumption ∝ V².
Ø Toggling: more activity → higher power.
Ø Leakage when inactive.
Power-Saving Features
Ø Voltage drops: reduce power supply voltage.
Ø Toggling: run at a lower clock frequency; reduce activity; disable function units when not in use.
Ø Leakage: disconnect parts from power supply when not in use.
Dynamic Voltage Scaling
Ø Why voltage scaling?
  Power ∝ V² → reducing the power supply voltage saves energy.
  Lower voltage → lower clock frequency.
  Tradeoff between performance and energy.
Ø Why dynamic?
  Peak computing demand is much higher than average.
Ø Changing voltage takes time: the power supply and clock must stabilize.
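The slide's tradeoff can be made concrete with a toy calculation of dynamic CMOS power, P = C·V²·f. The capacitance and operating points below are made-up illustrative numbers, not data for any real processor:

```python
# Sketch (hypothetical numbers): why voltage scaling saves energy even
# though the task takes longer to run.

def dynamic_power(c, v, f):
    """Approximate dynamic power (W): switched capacitance c (F),
    supply voltage v (V), clock frequency f (Hz)."""
    return c * v * v * f

def task_energy(cycles, c, v, f):
    """Energy (J) to execute `cycles` clock cycles at a fixed operating point."""
    runtime = cycles / f          # seconds
    return dynamic_power(c, v, f) * runtime

C = 1e-9          # hypothetical switched capacitance (F)
CYCLES = 600e6    # hypothetical task length in cycles

fast = task_energy(CYCLES, C, v=2.0, f=600e6)  # high voltage, high frequency
slow = task_energy(CYCLES, C, v=1.4, f=300e6)  # both scaled down

# Runtime doubles at half the frequency, but energy per cycle scales with
# V^2 (frequency cancels out of the energy), so the slow point wins.
```

Note that in this model the energy is C·V²·cycles: lowering frequency alone does not save energy per task, only lowering voltage does — which is exactly why lower frequency matters (it permits a lower voltage).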
Examples
Ø StrongARM SA-1100 takes two supplies
  VDD is the main 3.3V supply.
  VDDX is 1.5V.
Ø AMD K6-2+
  8 frequencies: 200-600 MHz.
  Voltage: 1.4, 2.0 V.
  Transition time: 0.4 ms for a voltage change.
Ø PowerPC 603
  Can shut down unused execution units.
  Cache organized into subarrays to reduce active circuitry.
Intel SpeedStep
Figure: P-states of the Intel Core 2 Duo E6600 and the Intel Pentium M.
Linux DVFS Governors
Ø Performance: always set the max frequency
Ø Powersave: always set the lowest frequency
Ø Ondemand: automatically adjust the frequency according to CPU usage
Ø Conservative: like ondemand, but in a more conservative way
Ø Userspace: set a fixed frequency chosen by the user
Ondemand
Ø Initial implementation in 2.6.9
Ø For all CPUs
  if (> 80% busy) then P0 (max frequency)
  if (< 20% busy) then down by 20%
Ø Multiple improvements since 2.6.9
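The 2.6.9-era rule above can be sketched as a pure function of the current utilization. This is a toy model — the real governor lives in the kernel and has gained many refinements since — and the frequency values are hypothetical:

```python
# Toy sketch of the original ondemand decision rule described on the slide.

def ondemand_step(cur_freq, busy_pct, freq_max, freq_min):
    """Return the next CPU frequency (kHz) given utilization in percent."""
    if busy_pct > 80:
        return freq_max                                # jump straight to P0
    if busy_pct < 20:
        return max(freq_min, int(cur_freq * 0.8))      # step down by 20%
    return cur_freq                                    # otherwise hold

# e.g. ondemand_step(2_400_000, busy_pct=10,
#                    freq_max=2_400_000, freq_min=1_600_000)
```

One design point worth noting: the rule is asymmetric — it jumps to the maximum frequency on high load (to avoid hurting performance) but steps down gradually on low load.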
Get & Set CPU Frequency
Ø Get the current frequency:
  /sys/devices/system/cpu/cpu[x]/cpufreq/scaling_cur_freq
  Example: 2400000 (2.4 GHz)
Ø Frequencies & governors available:
  /sys/devices/system/cpu/cpu[x]/cpufreq/scaling_available_frequencies
  Example: 2400000 2133000 1867000 1600000
  /sys/devices/system/cpu/cpu[x]/cpufreq/scaling_available_governors
  Example: ondemand userspace performance powersave conservative
Ø Set the frequency (requires root privilege):
  echo userspace > /sys/devices/system/cpu/cpu[x]/cpufreq/scaling_governor
  echo 2133000 > /sys/devices/system/cpu/cpu[x]/cpufreq/scaling_setspeed
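The echo commands above can be mirrored from a script. A minimal sketch, assuming the standard cpufreq sysfs layout; the helper names are my own, the directory is a parameter so the functions can be pointed at cpu0 (or a test directory), and writing requires root on a real machine:

```python
# Sketch: drive the userspace governor through the cpufreq sysfs files.
from pathlib import Path

def read_khz(cpufreq_dir, name):
    """Read a cpufreq value in kHz, e.g. name='scaling_cur_freq'."""
    return int((Path(cpufreq_dir) / name).read_text().split()[0])

def set_frequency(cpufreq_dir, khz):
    """Switch to the userspace governor and pin the given frequency (kHz)."""
    d = Path(cpufreq_dir)
    (d / "scaling_governor").write_text("userspace\n")
    (d / "scaling_setspeed").write_text(f"{khz}\n")

# On a real machine (as root):
#   set_frequency("/sys/devices/system/cpu/cpu0/cpufreq", 2133000)
#   read_khz("/sys/devices/system/cpu/cpu0/cpufreq", "scaling_cur_freq")
```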
Clock Gating
Ø Applicable to clocked digital components
  Processors, controllers, memories
Ø Stop clock → stop signal propagation in circuits
  Short transition time
  Clock generation is not stopped; only clock distribution is stopped
  Relatively high power consumption
    The clock itself still consumes energy
    Cannot prevent leakage
Supply Shutdown
Ø Disconnect parts from the power supply when not in use.
  General
  Saves the most power
  Long transition time
Example: SA-1100
Three power modes:
Ø Run: normal operation.
Ø Idle: stops the CPU clock, with I/O logic still powered.
Ø Sleep: shuts off most of the chip's activity.
SA-1100 SLEEP
Ø RUN → SLEEP (90 µs total)
  Flush CPU state (registers) to memory (30 µs)
  Reset processor state and set wakeup event (30 µs)
  Shut down clock (30 µs)
Ø SLEEP → RUN (160 ms total)
  Ramp up power supply (10 ms)
  Stabilize clock (150 ms)
  CPU boot (negligible)
Intel Core Duo Processor

State  Name                               Vcc (V)  Power (W)
C0     High Frequency Mode (HFM, P0)      1.3      31
C0     Low Frequency Mode (LFM, Pn)       1.0
C1     Auto Halt (HFM)                             15.8
C1E    Enhanced Halt (LFM)                         4.8
C2     Stop Grant / Stop Clock (HFM)               15.5
C2E    Enhanced Stop Clock (LFM)                   4.7
C3     Deep Sleep (HFM)                            10.5
C3E    Enhanced Deep Sleep (LFM)                   3.4
C4     Intel Deeper Sleep                 0.85     2.2
DC4    Intel Enhanced Deeper Sleep        0.80     1.8

Source: Intel Core Duo Processor 65nm Process Datasheet
The Mote Revolution: Low Power Wireless Sensor Network Devices, Joseph Polastre, Robert Szewczyk, Cory Sharp, David Culler, Hot Chips 16.
Power Consumption
Figure: power consumption of a computer with a wireless NIC.
Power
Ø Hardware support
Ø Power management policy
Ø Power manager
Ø Holistic approach
Approaches
Ø Static Power Management
  Does not depend on activity.
  Example: user-activated power-down.
Ø Dynamic Power Management
  Adapts to activity at run time.
  Example: automatically disabling function units.
Dynamic Power Management
Ø Inherent tradeoff: energy vs. performance
Ø Fundamental premises
  Non-uniform workload during operation
  Possible to predict workload with some degree of accuracy
PowerPC 603 Activity
Percentage of time idle for SPEC integer/floating-point:

Unit             Specint92  Specfp92
D cache          29%        28%
I cache          29%        17%
load/store       35%        17%
fixed-point      38%        76%
floating-point   99%        30%
system register  89%        97%
Problem Formulations
Ø Minimize energy under performance constraints
  Real-time applications
Ø Optimize performance under energy/power constraints
  Battery lifetime (energy)
  Temperature (power)
Power Down/Up Cost
Ø Going into/out of an inactive mode costs time and energy.
Ø Must determine whether going into an inactive mode is worthwhile.
Ø Model power states with a Power State Machine (PSM).
SA-1100 Power State Machine
Ø RUN: P_ON = 400 mW; IDLE: P = 50 mW; SLEEP: P = 0.16 mW
Ø RUN ↔ IDLE: 10 µs each way
Ø RUN/IDLE → SLEEP: 90 µs; SLEEP → RUN: 160 ms
Ø Transition power P_TR = P_ON
Greedy Policy
Ø Immediately go to sleep when the system becomes idle
Ø Works when the transition time is negligible
  Ex. between IDLE and RUN in SA-1100
Ø Doesn't work when the transition time is long!
  Ex. between SLEEP and RUN/IDLE in SA-1100
  Need better solutions!
Break-Even Time T_BE
Ø Minimum idle time required to compensate for the cost of entering an inactive state.
Ø Entering an inactive state is beneficial only if idle time > T_BE.
Break-Even Time: P_TR = P_ON
Ø P_TR: power consumption during the transition
Ø P_ON: power consumption when active
Ø If P_TR = P_ON, T_BE of an inactive state is the total time it takes to enter and leave the state:
  T_BE = T_TR = T_ON,OFF + T_OFF,ON
  T_BE = 160 ms + 90 µs for SLEEP in SA-1100
Break-Even Time: P_TR > P_ON
Ø T_BE must include additional inactive time to compensate for the extra power consumed during the transition:
  T_BE = T_TR + T_TR (P_TR − P_ON)/(P_ON − P_OFF)
Ø Reduce T_BE → save more energy
  Shorter T_TR
  Larger difference between P_ON and P_OFF
  Lower P_TR
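Both break-even cases fit in one small function — a sketch using the formula above, with the SA-1100 SLEEP numbers from the power state machine plugged in as a check:

```python
def break_even_time(t_tr, p_tr, p_on, p_off):
    """Minimum idle time (s) for which entering the inactive state pays off.
    t_tr: total transition time (enter + exit), powers in watts.
    When p_tr == p_on the extra term vanishes and T_BE = T_TR."""
    t_be = t_tr
    if p_tr > p_on:
        # Extra inactive time needed to repay the costlier transition.
        t_be += t_tr * (p_tr - p_on) / (p_on - p_off)
    return t_be

# SA-1100 SLEEP: T_TR = 160 ms + 90 us, P_TR = P_ON = 400 mW,
# P_OFF = 0.16 mW, so T_BE equals the transition time itself.
t_be_sleep = break_even_time(t_tr=160e-3 + 90e-6,
                             p_tr=0.4, p_on=0.4, p_off=0.16e-3)
```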
Inherent Exploitability
Ø Achievable energy saving depends on the workload!
  Distribution of idle periods
Ø Given an idle period T_idle > T_BE:
  E_S(T_idle) = (T_idle − T_TR)(P_ON − P_OFF) + T_TR (P_ON − P_TR)
Ø Assumptions
  No performance penalty.
  Ideal manager with knowledge of the workload in advance.
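A sketch of the E_S formula above, again evaluated with the SA-1100 SLEEP numbers (transition time and powers as on the earlier slides). A nice sanity check falls out: when P_TR = P_ON, the savings are exactly zero at T_idle = T_BE = T_TR, which is what "break-even" means:

```python
def energy_saved(t_idle, t_tr, p_on, p_off, p_tr):
    """Energy (J) saved by sleeping through an idle period t_idle > T_BE,
    versus staying active: savings while asleep, minus transition overhead."""
    return (t_idle - t_tr) * (p_on - p_off) + t_tr * (p_on - p_tr)

# SA-1100 SLEEP numbers: T_TR = 160 ms + 90 us, P_ON = 400 mW,
# P_OFF = 0.16 mW, P_TR = P_ON.
T_TR, P_ON, P_OFF, P_TR = 160e-3 + 90e-6, 0.4, 0.16e-3, 0.4

saving_1s = energy_saved(1.0, T_TR, P_ON, P_OFF, P_TR)   # 1 s idle period
saving_be = energy_saved(T_TR, T_TR, P_ON, P_OFF, P_TR)  # exactly break-even
```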
Inherent Exploitability based on real workload
Time-Power Product
Ø Workload-independent metric: C_S = T_BE × P_OFF
Ø An inactive state with lower C_S may save more energy
Ø Only a crude estimate
  May not be representative of real power savings
Predictive Techniques
Ø Event of interest: p = {T_idle > T_BE}
  Predict based on history
Ø Observed event: o
  Triggers a state transition
Ø Objective: predict p based on o
Metrics
Ø Safety: conditional probability Prob(p|o)
  If the observed event happens → the probability that T_idle > T_BE
  Ideally, safety = 1.
Ø Efficiency: Prob(o|p)
  If T_idle > T_BE → the probability of correctly predicting it.
Ø Overprediction → high performance penalty → poor safety
Ø Underprediction → wasted energy → poor efficiency
Fixed Timeout Policy
Ø Enter the inactive state when the system has been idle for T_TO
  o: T_idle > T_TO
Ø Wake up in response to activity
Ø Hypothesis: if the system has been idle for T_TO → it will continue to be idle for T_idle − T_TO > T_BE
T_TO???
Ø Increasing T_TO improves safety, but reduces efficiency.
Ø Highly workload dependent
Ø Karlin's result: with T_TO = T_BE → energy consumption is at most twice the energy consumed under an ideal policy
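Karlin's factor-of-two bound can be illustrated with a toy simulation. The energy model below is my own simplification (P_TR = P_ON, so T_BE = T_TR, as for SA-1100 SLEEP; if the idle period ends mid-transition we still pay for the full transition), and the idle-period lengths are made up:

```python
def timeout_energy(t_idle, t_to, t_tr, p_on, p_off):
    """Energy over one idle period under a fixed timeout policy:
    stay at p_on until the timeout fires, pay the transition, then sleep."""
    if t_idle <= t_to:
        return t_idle * p_on                       # never went to sleep
    slept = max(0.0, t_idle - t_to - t_tr)
    return (t_to + t_tr) * p_on + slept * p_off

def oracle_energy(t_idle, t_be, t_tr, p_on, p_off):
    """Ideal policy: knows t_idle in advance, sleeps immediately iff
    the idle period exceeds the break-even time."""
    if t_idle <= t_be:
        return t_idle * p_on
    return t_tr * p_on + (t_idle - t_tr) * p_off

T_TR = T_BE = 0.16          # seconds; P_TR = P_ON case
P_ON, P_OFF = 0.4, 0.16e-3  # watts

# With T_TO = T_BE, the timeout policy is within 2x of the oracle on
# every idle period (Karlin's competitive bound).
ratios_ok = all(
    timeout_energy(t, T_BE, T_TR, P_ON, P_OFF)
    <= 2 * oracle_energy(t, T_BE, T_TR, P_ON, P_OFF) + 1e-12
    for t in [0.01, 0.1, 0.17, 0.5, 2.0, 10.0]
)
```

The worst case is an idle period just longer than T_BE: the policy has already burned T_BE of active-power waiting, then pays the transition — roughly twice what the oracle spends.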
Impact of Timeout Threshold
Impact of Workloads
Critique: Fixed Timeout
Ø How to set the timeout threshold?
  Tradeoff between safety and efficiency
  Works best when workload traces are available
Ø Fundamental limitations
  Always wastes energy before reaching the timeout threshold
  Always incurs a performance penalty on wakeup
Possible Improvements
Ø Predictive shutdown
  Shut down immediately when an idle period starts.
  Avoids wasting energy before reaching the timeout threshold.
  More efficient, less safe.
Ø Predictive wakeup
  Wake up when the predicted idle time expires, even if no new activity has occurred.
  Avoids the performance penalty on wakeup.
  Less efficient, safer.
Predictive Shutdown: Threshold-based Policy
Ø Observation: a short active period tends to be followed by a long idle period.
Ø If the active period < threshold, the following idle period is predicted to be longer than T_BE.
Ø What is the right threshold?
  Workload dependent
  Requires offline analysis
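The threshold rule itself is one comparison. A minimal sketch, where the threshold value is a hypothetical stand-in for whatever offline trace analysis would produce:

```python
# Sketch of threshold-based predictive shutdown: a short active burst
# predicts a long idle period, so shut down immediately when it ends.

ACTIVE_THRESHOLD = 0.05  # seconds; in practice tuned from workload traces

def should_shutdown(last_active_period):
    """Predict T_idle > T_BE from the length of the preceding active burst."""
    return last_active_period < ACTIVE_THRESHOLD
```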
Threshold-based Predictive Shutdown
Predictive Wakeup: Regression-based Algorithm
Ø Predict the length of an idle period based on the preceding active period
  previous n pairs of idle/active periods
Ø More complicated than fixed timeout
  Needs to maintain history information
Ø Depends on offline analysis and traces to determine the regression function and parameters
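One concrete form such a predictor could take — a sketch, not the algorithm from any specific paper: fit idle length as a linear function of the preceding active-period length over the last n pairs. The least-squares fit and history length are illustrative choices:

```python
# Sketch: regression-based idle-length predictor over a sliding history.

class IdlePredictor:
    def __init__(self, n=8):
        self.n = n
        self.history = []          # (active_len, idle_len) pairs

    def observe(self, active_len, idle_len):
        """Record one completed active/idle pair, keeping the last n."""
        self.history.append((active_len, idle_len))
        self.history = self.history[-self.n:]

    def predict_idle(self, active_len):
        """Least-squares fit idle = a * active + b over the history."""
        if len(self.history) < 2:
            return 0.0             # no basis for a prediction yet
        xs = [a for a, _ in self.history]
        ys = [i for _, i in self.history]
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        sxx = sum((x - mx) ** 2 for x in xs)
        if sxx == 0:
            return my              # all active periods equal: predict mean
        a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
        return a * active_len + (my - a * mx)
```

A wakeup timer would then be armed for the predicted idle length, so the device is already awake when work arrives.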
Adapt to Workload Changes
Ø Grade n timeout thresholds based on history
  Use the best one for prediction
  Use a weighted average of the n thresholds
Ø Adjust the timeout
  Increase the timeout threshold if causing too many shutdowns
  Decrease the timeout threshold if causing too few shutdowns
Ø Stochastic techniques
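The "adjust the timeout" idea can be sketched as a multiplicative update: raise the threshold after a shutdown that failed to pay for itself, lower it after a missed sleep opportunity. The growth/shrink factors and clamps are made-up tuning knobs:

```python
# Sketch of an adaptive timeout: one update per completed idle period.

def adapt_timeout(t_to, was_shutdown, was_worthwhile,
                  grow=2.0, shrink=0.5, t_min=0.01, t_max=10.0):
    """Update the timeout threshold (s) after one idle period.
    was_shutdown:   did the policy actually shut down?
    was_worthwhile: was the idle period longer than the break-even time?"""
    if was_shutdown and not was_worthwhile:
        t_to *= grow      # too eager: shutdowns are not paying off
    elif not was_shutdown and was_worthwhile:
        t_to *= shrink    # too timid: sleep opportunities are being missed
    return min(t_max, max(t_min, t_to))
```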
Critiques: History-based Predictors
Ø Depend on short-term correlation between past & future
  Holds in many workloads
  Fails when the correlation is weak
Ø Workloads in many embedded systems are more predictable than on PCs
  Workload (e.g., periodic tasks) known a priori
  Specialized applications
ESSAT: Efficient Sleep Scheduling based on Application Timing
Ø Reduce radio power consumption by exploiting the timing properties of periodic queries in sensor networks
Ø Sleep scheduling incurs a low delay penalty
O. Chipara, C. Lu, and G.-C. Roman, Efficient Power Management based on Application Timing Semantics for Wireless Sensor Networks, ICDCS 2005.
Power
Ø Hardware support
Ø Power management policy
Ø Power manager
Ø Holistic approach
Power Manager
Ø Usually implemented in software (OS) for flexibility
Ø Hardware and software co-design
  Software implements policy
  Hardware implements power-saving mechanisms
Ø Need standard interfaces to deal with hardware diversity
  Different vendors
  Different devices: processor, sensor, controller
ACPI: Advanced Configuration and Power Interface
Open standard for power management services. http://www.acpi.info/
Layer stack (top to bottom): applications; device drivers; OS kernel (power management); ACPI; BIOS; hardware platform (devices, processor, chipset).
ACPI System Power States
Used as a contract between hardware and OS vendors
ACPI Global Power States
Ø G3: mechanical off
  No power consumption
Ø G2: soft off
  Restore requires a full OS reboot
Ø G1: sleeping state
  S1: low wake-up latency with no loss of context
  S2: low latency with loss of CPU/cache state
  S3: low latency with loss of all state except memory
  S4: lowest-power state with all devices off
Ø G0: working state
Intel Core i7 C States
Intel Pentium M P States
Device Power States
Ø Device power state is invisible to the user.
  Devices may be inactive when the system is in the working state.
Ø Each device may be controlled by a separate power management policy.
Power
Ø Hardware support
Ø Power management policy
Ø Power manager
Ø Holistic approach
Holistic View of Power Consumption
Ø Instruction execution (CPU)
Ø Cache (instruction, data)
Ø Main memory
Ø Other: non-volatile memory, display, network interface, I/O devices
Mote
Figure: system view when switching from sleep to active; wakeup transitions take on the order of 1-10 ms (typical).
Source: Joseph Polastre, Robert Szewczyk, Cory Sharp, David Culler. The Mote Revolution: Low Power Wireless Sensor Network Devices. In Hot Chips 16, 2004.
Sources of Energy Consumption
Relative energy per operation (Catthoor):
  memory transfer: 33
  external I/O: 10
  SRAM write: 9
  SRAM read: 4.4
  multiply: 3.6
  add: 1
Optimize Memory System
Ø Different instructions → different energy consumption
Ø Energy: register << cache (SRAM) << memory (DRAM)
Ø Optimizing the memory system → significant energy savings
Cache Behavior
Sweet spot in cache size:
Ø Too small: wastes energy on memory accesses;
Ø Too large: the cache itself burns too much power.
Impacts of Cache Size
Optimizations
Ø Reduce memory footprint
  Reduce code size
  Analyze/test footprint to find the right size: stack, heap
Ø Find the correct cache size
  Analyze cache behavior (size of working set)
Ø Minimize memory and cache accesses
  Use registers efficiently → fewer cache accesses
  Identify and eliminate cache conflicts → fewer memory accesses
Ø Better performance → more idle time!
Reading
Ø Textbook 3.7
Ø Required: Sections I, II, III.A, III.B, IV of L. Benini, A. Bogliolo and G. De Micheli, A Survey of Design Techniques for System-Level Dynamic Power Management, IEEE Transactions on VLSI, pp. 299-316, June 2000.
Ø Interesting: Intel Inside Your Smartphone
  http://spectrum.ieee.org/semiconductors/processors/intel-inside-your-smartphone