15. Optimizing data placement
- Latency of L1, L2, and external memory
ARM Connect Community Technical Symposium 2009
System Level Benchmarking Analysis of the Cortex-A9 MPCore
http://www.ruhr-uni-bochum.de/integriertesysteme/emuco/files/System_Level_Benchmarking_Analysis_of_the_Cortex_A9_MPCore.pdf
Yokohama PF-bu (横浜PF部), 33rd study session
2013/10/14
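16. Embedding frequency/voltage control APIs
- Insert power-control API calls into the gaps between parallelized tasks
  - Dynamic Voltage and Frequency Scaling (DVFS): run selected parts slowly to lower power consumption
  - Power gating: cut the power supply
  - Clock gating: stop the clock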
[Figure: two timelines scheduling tasks MT1 to MT9 on Core0, Core1, and Core2. In the power-controlled timeline some tasks are marked "(Low freq.)", idle gaps are marked "Power gating" or "Clock gating", and a "Time management margin" is left before the "Given deadline".]
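The idea on this slide can be sketched as a small scheduling calculation. The following is a minimal Python sketch, not the tool used in the talk: the function names `pick_frequency` and `idle_time`, the frequency list, and the cycle counts are all hypothetical. Given a task's cycle count and the time left before the deadline, it picks the lowest frequency that still finishes in time. Whatever slack remains is what clock gating or power gating would cover.

```python
def pick_frequency(cycles, slack_s, freqs_hz):
    """Return the lowest frequency (Hz) that runs `cycles` within `slack_s` seconds.

    Falls back to the highest frequency when none meets the deadline.
    """
    for f in sorted(freqs_hz):
        if cycles / f <= slack_s:
            return f
    return max(freqs_hz)


def idle_time(cycles, slack_s, f_hz):
    """Time left over after the task finishes; a candidate for clock/power gating."""
    return max(0.0, slack_s - cycles / f_hz)


# Hypothetical operating points and workload (not taken from the slides).
FREQS = [200e6, 400e6, 1200e6]
task_cycles = 30e6      # 30 million cycles
time_to_deadline = 0.1  # 100 ms

f = pick_frequency(task_cycles, time_to_deadline, FREQS)
idle_ms = idle_time(task_cycles, time_to_deadline, f) * 1e3
print(f"run at {f / 1e6:.0f} MHz, idle {idle_ms:.1f} ms")
```

With these made-up numbers, 200 MHz would miss the deadline, so the sketch selects 400 MHz and leaves about 25 ms of idle time for gating.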
26. Tegra 3 hotplug governor
[Figure: block diagram. Labels: Edp_Thermal, Suspend, CpuFreq, tegra_cpu_set_speed_cap, tegra_auto_hotplug_governor (auto hotplug), thermal_cooling_device, throttle_index, Throttle_table, update from user.]

Parameters by mode:

Parameters     LP-mode         GP-mode
up_delay       up2g0_delay     up2gn_delay
down_delay     down_delay      down_delay
top_freq       idle_top_freq   idle_bottom_freq
bottom_freq    0               idle_bottom_freq

State transitions:

Current state   Compare with requested freq   New state   Delay to take effect
IDLE            > top_freq                    UP          up_delay
IDLE            <= bottom_freq                DOWN        down_delay
DOWN            > top_freq                    UP          up_delay
DOWN            > bottom_freq                 IDLE        NA
UP              < bottom_freq                 DOWN        down_delay
UP              <= top_freq                   IDLE        NA

Excerpt from tegra_cpu_set_speed_cap (some lines omitted):

    int tegra_cpu_set_speed_cap(unsigned int *speed_cap)
    {
        /* Start from the highest speed requested by any CPU. */
        unsigned int new_speed = tegra_cpu_highest_speed();
        /* ... */
        /* Apply the thermal throttle, EDP, and user caps in turn. */
        new_speed = tegra_throttle_governor_speed(new_speed);
        new_speed = edp_governor_speed(new_speed);
        new_speed = user_cap_speed(new_speed);
        /* ... */
        ret = tegra_update_cpu_speed(new_speed);
        /* ... */
        /* Pass the final speed to the hotplug governor. */
        tegra_auto_hotplug_governor(new_speed, false);
    }
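The state transition table on this slide can be read as a small state machine. The sketch below is a Python model of that table only, not the kernel code: the function name `next_state` and the threshold values in the usage loop are hypothetical, and the comparisons follow the table as printed on the slide.

```python
IDLE, UP, DOWN = "IDLE", "UP", "DOWN"


def next_state(state, freq, top_freq, bottom_freq):
    """Return (new_state, delay_name) for a requested CPU frequency.

    delay_name is "up_delay" or "down_delay" when the transition is
    deferred, and None when it is immediate or nothing changes.
    """
    if state == IDLE:
        if freq > top_freq:
            return UP, "up_delay"
        if freq <= bottom_freq:
            return DOWN, "down_delay"
    elif state == DOWN:
        if freq > top_freq:
            return UP, "up_delay"
        if freq > bottom_freq:
            return IDLE, None
    elif state == UP:
        if freq < bottom_freq:
            return DOWN, "down_delay"
        if freq <= top_freq:
            return IDLE, None
    return state, None


# Hypothetical thresholds in kHz, chosen only to exercise each row.
state = IDLE
for freq in (1200000, 500000, 300000, 500000):
    state, delay = next_state(state, freq, top_freq=640000, bottom_freq=475000)
    print(freq, state, delay)
```

The loop walks IDLE to UP, back to IDLE, down to DOWN, and back to IDLE, which covers one row for each current state.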
32. Clock-up transition
<7>[ 942.369161] notification 0 of frequency transition to 1200000 kHz
<7>[ 942.369500] notification 0 of frequency transition to 1200000 kHz
<7>[ 942.369685] notification 0 of frequency transition to 1200000 kHz
<7>[ 942.370010] notification 0 of frequency transition to 1200000 kHz
<7>[ 942.370193] cpufreq-tegra: transition: 340000 --> 1200000
<7>[ 942.370555] regulator regulator.2: set_voltage: name=max77663_sd1, min_uV=1100000, max_uV=1350000
<7>[ 942.371086] regulator regulator.1: set_voltage: name=max77663_sd0, min_uV=900000, max_uV=1250000
<7>[ 942.371467] regulator regulator.2: set_voltage: name=max77663_sd1, min_uV=1200000, max_uV=1350000
<7>[ 942.371985] regulator regulator.1: set_voltage: name=max77663_sd0, min_uV=1000000, max_uV=1250000
<7>[ 942.372505] regulator regulator.1: set_voltage: name=max77663_sd0, min_uV=1025000, max_uV=1250000
<7>[ 942.373135] notification 1 of frequency transition to 1200000 kHz
<7>[ 942.373209] FREQ: 1200000 - CPU: 0
<7>[ 942.373345] notification 1 of frequency transition to 1200000 kHz
<7>[ 942.373483] FREQ: 1200000 - CPU: 1
<7>[ 942.373561] notification 1 of frequency transition to 1200000 kHz
<7>[ 942.373756] FREQ: 1200000 - CPU: 2
<7>[ 942.373832] notification 1 of frequency transition to 1200000 kHz
<7>[ 942.374027] FREQ: 1200000 - CPU: 3
Elapsed time from the first notification to the last per-CPU update: about 5 ms.
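The elapsed time of such a transition can be read off the kernel log timestamps. This is a minimal Python sketch, assuming the `<level>[ seconds] message` line format shown in the log on this slide. The helper name `elapsed_ms` is hypothetical.

```python
import re

# Matches the "<7>[ 942.369161] ..." prefix and captures the timestamp in seconds.
STAMP = re.compile(r"^<\d+>\[\s*(\d+\.\d+)\]")


def elapsed_ms(lines):
    """Milliseconds between the first and last timestamped log line."""
    stamps = [float(m.group(1)) for m in map(STAMP.match, lines) if m]
    if len(stamps) < 2:
        raise ValueError("need at least two timestamped lines")
    return (stamps[-1] - stamps[0]) * 1000.0


# Three lines copied from the clock-up log: first, middle, and last.
log = """\
<7>[ 942.369161] notification 0 of frequency transition to 1200000 kHz
<7>[ 942.370193] cpufreq-tegra: transition: 340000 --> 1200000
<7>[ 942.374027] FREQ: 1200000 - CPU: 3
""".splitlines()

print(f"{elapsed_ms(log):.1f} ms")
```

For these lines the difference is 4.866 ms, in line with the roughly 5 ms noted on the slide.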
33. Clock-down transition
<7>[ 1035.045405] notification 0 of frequency transition to 1000000 kHz
<7>[ 1035.045529] notification 0 of frequency transition to 1000000 kHz
<7>[ 1035.045591] notification 0 of frequency transition to 1000000 kHz
<7>[ 1035.045702] notification 0 of frequency transition to 1000000 kHz
<7>[ 1035.045763] cpufreq-tegra: transition: 1200000 --> 1000000
<7>[ 1035.046042] regulator regulator.1: set_voltage: name=max77663_sd0, min_uV=975000, max_uV=1250000
<7>[ 1035.046315] notification 1 of frequency transition to 1000000 kHz
<7>[ 1035.046387] FREQ: 1000000 - CPU: 0
<7>[ 1035.046462] notification 1 of frequency transition to 1000000 kHz
<7>[ 1035.046593] FREQ: 1000000 - CPU: 1
<7>[ 1035.046669] notification 1 of frequency transition to 1000000 kHz
<7>[ 1035.046857] FREQ: 1000000 - CPU: 2
<7>[ 1035.046929] notification 1 of frequency transition to 1000000 kHz
<7>[ 1035.047116] FREQ: 1000000 - CPU: 3
<7>[ 1035.047352] regulator regulator.2: set_voltage: name=max77663_sd1, min_uV=1100000, max_uV=1350000
Elapsed time for the whole sequence: about 2 ms.
34. Bonus: system call cost
- getpid(2), 1M times (clock 1.2 GHz)
    # time ./scalltest p M
    0m0.19s real
    0m0.02s user
    0m0.18s system
  190000000 ns / 1000000 = 190 ns per call
- One CPU, online and offline 1K times
    # time ./scalltest 0 h K
    0m23.43s real
    0m0.00s user
    0m1.49s system
  23430 ms / 1000 = 23 ms per cycle
- Three CPUs, online and offline 1K times
    # time ./scalltest 0 H K
    0m31.01s real
    0m0.00s user
    0m4.77s system
  31010 ms / 1000 = 31 ms per cycle
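The per-operation figures on this slide are total time divided by iteration count. The sketch below redoes that arithmetic in Python and adds a rough getpid timing loop. The `scalltest` program itself is not shown in the slides, so this is not its implementation, and the helper names `per_op_ns`, `cycles`, and `time_getpid` are hypothetical. The Python loop also measures interpreter overhead on top of the system call, so its result is not comparable with the slide's figure.

```python
import os
import time


def per_op_ns(total_s, count):
    """Average cost of one operation in nanoseconds."""
    return total_s * 1e9 / count


def cycles(ns, clock_hz):
    """Convert a duration in nanoseconds to CPU cycles at clock_hz."""
    return ns * 1e-9 * clock_hz


def time_getpid(count=1_000_000):
    """Wall-clock seconds for `count` os.getpid() calls (includes loop overhead)."""
    start = time.perf_counter()
    for _ in range(count):
        os.getpid()
    return time.perf_counter() - start


# Figures from the slide: 0.19 s real for 1M getpid(2) calls at 1.2 GHz.
ns = per_op_ns(0.19, 1_000_000)
print(f"{ns:.0f} ns per call, about {cycles(ns, 1.2e9):.0f} cycles")

# Local measurement; the result depends on the machine and the interpreter.
local = per_op_ns(time_getpid(100_000), 100_000)
print(f"local: {local:.0f} ns per call")
```

At 1.2 GHz, 190 ns corresponds to about 228 clock cycles per getpid(2) call.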
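57. Power rails and measurement points for Nexus 7
[Figure: power path from USB 5V and the battery through the battery manager (Bat-mgr, I2C slave), the DC-DC converter (TPS63020), and the PMIC (MAX77612, I2C slave) to the SoC (I2C master). Voltage labels: unregulated VBAT (typ. 3.7 V), 3.0 V, and 0.8 V to 3.3 V (variable).]
Measurement points:
A) VBAT output
B) DC-DC output
C) PMIC output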
76. Power measurement: MPEG2 decoder, 1 core
[Figure: two power traces.
(a) Without power reduction control: MPEG2 decode execution at high clock and voltage (1.7 GHz, 1.4 V), with busy-wait execution at 1.7 GHz, 1.4 V.
(b) With power reduction control: clock gating by WFI. Regions are labeled "Consumed", "Reduced", and "Reduced by WFI".]
77. Power measurement: MPEG2 decoder, 3 cores
[Figure: two power traces.
(a) Without power reduction control: MPEG2 decode execution at high clock and voltage (1.7 GHz, 1.4 V), with busy-wait execution.
(b) With power reduction control: MPEG2 decode execution at low clock and voltage (labels: 400 MHz, 1.05 V and 200 MHz, 0.92 V), with clock gating by WFI. Regions are labeled "Consumed", "Reduced", and "Reduced by WFI".]
DVFS: P = n * f * c * V^2
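The DVFS formula on this slide implies that power scales linearly with frequency and quadratically with voltage. The following is a minimal Python sketch comparing the operating points quoted on the slide. It assumes that n and c stay the same between operating points, so they cancel in the ratio. The helper name `relative_power` is hypothetical.

```python
def relative_power(f_hz, v, f_ref_hz, v_ref):
    """Ratio P / P_ref for P = n * f * c * V^2 with n and c held constant."""
    return (f_hz / f_ref_hz) * (v / v_ref) ** 2


REF = (1.7e9, 1.4)  # high clock and voltage from the slide
for f_hz, v in [(400e6, 1.05), (200e6, 0.92)]:
    ratio = relative_power(f_hz, v, *REF)
    print(f"{f_hz / 1e6:.0f} MHz, {v} V: {ratio:.1%} of the power at 1.7 GHz, 1.4 V")
```

Under that assumption the 400 MHz point draws roughly 13% and the 200 MHz point roughly 5% of the dynamic power at 1.7 GHz, 1.4 V. The task also runs longer at the lower clock, so the energy saving per task comes from the voltage term rather than the frequency term.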