8/12/2016 2:41 PM
BioMP: 襦企 弰 覃一企ゼ 語  襦
OpenMP襯 Bionic る 蠍一 蟯 郁規
Geunsik Lim
http://leemgs.fedorapeople.org
Sungkyunkwan University
Samsung Electronics Co., Ltd.
2/13
Outline
Introduction
Design and Implementation
Evaluation
Related Works
Conclusion
3/13
Introduction
1) CPU Memory螳  覯 覦伎れ 襴貅伎 ろ伎 煙
2) 螻焔 覦 レ  覯 覦伎れ 覃一伎 豈
3) 蠍一ヾ 企 螳覦  襴貅伎 覃一 蟆曙 襷襦 螳蠍  蠍一 譴
High-performance Applications
炎貊 殊 貎朱貊
4/13
Introduction
Evolution of Parallel Programming
[Figure: timeline from 1988 to 2013. Milestones shown:]
Chare Kernel, Threaded-C, JSR-166, STAPL, ECMA CLI, McRT
OpenMP 1.0 and 2.0 (for C/C++), OpenMP 2.5, OpenMP 3.0, OpenMP 3.1, OpenMP 4.1 RC1
Intel TBB 1.0, Intel TBB 2.0, Intel TBB 3.0, Intel TBB 4.0
Intel Ct + RapidMind, Intel ArBB beta (beta end)
Cilk (MIT), Cilk++, Intel Cilk Plus
CUDA 1.0, CUDA 4.1
OpenCL 1.0, OpenCL 1.2
MS PPL
Intel Parallel Building Blocks
* Source : 蠍襦覯 襦 蟷 TBB , WikiPedia
5/13
Architecture of BioMP
[Figure: layered architecture of BioMP. Labels, in extraction order:]
DVFS-aware OpenMP Toolchain for Android NDK (--enable-libgomp)
Bionic
Dalvik VM, JNI Interface
Multicore H/W
Multicore Aware Android Java Applications
Multicore Scheduler
Functional Library Layer (C/C++ Native Libraries), BioMP
Auto-Build-Manager, ARM, Customizer, Debugger, Component
Android NDK8, Automatic Parallelizer
sysfs, omp_get_thread_num()
6/13
Implementation of BioMP
Step 1 (BioMP Core):
  ARM EABI interface (ANDROID_LIB_SPEC, GNU_USER_TARGET_LIB_SPEC)
  Linux Android (bind with the Android C library)
  OpenMP for Bionic (pthread: -lc)
  Env. for OpenMP (setting with the Android page size)
Step 2 (BioMP Core -> BioMP Framework):
  DVFS-aware plug-in (omp_get_thread_num with sysfs)
Step 3 (BioMP Framework -> BioMP System):
  Android application (by linking biomp)
7/13
Source code for BioMP: http://biomp.googlecode.com
8/13
LOCAL_PATH := $(call my-dir)
include $(CLEAR_VARS)
# Here we give our module name and source file(s)
LOCAL_MODULE := stream
LOCAL_SRC_FILES := stream.c
LOCAL_LDLIBS := -lgomp
LOCAL_CFLAGS := -O3 -fopenmp
#include $(BUILD_SHARED_LIBRARY)
include $(BUILD_EXECUTABLE)
How to write a source code
A tests/device/test-openmp/BROKEN_BUILD 1 line
A tests/device/test-openmp/jni/Android.mk 9 lines
A tests/device/test-openmp/jni/Application.mk 1 line
A tests/device/test-openmp/jni/openmp.c 22 lines
9/13
$ cat ./openmptest.c
#include <omp.h>   /* for OpenMP */
#include <stdio.h>

int main(int argc, char *argv[])
{
    int id, nthreads;

    /* Each thread gets its own private copy of id. */
    #pragma omp parallel private(id)
    {
        id = omp_get_thread_num();
        printf("Hello World from thread %d\n", id);

        /* Wait until every thread has printed its greeting. */
        #pragma omp barrier
        if (id == 0) {
            /* Only thread 0 reports the size of the team. */
            nthreads = omp_get_num_threads();
            printf("There are %d threads\n", nthreads);
        }
    }
    return 0;
} /* end of main() */
How to write a source code
$ ./arm-linux-androideabi-gcc -fopenmp openmptest.c \
    -L/usr/local/ktoolchain-cortexa9-ver2.5-20120515-bionic/arm-linux-androideabi/lib \
    -lgomp -o openmptest [ENTER]
$
$ file ./openmptest
./openmptest: ELF 32-bit LSB executable, ARM, version 1 (SYSV),
dynamically linked (uses shared libs), not stripped
$
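GCC silently ignores OpenMP pragmas when OpenMP is not enabled at compile time, so a binary can build and run yet use only one thread. The sketch below is one way to check this from inside the program. It relies on the standard _OPENMP macro; the function names count_parallel_threads and report_openmp_status are hypothetical and are not part of BioMP.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>   /* only pulled in when the compiler enables OpenMP */
#endif

/* Returns the number of threads a parallel region actually used.
 * When OpenMP is disabled the region is skipped and the result is 1. */
int count_parallel_threads(void)
{
    int nthreads = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        /* One thread of the team records the team size. */
        #pragma omp single
        nthreads = omp_get_num_threads();
    }
#endif
    return nthreads;
}

/* Prints whether OpenMP was enabled when this file was compiled. */
void report_openmp_status(void)
{
#ifdef _OPENMP
    printf("OpenMP enabled (_OPENMP=%d), threads=%d\n",
           _OPENMP, count_parallel_threads());
#else
    printf("OpenMP disabled, pragmas ignored, threads=%d\n",
           count_parallel_threads());
#endif
}
```

Calling report_openmp_status() at the start of main() shows at a glance whether the toolchain flags took effect on the device.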
10/13
An example computing the Fibonacci numbers F(0) to F(40) - http://en.wikipedia.org/wiki/Fibonacci_number
How to write a source code for evaluation
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <omp.h>
#include <unistd.h>    /* for open/close.. */
#include <fcntl.h>     /* for O_RDONLY */
#include <sys/ioctl.h> /* for ioctl */
#include <sys/types.h> /* for lseek() */
#include <sys/time.h>  /* for gettimeofday() */

/* Serial recursive Fibonacci. */
int Fibonacci(int n)
{
    int x, y;
    if (n < 2)
        return n;
    else {
        x = Fibonacci(n - 1);
        y = Fibonacci(n - 2);
        return (x + y);
    }
}

/* Splits the top-level call into two OpenMP tasks; each task runs serially. */
int FibonacciTask(int n)
{
    int x, y;
    if (n < 2)
        return n;
    else {
        #pragma omp task shared(x)
        x = Fibonacci(n - 1);
        #pragma omp task shared(y)
        y = Fibonacci(n - 2);
        #pragma omp taskwait
        return (x + y);
    }
}

#define MAX 41

int main(int argc, char *argv[])
{
    int FibNumber[MAX] = {0};
    struct timeval time_start, time_end;
    int i = 0;

    // omp related print message
    printf("Number of CPUs=%d\n", omp_get_num_procs());
    printf("Number of max threads=%d\n", omp_get_max_threads());

    gettimeofday(&time_start, NULL);
    #pragma omp parallel
    {
        /* One thread creates the tasks; the whole team executes them. */
        #pragma omp single private(i)
        for (i = 1; i < MAX; i++) {
            FibNumber[i] = FibonacciTask(i);
        }
    }
    gettimeofday(&time_end, NULL);

    /* Convert the elapsed time to microseconds. */
    time_end.tv_usec = time_end.tv_usec - time_start.tv_usec;
    time_end.tv_sec = time_end.tv_sec - time_start.tv_sec;
    time_end.tv_usec += (time_end.tv_sec * 1000000);
    printf("Time of Fibonacci with OpenMP : %lf sec\n",
           time_end.tv_usec / 1000000.0);

    for (i = 0; i < MAX; i++)
        printf("%d ", FibNumber[i]);
    printf("\n--------------------------------------\n");
    return 0;
}
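FibonacciTask above creates only two tasks per number, and each task then runs the serial Fibonacci. A common variant, sketched here as an illustration and not as the author's method, creates tasks recursively and falls back to serial code below a cutoff so that task overhead stays bounded. The names fib_serial, fib_task, fib_parallel and the CUTOFF value of 20 are assumptions chosen for this sketch.

```c
#include <assert.h>

/* Serial Fibonacci, used below the cutoff. */
static int fib_serial(int n)
{
    return (n < 2) ? n : fib_serial(n - 1) + fib_serial(n - 2);
}

/* Hypothetical tuning value: subproblems smaller than this run serially. */
#define CUTOFF 20

/* Recursive task version: each call above the cutoff spawns two child tasks. */
int fib_task(int n)
{
    int x, y;
    assert(n >= 0);
    if (n < CUTOFF)
        return fib_serial(n);
    #pragma omp task shared(x)
    x = fib_task(n - 1);
    #pragma omp task shared(y)
    y = fib_task(n - 2);
    /* Wait for both children before combining their results. */
    #pragma omp taskwait
    return x + y;
}

/* Entry point: one thread creates the root task, the team executes the tasks. */
int fib_parallel(int n)
{
    int result = 0;
    #pragma omp parallel
    {
        #pragma omp single
        result = fib_task(n);
    }
    return result;
}
```

Without -fopenmp the pragmas are ignored and the functions still return the correct values serially, which makes the sketch easy to validate on a host machine before running it on a device.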
11/13
Evaluation
Fibonacci comparison between before (w/o biomp) and after (w/ biomp), 40 numbers of the Fibonacci sequence
Execution time (seconds):
          Singlecore  Dualcore  Quadcore
before    7.4         7.4       7.4
after     7.4         4.3       2.7
Dualcore: 41% reduced over Singlecore
Quadcore: 64% reduced over Singlecore
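The reduction figures follow directly from the measured times. As a small worked check, the helpers below compute the percentage reduction and the equivalent speedup; the function names percent_reduction and speedup are hypothetical and used only for this illustration.

```c
#include <assert.h>

/* Percentage by which `after` is shorter than `before`. */
double percent_reduction(double before, double after)
{
    assert(before > 0.0);
    return (before - after) / before * 100.0;
}

/* Speedup factor: how many times faster `after` is than `before`. */
double speedup(double before, double after)
{
    assert(after > 0.0);
    return before / after;
}
```

With the values from the table, percent_reduction(7.4, 4.3) is about 41.9 and percent_reduction(7.4, 2.7) is about 63.5, which correspond to speedups of roughly 1.72x on two cores and 2.74x on four cores.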
12/13
Related Works
Approach Merits Demerits
BioMP  蠍一ヾ 貊襯    .
 BioMP 蠍 伎 弰 覲
蟆曙 . (弰  
襦企 炎覦れ BioMP /JNI襯
 蟲蠍ろ伎 煙 襷ろ
襦 . )
 覦 襴貅伎 覯 BioMP襯
螳ロ.
 ろれ企.
 螻蠍 覲襯 伎 OpenMP  蟾
 讌 .
OpenMP  蠍一ヾ 貊襯    .
 ろれ企.
 襦企 弰 讌讌 .
 螻蠍 覲襯 伎 OpenMP  蟾
 讌 .
Bionic  pthread 襦蠏碁覦 牛 磯 
襴貅伎 Portability襯 殊 .
 ろれ企.
 pthread API襯 伎 覲 襦蠏碁覦
 る 曙 .
 蠍一ヾ 焔 貊襯 覃一伎 襷蟆 
る 襷 螳 .
13/13
Future Work
DVFS-Aware BioMP for Mobile Devices
[Figure: two panels, each showing the User-Space / Kernel-Space / Hardware layers on a Galaxy Nexus7. In both panels, threaded applications for task parallelism (threads marked T0) run on OpenMP on Bionic above the Linux kernel with CPU DVFS and CPU HotPlug. CPU0 and CPU1 are online; CPU2 and CPU3 are offline.]
Panel 1: OpenMP on Bionic reads /proc/cpuinfo. "OOPS!!! It doesn't recognize temporarily offline CPUs."
Panel 2: A BioMP Agent reads sysfs. "It recognizes temporarily offline CPUs."
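The slides do not show how the BioMP Agent queries sysfs, so the following is only a minimal sketch of the general idea under stated assumptions. It reads the standard Linux file /sys/devices/system/cpu/online, which holds a CPU list such as "0-1,3", and counts the CPUs that are currently online. The function names count_cpus_in_list and count_online_cpus are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Counts the CPUs named in a kernel CPU-list string such as "0-1,3".
 * Returns -1 if the string is malformed. */
int count_cpus_in_list(const char *list)
{
    int count = 0;
    const char *p = list;

    assert(list != NULL);
    while (*p != '\0' && *p != '\n') {
        char *end;
        long first = strtol(p, &end, 10);
        long last = first;

        if (end == p)
            return -1;              /* no digits where a CPU id was expected */
        p = end;
        if (*p == '-') {            /* range such as "0-3" */
            p++;
            last = strtol(p, &end, 10);
            if (end == p || last < first)
                return -1;
            p = end;
        }
        count += (int)(last - first + 1);
        if (*p == ',')
            p++;
    }
    return count;
}

/* Reads the online-CPU list from sysfs; returns -1 if it cannot be read. */
int count_online_cpus(void)
{
    char buf[256];
    FILE *fp = fopen("/sys/devices/system/cpu/online", "r");

    if (fp == NULL)
        return -1;
    if (fgets(buf, sizeof buf, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return count_cpus_in_list(buf);
}
```

An application could pass a positive result to omp_set_num_threads() before entering a parallel region, so that the team size tracks the CPUs that hotplug has left online at that moment.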
14/13
Conclusion
1. BioMP 伎牡 蠍一 蠍磯 覈覦 ろ企襦 襦企
弰 企  豺 .
2. 襦企 NDK Toolkit螻 BioMP襯 伎 覦 襴貅伎 覯
 覲 語企ゼ 煙ろ伎 襦/れ企 ろ 螳ロ
.
3. ろれ企襦, (biomp.googlecode.com) 豢螳 企企ゼ 襦蟆
   .
15/13
Thanks
Any questions?

kics2013-winter-biomp-slide-20130127-1340


Editor's Notes

  • #2: Migration of legacy software 2012 12 26 蠍一朱 AOSP Repository Merging 襭 Android 4.1 (JellyBean) れ 覯 BioMP (Bionic + OpenMP) 螳
  • #3: Thread scheduling framework
  • #5: http://gcc.gnu.org/projects/gomp/ https://wiki.linaro.org/Platform/Android/ImprovingSMP Android NDK OpenMP襯 蟆 豕蠏殊 蟲一. http://groups.google.com/group/android-ndk/browse_thread/thread/a547eac5446035b4 貊 伎碁, gcc 伎牡語 4.6朱 伎 企, 4.4.3 submit 讌 蟆 螳給. https://android-review.googlesource.com/#/c/34491/ Odroid OpneMP 朱慨豺 伎 焔 豸′ 一危郁 給. http://com.odroid.com/sigong/nf_board/nboard_view.php?brd_id=freeboard&bid=1059 豺碁襦企油OpenMP 讌油gcc 4.6.1 伎牡語 襴企Μ讀給. http://www.kandroid.org/board/board.php?board=toolchain&command=body&no=15
  • #6: /sys/devices/system/cpu/possible
  • #7: This patch enables OpenMP support in the Android NDK with the following changes:
      1. Specify that pthreads are supported in -lc instead of -lpthread (linux-android.h).
      2. Change the order of ANDROID_LIB_SPEC and LINUX_TARGET_LIB_SPEC so that the above change takes precedence (linux-eabi.h).
      3. Modify autoconf for libgomp to check whether the pthread libraries exist in libc (configure.ac).
      4. Add an include to env.c so that PAGE_SIZE is defined.
    To enable these changes, add "--enable-libgomp" to the configure command in build-gcc.sh.
  • #8: https://android-review.googlesource.com/#/c/48617/
    http://source.android.com/source/submit-patches.html
    https://android-review.googlesource.com/#/settings/new-agreement
    https://android-review.googlesource.com/#/admin/projects/
    repo init -u https://android.googlesource.com/toolchain/manifest
    https://android-review.googlesource.com/#/
    Patch list
    [invain@invain gcc]$ git commit -s
    [gcc-geunsik 92c478d] Support OpenMP for task parallelism on Android-ICS/GCC-4.6.3
     5 files changed, 63 insertions(+), 4 deletions(-)
    [invain@invain gcc]$ ../repo upload
    Upload project gcc/ to remote branch master:
      branch gcc-geunsik ( 1 commit, Tue Dec 25 14:28:35 2012 +0900):
             92c478db Support OpenMP for task parallelism on Android-ICS/GCC-4.6.3
    to https://android-review.googlesource.com/ (y/N)? y
    Counting objects: 23, done.
    Delta compression using up to 4 threads.
    Compressing objects: 100% (12/12), done.
    Writing objects: 100% (12/12), 2.00 KiB, done.
    Total 12 (delta 11), reused 0 (delta 0)
    remote: Receiving objects: 100% (12/12)
    remote: Resolving deltas: 100% (11/11)
    remote: Processing changes: new: 1, done
    remote:
    remote: New Changes:
    remote:   https://android-review.googlesource.com/48617
    remote:
    To https://android-review.googlesource.com/p/toolchain/gcc
     * [new branch]      gcc-geunsik -> refs/for/master
    ----------------------------------------------------------------------
    [FAILED] gcc/ gcc-geunsik (Upload failed)
    [invain@invain gcc]$
  • #11: http://odroid.foros-phpbb.com/t1008-openmp-test-with-odroid-a4 :
  • #12: 炎貊伎 殊 螳 觜蟲襯 企慨給. 朱慨豺 40螳襯 蟲 蠏語(Recursive) 一一. 殊願 炎貊伎 觜 1.6覦 觜襯 蟆郁骸襯 覲伎譯手 給. ろ碁 襦企 螳覦覲企 ODROID-A4 蟆. ろ碁ゼ 讌蠍一 貉る DVFS 旧 蟶殊 . 覦覯 襷 覓語 16~17讓曙 谿瑚語. ODROID-PC 覲蟆曙 螳ロ. http://com.odroid.com/sigong/nf_file_board/nfile_board_view.php?keyword=&tag=&bid=96 root@android:/system/bin # ./Fibonacci Time of Fibonacci with recursion : 7.862591 sec 0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711 28657 46368 75025 121393 196418 317811 514229 832040 1346269 2178309 3524578 5702887 9227465 14930352 24157817 39088169 63245986 102334155 -------------------------------------- root@android:/system/bin # ./Fibonacci_omp Number of CPUs=2 Number of max threads=2 Time of Fibonacci with OpenMP : 4.878260 sec 0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711 28657 46368 75025 121393 196418 317811 514229 832040 1346269 2178309 3524578 5702887 9227465 14930352 24157817 39088169 63245986 102334155 --------------------------------------
  • #14: 1) Encreasing cpu utilization ---> omp_get_num_procs() --> available CPUs ---> If we use CPU-Hotplug + CPU-DVFS 2) _SC_NPROCESSORS_ONLN (MacOS X 10.5, FreeBSD, AIX, OSF/1, Solaris, Cygwin, Haiku) --> /proc/stat (scanning : performance problem) 3) _SC_NPROCESSORS_CONF (MacOS X 10.5, FreeBSD, AIX, OSF/1, Solaris, Cygwin, Haiku) --> /proc/cpuinfo (scanning : performance problem) 4) sc_nprocessors_dvfs() ---> sysfs-devices-system-cpu (for performance and security) --> /sys/device/system/cpu/possible 蠏碁磯 覈覦 豌 CPU Hot-plug / on-demand 蟯 貉る 旧 貅覃 覓語螳 れ. 讌 貊企 覈 譯曙伎 omp_get_thread_num() 1 return. 覓朱 NDK 覃一企ゼ 譯朱 螻 Multi-threading 覃 る, 企 蟆曙 OpenMP襯 譯. 磯殊 襷貅 OpenMP襯 伎 螻焔 煙 襴 蟆 曙 蟆 螳れ. 貊企ゼ 蟾語 trigger 螳 API螳 ? 螳 Thread襯 500~700msec り 覃, Hot-plug螳 螳 覈 CPU螳 伎螻 企 OpenMP襯 豐蠍壱覃 螳ロ蠍 れ. 覓語 譟一覲/豺覲/貉る覯覲襦 Hot-plug 蠍磯 蟆豢 Load detect Threshold 螳 誤煙 ろ瑚 襷 覲伎. 蠏朱蓋 覦覯 朱 譬蟆給. 螳 Thread襯 500~700msec り 覃, Hot-plug螳 螳 覈 CPU螳 伎螻 企 OpenMP襯 豐蠍壱覃 螳ロ蠍 れ. 覓語 譟一覲/豺覲/貉る覯覲襦 Hot-plug 蠍磯 蟆豢 Load detect Threshold 螳 誤煙 ろ瑚 襷 覲伎. omp_get_thread_num() 覦 enbaled CPU螳襦 覲讌 襷螻, 蠏碁, 覓朱Μ cpu螳 螳 螻企, 蠏 螳襦 覦襦 omp_***()覲襯 企慨朱 . 伎 磯螳 貉る覯 谿朱 cpu襯 on覃伎 襦覦碁一 覩襦.... 蠏碁, 蠍一ヾ 伎牡語 openmp 覦 蠍一ヾ open_mp 朱 enabled cpu螳螳 , 覓朱Μ cpu螳襯 覦貅 伎朱 覯 覈覦 cpu 蟆 覦覯 覈譴. るジ 覦覯る 朱, 螳ロ kandroid toolchain (for bionic) 磯 蟆襷朱 願屋 螳ロ蟆 覿覿 伎 覿ク 覲願 . omp_get_thread_num() 襴願 覓伎螻, 貊 螳襯 2螳 4螳手 螳螻 る慨 れ. 蠏碁磯 貊願 2螳 襦語 4螳手 螳伎 襴覃 れ 焔レ 伎 覓語螳 給. 譬 蟲 ろ瑚 覲伎企れ. 讌讌 螳伎 襴 覲願給.