www.modeliosoft.com
Towards Modeling Approach
Enabling Efficient Platform for
Heterogeneous Big Data Analysis
Andrey.Sadovykh@softeam.fr
Outline
Introduction
Model-driven development
Big Data
Juniper
Case-study results
Conclusions
SOFTEAM - a French IT services / software vendor
• SOFTEAM, a growing company
• 25 years' experience
• 850 experts
• Regular growth
• Specialist in OO technologies, new architectures, methodologies
• Banking, Defense, Telecom, ...
[Chart: 17.5 M€ (2005), 20 M€ (2006), 23 M€ (2008), 70 M€ (2013)]
[Map: Paris, Rennes, Nantes, Sophia]
Modelio for Software and System Engineering
• UML editor with 20 years' history
  o CloudML
  o SysML
  o MARTE
  o Code generation
  o Documentation
  o Teamwork
• Available under open source at Modelio.org
MODEL-DRIVEN DEVELOPMENT
It is all about models... Starting with UML
Requirements: UML Use Cases
Architecture: UML Components and Classes
Design: Refined Classes or Domain Specific Language
Implementation: Code generation (Java, C++, Frameworks)
Model = Code
Typical example: Control system for a frigate
• 800+ components
• Developed by 100+ engineers
• 1M+ LOC
• MDD fosters productivity and quality with
  o Code generation
  o Component reuse
  o Tracing
  o Automation
Curious DSL example: Ruby on Rails

HAML
Haml                     HTML
%br{:clear => 'left'}    <br clear="left"/>
%p.foo Hello             <p class="foo">Hello</p>
%p#foo Hello             <p id="foo">Hello</p>
.foo                     <div class="foo">...</div>
#foo.bar                 <div id="foo" class="bar">...</div>

Cucumber and Capybara
Feature: User can manually add movie
  Scenario: Add a movie
    Given I am on the RottenPotatoes home page
    When I follow "Add new movie"
    Then I should be on the Create New Movie page
    When I fill in "Title" with "Men In Black"
    And I should see "Men In Black"
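The Haml-to-HTML mapping on this slide is a small model-to-text transformation, and it can be sketched in a few lines. The class below is only an illustration, not how Haml itself is implemented: the class and method names are hypothetical, it handles just the tag, id and class shorthand with optional text, it ignores attribute hashes such as {:clear => 'left'}, and it follows the slide's attribute order and quoting rather than real Haml output.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: translates the Haml shorthand shown on the slide into HTML.
class HamlShorthandSketch {
    // Optional %tag, then any number of .class / #id parts, then optional text.
    private static final Pattern LINE =
            Pattern.compile("^(?:%(\\w+))?((?:[.#][\\w-]+)*)(?:\\s+(.*))?$");
    private static final Pattern PART = Pattern.compile("([.#])([\\w-]+)");

    static String toHtml(String hamlLine) {
        Matcher m = LINE.matcher(hamlLine.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported line: " + hamlLine);
        }
        // Haml defaults to a div when no %tag is given.
        String tag = m.group(1) != null ? m.group(1) : "div";
        String id = null;
        StringBuilder classes = new StringBuilder();
        Matcher part = PART.matcher(m.group(2));
        while (part.find()) {
            if (part.group(1).equals("#")) {
                id = part.group(2);
            } else {
                if (classes.length() > 0) classes.append(' ');
                classes.append(part.group(2));
            }
        }
        String text = m.group(3) != null ? m.group(3) : "";
        StringBuilder html = new StringBuilder("<").append(tag);
        if (id != null) html.append(" id=\"").append(id).append('"');
        if (classes.length() > 0) html.append(" class=\"").append(classes).append('"');
        return html.append('>').append(text).append("</").append(tag).append('>').toString();
    }

    public static void main(String[] args) {
        System.out.println(toHtml("%p.foo Hello"));
        System.out.println(toHtml("#foo.bar"));
    }
}
```

Even this toy version shows the trade-off raised on the next slide: the transformation is written once, but every unsupported construct is a place where information can be lost.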
What do we get from MDD?
Pros
• Design once, deploy everywhere!
• Write your transformation once, transform anything!
Cons
• Transformations are hard to write...
• How to make sure they are CORRECT? i.e.
  - Is there any data/semantic loss?
BIG DATA
Volume, variety, velocity
1. E-mails sent every second: 2.9 million
2. Video uploaded to YouTube every minute: 25 hours
3. Data processed by Google every day: 24 petabytes
4. Tweets per day: 50 million
5. Products ordered on Amazon per second: 73 items
Only 0.5% of data is analyzed
• In 2012, 2,837 EB generated - just 0.5% actually analyzed.
  That still amounts to 14 EB (or 14.185 million terabytes)
Source: IDC & EMC
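The arithmetic behind this slide can be checked in a few lines. The sketch below assumes decimal (SI) units, where 1 EB is one million TB; the class and method names are illustrative only.

```java
import java.util.Locale;

// Checks the slide's figure: 0.5% of 2,837 EB, expressed in EB and in millions of TB.
class AnalyzedDataVolume {
    // Assumption: decimal units, 1 EB = 1,000,000 TB.
    static final double TERABYTES_PER_EXABYTE = 1_000_000.0;

    static double analyzedExabytes(double generatedExabytes, double analyzedPercent) {
        return generatedExabytes * analyzedPercent / 100.0;
    }

    static double exabytesToTerabytes(double exabytes) {
        return exabytes * TERABYTES_PER_EXABYTE;
    }

    public static void main(String[] args) {
        double analyzedEb = analyzedExabytes(2837.0, 0.5);
        double analyzedTb = exabytesToTerabytes(analyzedEb);
        System.out.println(String.format(Locale.ROOT,
                "%.3f EB = %.3f million TB", analyzedEb, analyzedTb / 1_000_000.0));
    }
}
```

The result is 14.185 EB, which the slide rounds down to 14 EB.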
SQL or Hadoop*
Do you have some data and a problem to solve?
  yes -> Data fits in memory?
    yes -> Tons of options. Don't need database or Hadoop
    no  -> Data fits on single RAID array?
      yes -> Solvable with SQL?
        yes -> Use MySQL
        no  -> Can you program?
          yes -> Write a prog.
          no  -> Dead end
      no  -> see continuation
* Inspired by: Aaron Cordova
SQL or Hadoop (continued)*
Data fits on single RAID array?
  no -> Have lots of money?
    yes -> Solvable with Oracle SQL?
      yes -> Buy a SAN, use Oracle
      no  -> Do you have a PhD in parallel prog.?
        yes -> Roll your own MPI solution
        no  -> Solvable using MapReduce?
    no  -> Solvable using MapReduce?
      yes -> Can you program MapReduce jobs?
        yes -> Write MapReduce on Hadoop
        no  -> Dead end
      no  -> Dead end
* Inspired by: Aaron Cordova
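One possible reading of the two flowchart slides can be written as a small decision function. This is a sketch under assumptions: the branch order is reconstructed from the slide labels, a single SOLVABLE_WITH_SQL fact stands in for both the MySQL and the Oracle question, and every name below is hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical encoding of the "SQL or Hadoop" flowchart as a decision function.
class SqlOrHadoopSketch {
    enum Fact {
        FITS_IN_MEMORY, FITS_ON_SINGLE_RAID, SOLVABLE_WITH_SQL, CAN_PROGRAM,
        LOTS_OF_MONEY, PHD_IN_PARALLEL_PROG, SOLVABLE_WITH_MAPREDUCE, CAN_PROGRAM_MAPREDUCE
    }

    static String recommend(Set<Fact> facts) {
        if (facts.contains(Fact.FITS_IN_MEMORY)) {
            return "Tons of options, no database or Hadoop needed";
        }
        if (facts.contains(Fact.FITS_ON_SINGLE_RAID)) {
            if (facts.contains(Fact.SOLVABLE_WITH_SQL)) return "Use MySQL";
            return facts.contains(Fact.CAN_PROGRAM) ? "Write a program" : "Dead end";
        }
        // Data no longer fits on one RAID array: money and a PhD open the Oracle and MPI paths.
        if (facts.contains(Fact.LOTS_OF_MONEY)) {
            if (facts.contains(Fact.SOLVABLE_WITH_SQL)) return "Buy a SAN, use Oracle";
            if (facts.contains(Fact.PHD_IN_PARALLEL_PROG)) return "Roll your own MPI solution";
        }
        // Everyone else ends up at the MapReduce questions.
        if (facts.contains(Fact.SOLVABLE_WITH_MAPREDUCE)
                && facts.contains(Fact.CAN_PROGRAM_MAPREDUCE)) {
            return "Write MapReduce on Hadoop";
        }
        return "Dead end";
    }

    public static void main(String[] args) {
        System.out.println(recommend(EnumSet.of(Fact.FITS_ON_SINGLE_RAID, Fact.SOLVABLE_WITH_SQL)));
        System.out.println(recommend(EnumSet.of(Fact.LOTS_OF_MONEY, Fact.PHD_IN_PARALLEL_PROG)));
        System.out.println(recommend(EnumSet.noneOf(Fact.class)));
    }
}
```

Written this way, the number of paths that end in "Dead end" is easy to see.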
Challenges
Hadoop MapReduce is the major trend
Success relies on the programming skills of personnel
Many problems are not solvable with Hadoop. Real-time?
MPI for high performance computing is an option when you have lots of money and a PhD
JUNIPER integrates Big Data technologies over MPI
[Diagram: DOCs, Streams and DBs feed Data Processing (Stage 1 ... Stage N), which feeds Business Intelligence, Analytical DBs and Visualization]
[Diagram: Data Processing in JUNIPER. Stages S1, S2, S3 communicate over MPI; platforms shown: FPGA-enabled nodes, Hadoop, HPC; output: Analytical DBs]
Modelling in Juniper
[Diagram: Models (High-level Architecture: Nodes, Programs, Streams, ...; Real-time constraints); Java Code; Code Generation (+ MPI initialization, communication, etc.); Reverse Engineering; Schedulability Analysis Tool (in progress); Scheduling Advisor (in progress); Measurements & Advice; Deployment Scripts (in progress); Configuration; Model Export; Code Generation]
Mapping Programming Model, UML and MARTE
[Diagram: JUNIPER Programming Model concepts (Program, Channel, Cloud Node) mapped to UML and MARTE]
Modelling the application and real-time constraints
[Diagram: JUNIPER Programs connected by a Big Data flow, annotated with real-time constraints (response time, bandwidth)]
Modelling the hardware infrastructure at a high level
[Diagram: Cloud Node with a 4-core CPU and a hard drive]
MPI code generation
[Diagram: JUNIPER Application Model -> Code Generation]
PETAFUEL CASE STUDY
Risk: $45 million in half a day
MasterCard debit card approval within 4 sec
petaFuel - transaction approval within 4 seconds
[Diagram: Transaction Events Stream and Historical Data (30 GB) -> Basic Checks -> Fraud Patterns Detection -> Transaction Approval Decision]
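The stages on this slide form a short pipeline with a time budget, which can be sketched as follows. Only the stage names and the 4-second budget come from the slide; the Transaction shape, the check rules and all identifiers are placeholders. The deadline is checked after each stage, so this is not a hard real-time guarantee.

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of the approval pipeline: staged checks under a time budget.
class ApprovalPipelineSketch {
    static final Duration BUDGET = Duration.ofSeconds(4); // budget stated on the slide

    // Placeholder transaction shape; the real event format is not shown in the deck.
    static final class Transaction {
        final String cardId;
        final long amountCents;

        Transaction(String cardId, long amountCents) {
            this.cardId = cardId;
            this.amountCents = amountCents;
        }
    }

    // Placeholder rules standing in for "Basic Checks" and "Fraud Patterns Detection".
    static final Predicate<Transaction> BASIC_CHECKS =
            tx -> tx.cardId != null && tx.amountCents > 0;
    static final Predicate<Transaction> FRAUD_PATTERNS =
            tx -> tx.amountCents < 500_000L;
    static final List<Predicate<Transaction>> STAGES =
            Arrays.asList(BASIC_CHECKS, FRAUD_PATTERNS);

    // Approves only if every stage passes and the budget has not been exceeded.
    static boolean approve(Transaction tx, List<Predicate<Transaction>> stages) {
        long deadline = System.nanoTime() + BUDGET.toNanos();
        for (Predicate<Transaction> stage : stages) {
            if (!stage.test(tx)) return false;
            if (System.nanoTime() - deadline > 0) return false; // answer would arrive too late
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(approve(new Transaction("card-1", 1_000L), STAGES));
        System.out.println(approve(new Transaction("card-1", 1_000_000L), STAGES));
    }
}
```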
Juniper application model
Deployment model
MPI code generation
public class EventProcessor {
    public static final int RANK = 1;

    // Handler invoked for each incoming event: counts events per timestamp key.
    public static IEventProcessor iEventProcessorImpl = new IEventProcessor() {
        @Override
        public void process(Event event) {
            String key = getKeyFromTimestamp(event.getTimestamp());
            String value = keyValueStoreIKeyValueStore.find(key);
            if (value == null) {
                // First event seen for this key
                keyValueStoreIKeyValueStore.put(key, "1");
            } else {
                int count = Integer.parseInt(value);
                keyValueStoreIKeyValueStore.put(key, "" + (count + 1));
            }
        }
        ...
    };
    ...
    public static void main(final String[] args) {
        MPI.Init(args);
        ...
        MPI.Finalize();
    }
    ...
}
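The generated class above elides the MPI plumbing and the key-value store, so it does not run on its own. As a rough, self-contained illustration of the counting logic only, the sketch below replaces the store with an in-memory map and leaves MPI out. The one-key-per-second bucketing rule and all names are assumptions, not part of the JUNIPER code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone version of the per-key event counting shown on the slide.
class EventCounterSketch {
    // Stand-in for the slide's key-value store (assumption: a plain in-memory map).
    private final Map<String, String> store = new HashMap<>();

    // Assumed bucketing rule: one key per second of a millisecond timestamp.
    static String keyFromTimestamp(long timestampMillis) {
        return Long.toString(timestampMillis / 1000);
    }

    // Same logic as process(Event) on the slide: create the counter or increment it.
    void process(long timestampMillis) {
        String key = keyFromTimestamp(timestampMillis);
        String value = store.get(key);
        if (value == null) {
            store.put(key, "1");
        } else {
            store.put(key, Integer.toString(Integer.parseInt(value) + 1));
        }
    }

    // Returns the stored count for the bucket of the given timestamp, or null if none.
    String countFor(long timestampMillis) {
        return store.get(keyFromTimestamp(timestampMillis));
    }

    public static void main(String[] args) {
        EventCounterSketch counter = new EventCounterSketch();
        counter.process(1_000L);
        counter.process(1_500L);
        counter.process(2_000L);
        System.out.println(counter.countFor(1_200L)); // prints 2
        System.out.println(counter.countFor(2_999L)); // prints 1
    }
}
```

Storing counts as strings mirrors the slide; a real store would more likely hold numeric values.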
CONCLUSIONS
Juniper trade-offs
JUNIPER
Criteria                  Hadoop                      MPI
Communication             HDFS - file system (httpd)  HPC cluster interconnect (InfiniBand)
Data flow                 MapReduce                   Modeling + MPI comms
Parallelization           Automatic                   Manual, based on domain decomposition
Response time guarantees  None                        Real-time for single node
Stages in multi-format    No                          Any (incl. Hadoop + FPGA)
Hardware                  Commodity cluster           HP cluster
Price
Skills                    +                           ++++
Customers                 General audience            Critical systems
Prospects - more work
Work in progress
UML-based language
• MPI communication
• Timing properties
• Deployment
petaFuel case study
Future work
Modelling payload
Integrating schedulability
Running final evaluations
Final release
Questions?
Andrey Sadovykh
Marcos Almeida
SOFTEAM | ModelioSoft
{name.surname}@softeam.fr
SOFTEAM R&D Web Site: http://rd.softeam.com
Modelio Web Site: http://www.modelio.org
JUNIPER Web Site: http://www.juniper-project.org