Introduction to Zeppelin
Lee moon soo, NFLabs
moon@apache.org
me
• Lee moon soo, aka. ‘moon’
• moon@apache.org Committer and PPMC of Zeppelin
• moon@nflabs.com Co-founder of NFLabs
Agenda
• Why do you like Zeppelin?
• Why does your project like Zeppelin?
• Roadmap
• QnA
What is Zeppelin?
Let’s see demo
How Zeppelin started
What gets measured, gets managed
- Peter Drucker
The price of light is less than the cost of darkness
- Arthur C. Nielsen , Founder of ACNielsen
War is ninety percent information
- Napoleon Bonaparte
In God we trust, all others must bring data
- W. Edwards Deming
Data = Understanding
For data analysis, we need a tool
But couldn’t find one I like
Impala, Drill
Hive, Tajo, Pig
Cloudera-ML
MLLib
MRQL
Decided to make one
Really good one
Good analytics environment
Analytical language
Many Libraries
Interactive
Visualization
Sharing
First attempt 2012~2013
It had a graphic REPL, deployment,
search, and an import tool
But failed, because
It wasn’t widely used
It wasn’t open source
Second attempt 2013~2014
Open-sourced
graphic REPL
from commercial
product
The first version
of Zeppelin
Second attempt 2013~2014
Not widely used
It was slow,
difficult to use,
…
Third attempt 2014~
After a few weeks of study,
decided to rewrite Zeppelin
Graphic REPL -> Notebook
with Apache Spark
integration
Third attempt 2014~
Next week, beautified
Why do you like Zeppelin?
Web
Based web framework
d3.js
Visualization
Language
Package management
bower
Build
Notebook
Code
Result
Code
Result
Notebook
Notebook
Data
Visualize
Pivot
Pivot
Dynamic Form Creation
Dynamic
Form
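As a rough illustration of what dynamic form creation involves: Zeppelin's documentation describes a `${name=default}` placeholder in paragraph text that becomes an input field. The sketch below is not Zeppelin's parser. The class `DynamicFormSketch`, its methods, and the regular expression are all assumptions made up for this example, and it handles only the simple text-input form.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: none of these names come from Zeppelin's code base.
public class DynamicFormSketch {
    // Matches ${name} or ${name=default}; group 1 is the name, group 2 the optional default.
    private static final Pattern FORM =
        Pattern.compile("\\$\\{([^=}]+)(?:=([^}]*))?\\}");

    // Collects the form fields a paragraph declares, in order, with their defaults.
    static Map<String, String> extractForms(String paragraph) {
        Map<String, String> forms = new LinkedHashMap<>();
        Matcher m = FORM.matcher(paragraph);
        while (m.find()) {
            String def = m.group(2) == null ? "" : m.group(2);
            forms.putIfAbsent(m.group(1), def);
        }
        return forms;
    }

    // Replaces each placeholder with the submitted value, falling back to the default.
    static String render(String paragraph, Map<String, String> values) {
        Matcher m = FORM.matcher(paragraph);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String def = m.group(2) == null ? "" : m.group(2);
            String value = values.getOrDefault(m.group(1), def);
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String paragraph = "select * from bank where age < ${maxAge=30}";
        System.out.println(extractForms(paragraph));
        System.out.println(render(paragraph, new LinkedHashMap<>()));
    }
}
```

The point of the design is that the form definition lives in the paragraph text itself, so the notebook can build the input widgets by scanning the code before handing the rendered text to an interpreter.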
REST
Web socket
Synchronized
Sharing
Zeppelin simplifies data
analysis
Why does your project like
Zeppelin?
….
Interpreters
Spark
PySpark
SparkSQL
Hive
MySQL (JDBC)
Markdown
Shell
Easy to
extend
Zeppelin Interpreter Architecture
Classloader
InterpreterGroup
Interpreter Interpreter
Server
Client
…
HTTP Rest / Websocket
Classloader
InterpreterGroup
Spark SparkSQL Dep
Classloader
InterpreterGroup
Interpreter Interpreter
Server
Client
…
HTTP Rest / Websocket
InterpreterGroup
Interpreter Interpreter …
Separate JVM process
Thrift
Zeppelin Interpreter Architecture
public abstract void open();
public abstract void close();
public abstract InterpreterResult interpret(String st, InterpreterContext context);
public abstract void cancel(InterpreterContext context);
public abstract int getProgress(InterpreterContext context);
public abstract List<String> completion(String buf, int cursor);
public abstract FormType getFormType();
public Scheduler getScheduler();
Implementing new Interpreter
Must
have
Good to
have
More
controls
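To make the method list above concrete, here is a minimal sketch of an interpreter that upper-cases its input. To keep it self-contained it does not depend on Zeppelin: `Interpreter`, `InterpreterResult`, `InterpreterContext`, `FormType` and `Code` are simplified stand-ins that only mirror the signatures shown on the slide, and `UpperCaseInterpreter` and its keyword list are hypothetical. `getScheduler()` is left out because the slide shows it as non-abstract, so a subclass would not be required to override it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Stand-in types for illustration only; the real Zeppelin classes are richer.
public class InterpreterSketch {
    enum Code { SUCCESS, ERROR }
    enum FormType { NATIVE, SIMPLE, NONE }

    static final class InterpreterResult {
        final Code code;
        final String message;
        InterpreterResult(Code code, String message) {
            this.code = code;
            this.message = message;
        }
    }

    static final class InterpreterContext { }

    // Mirrors the abstract methods listed on the slide.
    abstract static class Interpreter {
        public abstract void open();
        public abstract void close();
        public abstract InterpreterResult interpret(String st, InterpreterContext context);
        public abstract void cancel(InterpreterContext context);
        public abstract int getProgress(InterpreterContext context);
        public abstract List<String> completion(String buf, int cursor);
        public abstract FormType getFormType();
    }

    // Hypothetical interpreter: returns the paragraph text in upper case.
    static final class UpperCaseInterpreter extends Interpreter {
        private static final String[] KEYWORDS = {"hello", "help", "world"};
        private boolean opened = false;
        private int progress = 0;

        @Override public void open() { opened = true; }

        @Override public void close() { opened = false; }

        @Override public InterpreterResult interpret(String st, InterpreterContext context) {
            if (!opened) {
                return new InterpreterResult(Code.ERROR, "interpreter is not open");
            }
            progress = 100; // the work is instantaneous, so jump straight to done
            return new InterpreterResult(Code.SUCCESS, st.toUpperCase(Locale.ROOT));
        }

        @Override public void cancel(InterpreterContext context) {
            // Nothing long-running to interrupt in this sketch.
        }

        @Override public int getProgress(InterpreterContext context) { return progress; }

        @Override public List<String> completion(String buf, int cursor) {
            // Offer keywords that start with the text before the cursor.
            int end = Math.min(Math.max(cursor, 0), buf.length());
            String prefix = buf.substring(0, end);
            List<String> matches = new ArrayList<>();
            for (String k : KEYWORDS) {
                if (k.startsWith(prefix)) {
                    matches.add(k);
                }
            }
            return matches;
        }

        @Override public FormType getFormType() { return FormType.SIMPLE; }
    }

    public static void main(String[] args) {
        Interpreter interp = new UpperCaseInterpreter();
        interp.open();
        System.out.println(interp.interpret("hello zeppelin", new InterpreterContext()).message);
        interp.close();
    }
}
```

A real interpreter would do its heavy setup in `open()` and release resources in `close()`, which is why `interpret` refuses to run before `open()` has been called here.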
Roadmap
• Integration with more distributed
processing frameworks
• Flink, Ignite, Tajo, etc.
• Output message streaming
• Ability to create rich GUI
Impala, Drill
Hive, Tajo, Pig
Cloudera-ML
MLLib
MRQL
QnA
Thanks
Lee moon soo
moon@apache.org
moon@nflabs.com
