www.kensu.io
INTERACTIVE NOTEBOOKS
What, Why and How
image ref: https://medium.com/joytunes/the-good-the-bad-and-the-ugly-in-mobile-app-subscriptions-fe6b8c0e8b18
ANDY -|- KENSU
Andy Petrella - Founder @ Kensu
Maths MSc / Computer Science MSc
10+ years in data computing (science?)
http://kensu.io Analytics, AI Governance
Analytics
Governance
Performance
Compliance
I. NOTEBOOKS FOR DATA CITIZENS*
a. What
b. Which
c. Pros & Cons
x. My 2¢
* engineers, scientists et al.
A. NOTEBOOKS: WHAT ARE THEY
i. What is "working on Data"
ii. Power of Interactivity
iii. Centralisation and Share-ability
A. NOTEBOOKS: WHAT ARE THEY
i. What is "working on Data"
1. elaborate a business opportunity plan or hypothesis
2. refine the goals with the help of the business team
3. discover available data sources that are potentially interesting
4. connect to the data source (or copy it)
5. explore the data source content to get to know it better
6. create first models
7. decide if the results are good enough to create the (data) product
8. if not,
1. decide if the data is worth keeping, or is sufficient
2. go back to step 2
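Steps 4 to 8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory `RAW` sample, the column names, the hand-written least-squares fit and the 0.9 threshold are all hypothetical stand-ins, not something the deck prescribes.

```python
import csv
import io
import statistics

# Hypothetical in-memory sample standing in for a real data source.
RAW = """ad_spend,sales
1.0,2.1
2.0,3.9
3.0,6.2
4.0,8.1
5.0,9.8
"""


def connect(raw):
    """Step 4: read the source into a list of rows with float values."""
    reader = csv.DictReader(io.StringIO(raw))
    return [{key: float(value) for key, value in row.items()} for row in reader]


def explore(rows):
    """Step 5: (min, mean, max) per column, to get to know the data."""
    summary = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        summary[column] = (min(values), statistics.mean(values), max(values))
    return summary


def first_model(rows, feature, target):
    """Step 6: a first model, here a least-squares line with its R^2."""
    xs = [row[feature] for row in rows]
    ys = [row[target] for row in rows]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def decide(r_squared, threshold=0.9):
    """Steps 7 and 8: keep going, or loop back to refine the goals."""
    if r_squared >= threshold:
        return "build the data product"
    return "go back to step 2"


rows = connect(RAW)
summary = explore(rows)
slope, intercept, r_squared = first_model(rows, "ad_spend", "sales")
decision = decide(r_squared)
print(summary)
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r_squared:.3f}")
print(decision)
```

Each function maps to one step, so rerunning a single step after changing the data or the goal is cheap, which is the kind of loop the list describes.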
A. NOTEBOOKS: WHAT ARE THEY
ii. Power of Interactivity
A data project (a decision-support project) comes with anxiety.
The time to first results is rather long due to complexity.
The complexity can be due to:
- the data,
- the availability of data,
- the environment,
- the business,
- the security,
- ...
A. NOTEBOOKS: WHAT ARE THEY
ii. Power of Interactivity
And the first results (very) probably won't be good!
If these projects are treated as IT projects, this leads to:
- a sense of lack of visibility
- communication problems
- failure
A. NOTEBOOKS: WHAT ARE THEY
ii. Power of Interactivity
The need for frequent trial and error resulted in:
1. an explosion of dynamic languages (rather than C, for instance)
2. and of interpreted languages
Like Python and R!
This led to data projects mostly driven from shell exploration and released as scripts.
A. NOTEBOOKS: WHAT ARE THEY
ii. Power of Interactivity
The BI tools alternative is still valid.
However, it is too restrictive to unleash the power of data science.
But shells and scripts are awful tools for programming:
- line by line editing
- not persisted
- not shareable
A. NOTEBOOKS: WHAT ARE THEY
iii. Centralisation and Share-ability
To fix these problems, the community created notebooks.
Notebook-like tools already existed, however (e.g. MATLAB).
Notebook implementations started with IPython (and still follow the same rules):
1. Web based
2. Direct results (incremental context)
3. Shareable (e.g. JSON)
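To make "Shareable (e.g. JSON)" concrete, here is a minimal sketch of a notebook serialised as a JSON document. The layout is modelled on Jupyter's notebook format version 4 as I understand it; other notebook tools use their own schemas, and `make_notebook` is a hypothetical helper, not part of any library.

```python
import json


def make_notebook(cells):
    """Build a notebook-like dict from (cell_type, source) pairs.

    The field names follow the Jupyter nbformat 4 layout (an assumption
    of this sketch); only code cells carry outputs and an execution count.
    """
    nb_cells = []
    for cell_type, source in cells:
        cell = {"cell_type": cell_type, "metadata": {}, "source": source}
        if cell_type == "code":
            cell["execution_count"] = None  # not executed yet
            cell["outputs"] = []
        nb_cells.append(cell)
    return {"cells": nb_cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


notebook = make_notebook([
    ("markdown", "# Exploration"),
    ("code", "total = sum(range(10))\ntotal"),
])

# The whole notebook is plain text, so it can be mailed, stored or diffed.
serialized = json.dumps(notebook, indent=1)
restored = json.loads(serialized)
print(serialized)
```

Because code, documentation and (once executed) results live in one text document, sharing a notebook amounts to sharing a file.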
A. NOTEBOOKS: WHAT ARE THEY
iii. Centralisation and Share-ability
Notebooks can:
1. access data directly and run experiments
2. be installed as a service and centralise security
3. be shared (easily, compared to shell scripts)
B. NOTEBOOKS: WHICH ONES
i. Jupyter
http://jupyter.org/
ii. Apache Zeppelin
https://zeppelin.apache.org/
iii. Spark Notebook
http://spark-notebook.io/
iv. RStudio
https://www.rstudio.com/
v. (proprietary) Databricks
https://bit.ly/2U1xPlw
C. NOTEBOOKS: PROS
i. Interactivity
ii. Centralised
iii. Mix code and documentation
iv. Communication (IT <-> Data Folks)
v. BI Tool alternative
C. NOTEBOOKS: CONS
i. Security backdoor
ii. Highly dynamic, no traceability
iii. No/poor versioning
iv. Non-linear (code)
v. Non modular
vi. Poor production-readiness
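The "highly dynamic" and "non-linear" points can be illustrated with a small sketch. It assumes a notebook is modelled as a list of cell sources run with Python's built-in `exec` against one shared namespace; real kernels are more elaborate, and `run_cells` and the cell contents are hypothetical.

```python
def run_cells(cells, order):
    """Execute cell sources in the given order against one shared namespace."""
    namespace = {}
    for index in order:
        exec(cells[index], namespace)
    return namespace


cells = [
    "price = 100",
    "price = price * 2",
    "report = f'price is {price}'",
]

# Running top to bottom gives one result...
top_to_bottom = run_cells(cells, [0, 1, 2])
# ...while re-running the middle cell once more gives another,
# although the notebook text on screen is exactly the same.
rerun_middle = run_cells(cells, [0, 1, 1, 2])

print(top_to_bottom["report"])
print(rerun_middle["report"])
```

The saved notebook does not record the order in which cells were actually run, which is one reason traceability and production-readiness suffer.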
X. NOTEBOOKS: MY 2¢
i. Why I created Spark Notebook
ii. Pick yours
- Zeppelin for data engineers
- Jupyter for data scientists
- RStudio for R folks
- Spark Notebook for Scala and/or Spark
II. NOTEBOOKS IN THE ENTERPRISE
a. Risks
b. Side effects
A. NOTEBOOKS: RISKS
i. Governance
ii. Compliance
- Data usage changes
- Data Governance needs monitoring
B. NOTEBOOKS: SIDE EFFECTS
i. From Monitoring to Management
ii. Data Science Lifecycle
iii. New Possibilities
THANKS!
http://kensu.io Analytics, AI Governance
Analytics
Governance
Performance
Compliance
Q/A
Check out Kensu Data Activity Manager