Tiezheng Li
Tiezheng.Li@twosigma.com
Two Sigma Investments, LLC
April 27, 2018
The Beaker Extensions for Jupyter: BeakerX
Agenda
Beaker Notebook
From Beaker Notebook to BeakerX
BeakerX Live Demo
OUR VISION FOR DATA SCIENCE
DISCOVERABLE DATA
DATA ANALYSIS + MODELING
SCALABLE + DISTRIBUTED COMPUTE
PUBLICATION + COLLABORATION
BEAKER, AN INTRODUCTION
LANGUAGE MATTERS
BeakerX - Tiezheng Li
LIFE OF BEAKER
Oct 2013: Internal GA
May 2014: Open Source Beta
Mar 2015: R, Scala, Java, Python2/3 support
Jun 2015: PySpark, SparkR, Clojure, Kdb support
Apr 2016: External Beaker Lab Alpha
Nov 2016: BeakerX Pivot
Aug 2017: BeakerX RC1
OPEN SOURCE WORLD
nbconvert
nbviewer
nbpresent
nbgrader
JupyterHub
nbdime
nbmanager
binder
JupyterLab
FORK? MERGE? JOIN?
THE PIVOT
WE DID IT!
94%
1463
213
BeakerX: A unique addition to the Jupyter Ecosystem
● Time Series Visualizations
● JVM Kernels
● Interactive Tables
● Collaborative Publication
● True Polyglot Analysis (in progress)
● Data Discovery (in progress)
DEMO
Future Work:
● Migration to JupyterLab
● Spark deep integration
● True Polyglot Analysis
● Data Discovery
● and more … !
THANK YOU

Editor's Notes

  • #2: Good morning and welcome to this session, "The Beaker Extensions for Jupyter: BeakerX". Before we begin, let me introduce myself and what I do. My name is Tiezheng Li, and I am a software engineer at Two Sigma. Since joining Two Sigma, I've been working on a team that builds products for modelers that make data easy to discover, consume, publish, and visualize. BeakerX is one of our approaches to accomplishing this goal.