Energy 
Gunnar Carlsson 
President & Co-Founder, Ayasdi
AYASDI Company Timeline
2000 2005 2008 2010 2013
“Ayasdi’s approach is using Topological Data Analysis, one of the top 10 innovations developed at DARPA in the last decade.”
Tony Tether, Director, Defense Advanced Research Projects Agency (2001-2009)
Data has shape, 
Shape has meaning.
Linear Methods
Clustering
Circular
Limitations with Current Methods
Statistics
Machine Learning
• Hypothesis focused
• Model Driven
• Formula Driven
• Black-box analytics
Miss subtle signals
Missed systematic phenomena
Real Data = Real Complex
• Real World Data does not adhere to models
• Deep analysis requires taking the “model” assumption out of the equation
Ayasdi Core analyzes the data you have, not the data you want to have.
Ayasdi’s Approach
Key Takeaways
1. Segmentation
• Ex: Understanding how differences in completion can impact recovery
2. Subtle Feature Extraction
• Ex: Identifying additional geological features that play a role in predicting recovery
3. Anomaly Detection
• Ex: Understanding state changes in SAGD wells
Statistics
Machine Learning
Topological Data Analysis
Key Properties of TDA 
1 Coordinate Freeness 
Source Agnostic
Key Properties of TDA 
2 Deformation Invariance 
Noise & Null Tolerant
Key Properties of TDA 
3 Compressed Representation 
Expose All Signals
Mapping 
F
Network Orientation
The shape of the network shows underlying properties of data that yield insights and meaning
Nodes are groups of similar objects
Edges connect similar nodes
Colors let you see values of interest
Position of a node on the screen doesn’t matter
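
The network described on this slide can be illustrated with a small Mapper-style construction. This is only a sketch under stated assumptions, not Ayasdi's implementation: the function names (`build_network`, `single_linkage`), the one-dimensional lens, the overlapping-interval cover, and the `eps` distance threshold are all hypothetical choices made for illustration. It shows nodes as groups of similar points and edges as links between nodes that share points.

```python
import math
from itertools import combinations

def single_linkage(points, members, eps):
    """Split `members` into groups: connected components of the 'closer than eps' graph."""
    remaining = set(members)
    clusters = []
    while remaining:
        seed = remaining.pop()
        cluster, frontier = {seed}, [seed]
        while frontier:
            cur = frontier.pop()
            near = {j for j in remaining
                    if math.dist(points[cur], points[j]) <= eps}
            remaining -= near
            cluster |= near
            frontier.extend(near)
        clusters.append(cluster)
    return clusters

def build_network(points, lens, n_intervals=3, overlap=0.6, eps=0.6):
    """Illustrative Mapper-style sketch (assumed parameters, not a product API).

    Returns (nodes, edges): each node is a set of point indices,
    each edge is a pair of node indices whose point sets intersect.
    """
    values = [lens(p) for p in points]
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals or 1.0
    pad = overlap * width / 2  # how far each interval extends into its neighbours
    nodes = []
    for i in range(n_intervals):
        start, end = lo + i * width - pad, lo + (i + 1) * width + pad
        members = [j for j, v in enumerate(values) if start <= v <= end]
        nodes.extend(single_linkage(points, members, eps))
    edges = {(a, b) for a, b in combinations(range(len(nodes)), 2)
             if nodes[a] & nodes[b]}
    return nodes, edges

# Toy data: 12 points on a circle, with the x-coordinate as the lens.
points = [(math.cos(math.radians(30 * k)), math.sin(math.radians(30 * k)))
          for k in range(12)]
nodes, edges = build_network(points, lens=lambda p: p[0])
print(len(nodes), len(edges))  # 4 4: four nodes joined in a loop
```

On the circular toy data the result is a loop of four nodes, and nothing in the output records where a node would be drawn on screen; only membership and connectivity are kept.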
Relationships between diabetic, pre-diabetic and healthy populations
Glucose Level
Insulin Response
Healthy, Pre-Diabetic, Overt-Diabetic
Analyzing Breast Cancer Data
Death
Survived
Relapsed
No Relapse
Model Validation 
Understanding failure and improving performance 
Colored by model prediction Colored by actual outcomes 
Survival 
Low High

Ayasdi Energy Summit, September 2014, Gunnar Carlsson


Editor's Notes

  • #3: Introduce slide with the PUNCHLINE: AYASDI is a new old company, still growing and dynamically adapting to use cases, but founded on an established and proven technology. Slide intended to establish institutional validity from both an intellectual (Stanford / DARPA / NSF / Gunnar) and a financial (Khosla / Floodgate / IVP / GE / Citi) standpoint. Narrative: 2000 Initial Research: NSF funds Stanford Math Professor Gunnar Carlsson to conduct theoretical research into computational topology. 2005 DARPA: DARPA invests in applying TDA to massive, complex, multi-modal DoD data. 2008 AYASDI founded to explore commercial applications with DARPA and IARPA funding. 2010 Floodgate / Ovitz (Seed): Floodgate and private investors provide seed capital to develop the platform. 2013 Khosla / IVP / Citi / GE (Growth): Khosla Ventures and Institutional Venture Partners lead growth financing rounds, scale for institutional deployment.
  • #5: Linear example Data has shape, shape has meaning
  • #6: Clustering Example
  • #7: Circle example
  • #8: Need real-life examples for oil & gas
  • #9: Real data is complex and unpredictable: it doesn’t adhere to models and form shapes nicely. Ayasdi: transition to the flexibility of TDA. The actual data that you have, not the data that can fit in your own model.
  • #10: Title of use cases needs work….
  • #14: Change title? ask Gunnar
  • #15: A node is a group of data points that are similar to each other. Edges connect nodes with data points in common. The position of a node on the screen doesn’t matter – the network is not a plot, there are no axes. What matters is how a node is connected to the nodes around it, and the network as a whole. We use color to explore correlations between data aspects (attributes/features/columns) and the structure of the network.
  • #16: Glucose Level Insulin Response Healthy Pre-diabetic Overt-diabetic
  • #17: Glucose Level Insulin Response Healthy Pre-diabetic Overt-diabetic
  • #18: NB: this is an Iris 2 network, and if we can find the data we will work to replace it. Data: patients coming into an emergent care center are assessed for a number of different factors (did they come in on their own? Are they actively bleeding? Can they answer questions?). The network is built on that worksheet information. Metadata is composed of 1) model prediction and 2) actual outcome. Network: each node is a group of patients similar to each other. Red nodes indicate poor survival and blue nodes indicate good survival, either predicted (left) or actual (right). Punchline: the flare at the top is a group of patients for whom the model fails to predict poor outcomes. We can explore the statistics of this flare and find that the key distinguishing feature is missing information, from one section of the worksheet in particular.
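
The model-validation comparison in the last note (the same network colored once by model prediction and once by actual outcome) can be sketched in a few lines. Everything here is an illustrative assumption rather than the method used for the slide: the function names, the `gap` threshold, and the toy patient values are hypothetical, and survival is encoded as a number between 0 (poor) and 1 (good).

```python
from statistics import mean

def node_means(nodes, values):
    """Average a per-patient value over each node's members (the node's 'color')."""
    return [mean(values[i] for i in members) for members in nodes]

def failing_nodes(nodes, predicted, actual, gap=0.3):
    """Indices of nodes where predicted survival exceeds actual survival by more than `gap`."""
    pred = node_means(nodes, predicted)
    act = node_means(nodes, actual)
    return [k for k, (p, a) in enumerate(zip(pred, act)) if p - a > gap]

# Hypothetical toy data: three nodes over six patients.
nodes = [{0, 1}, {2, 3}, {4, 5}]
predicted = [0.9, 0.8, 0.2, 0.1, 0.9, 0.8]  # model's predicted survival
actual = [1.0, 1.0, 0.0, 0.0, 0.0, 0.0]     # observed survival
print(failing_nodes(nodes, predicted, actual))  # [2]
```

In this toy case the third node plays the role of the flare: the model predicts good survival for a group whose actual outcomes are poor, which marks it as the group to examine further.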