Memory Speed Big Data Analytics: Alluxio vs Apache Ignite
Irfan Elahi - Deloitte
Data Analytics Explained Meetup
About Me
• Working as a Senior Consultant at Deloitte (Analytics Service Line)
• Trainer of Deloitte's Data Science Training
• Speaker at DataWorks Summit, Sydney (2017)
• Premium Udemy Instructor with 17,000+ students from 131 countries
• Technical Reviewer of an upcoming book on Hadoop published by Apress
Agenda
• The Three Phenomena
• View :: In Isolation -> Conjunction
• Demo and Take-away
The Three Phenomena
The drivers behind innovation and substantial value when businesses capitalize on their data assets:
• Intelligence
• Scalability
• Elasticity
Intelligence
Copyright Deloitte 2015
Traditional Approach: Single Node, In-Memory
Lifecycle: Acquire -> Transform -> Exploratory Analytics -> Feature Engineering -> Model Development -> Evaluation
Tools/Technologies:
• R
• Python (scikit-learn, pandas, numpy)
• Java (Weka)
• RapidMiner
+ Coverage
+ Strong Visualization
+ Mutability
- Constrained Resources, aka non-scalable
- Compromise in Data Locality
- Extensive Engineering for Productionizing
Value
Solution: Scalability?
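The single-node, in-memory lifecycle on this slide can be walked through in a few lines. This is only a minimal sketch: it uses the Python standard library as a stand-in for pandas and scikit-learn, and the synthetic data, the column names (`x`, `y`, `x_c`) and the one-feature least-squares model are all assumptions made for illustration, not anything taken from the talk.

```python
import random
import statistics

# Acquire: synthetic in-memory data (hypothetical stand-in for a real source).
random.seed(0)
rows = [{"x": float(x), "y": 3.0 * x + 2.0 + random.gauss(0, 0.5)}
        for x in range(100)]

# Transform: drop incomplete rows (none here; shown to mark the stage).
rows = [r for r in rows if r["x"] is not None and r["y"] is not None]

# Exploratory analytics: a summary statistic computed over the whole dataset.
mean_x = statistics.mean([r["x"] for r in rows])

# Feature engineering: centre the single feature.
for r in rows:
    r["x_c"] = r["x"] - mean_x

# Hold out every second row for evaluation.
train, test = rows[::2], rows[1::2]


def fit(data):
    """Model development: closed-form least squares on one feature."""
    mx = statistics.mean([r["x_c"] for r in data])
    my = statistics.mean([r["y"] for r in data])
    sxy = sum((r["x_c"] - mx) * (r["y"] - my) for r in data)
    sxx = sum((r["x_c"] - mx) ** 2 for r in data)
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(data, slope, intercept):
    """Evaluation: coefficient of determination on held-out rows."""
    my = statistics.mean([r["y"] for r in data])
    ss_res = sum((r["y"] - (slope * r["x_c"] + intercept)) ** 2 for r in data)
    ss_tot = sum((r["y"] - my) ** 2 for r in data)
    return 1.0 - ss_res / ss_tot


slope, intercept = fit(train)
score = r_squared(test, slope, intercept)
print(f"slope={slope:.3f} r2={score:.4f}")
```

Every stage holds the full dataset in one process's memory, which is the "Constrained Resources" limitation listed above.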
Scalability
Value
Taxonomy of Scalable Analytics
• Analytics @ Computation Frameworks (Apache Spark, Apache Flink, Apache Ignite)
• Boutique Analytics Libraries (H2O, DL4J)
• Integration with Traditional tool-set (sparklyr)
• Analytics @ Cloud (Azure ML, AWS ML)
Pros and Cons
+ Scalable -> Better Intelligence
+ Streamlined Architecture
+ Less engineering overhead
+ Data Locality Optimized
- Limited Coverage
- Visualization
- Resourcing for GPUs
Lifecycle: Infrastructure Provisioning -> Data Ingestion -> Processing -> Persistence
Time to Value?
Solution: Cloud?
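One way to picture "Data Locality Optimized" is the partition-then-combine pattern that distributed computation frameworks rely on: each partition is summarised where it lives, and only small partial results are moved and merged. The sketch below imitates that idea on one machine with standard-library threads. It is not the API of Spark, Flink or Ignite, and the helper names (`partition`, `local_stats`, `combine`) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, n_parts):
    """Split a list into roughly equal chunks (stand-in for blocks on nodes)."""
    size = max(1, -(-len(data) // n_parts))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def local_stats(chunk):
    """Runs next to the data: returns (count, sum, sum of squares)."""
    return len(chunk), sum(chunk), sum(v * v for v in chunk)


def combine(partials):
    """Merge the small per-partition summaries into global statistics."""
    n = sum(p[0] for p in partials)
    total = sum(p[1] for p in partials)
    total_sq = sum(p[2] for p in partials)
    mean = total / n
    variance = total_sq / n - mean * mean  # population variance
    return n, mean, variance


values = [float(v) for v in range(1, 1001)]

# Threads here only imitate independent workers; a real cluster would run
# local_stats on the nodes that already hold each partition.
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(local_stats, partition(values, 4)))

n, mean, variance = combine(partials)
print(n, mean, variance)
```

Only three numbers per partition cross the worker boundary, however large each partition is, which is why the same job can scale out by adding partitions and workers.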
Elasticity
View :: In Isolation
• Rapid Time to Value
• Fit for Transient Loads
• Fit for Scalable Analytics Infrastructure
• Pay & Scale As You Go
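The "Pay & Scale As You Go" and "Fit for Transient Loads" points come down to simple arithmetic. The sketch below compares paying for peak capacity around the clock with paying only for the nodes each hour needs. The load profile and the price per node-hour are hypothetical numbers chosen for illustration, and the model ignores start-up latency, data transfer and minimum billing periods.

```python
def fixed_cost(hourly_nodes, price_per_node_hour):
    """Cluster sized for the peak and kept running for every hour."""
    return max(hourly_nodes) * len(hourly_nodes) * price_per_node_hour


def elastic_cost(hourly_nodes, price_per_node_hour):
    """Cluster resized each hour to exactly what the load needs."""
    return sum(hourly_nodes) * price_per_node_hour


# Hypothetical day: 20 quiet hours on 2 nodes, then a 4-hour batch on 20 nodes.
hourly_nodes = [2] * 20 + [20] * 4
price = 1.0  # hypothetical price per node-hour

fixed = fixed_cost(hourly_nodes, price)
elastic = elastic_cost(hourly_nodes, price)
print(f"fixed={fixed} elastic={elastic} ratio={elastic / fixed:.2f}")
```

Under these assumed numbers the elastic cluster costs a quarter of the fixed one. The gap narrows as the load gets flatter, so the benefit depends on how transient the workload really is.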
Demo
Intelligence, Scalability, Elasticity
• For non-Big Data problems or rapid prototyping, traditional tool-sets provide better value
• The true value of performing analytics at scale with the right data lies in leveraging intelligence, scalability and elasticity in conjunction
• The conjunction of the three still has challenges and isn't the answer for every solution, yet
Questions?
Enrol in my best-selling course on Apache Spark for Big Data Analytics at 90% off:
https://www.udemy.com/apache-spark-hands-on-course-big-data-analytics/?couponCode=YOUTUBE2018

Scalable Analytics on the Cloud
