@daniel_abadi
Yale University
* The Big Data phenomenon is the best thing that could have happened to the database community
* Despite other definitions related to the ‘3 Vs’, Big Data means BIG data
* Which means we need scalable database systems
* Still two main components of Big Data:
  * Performing data analysis at scale
  * Performing requests on data at scale

* Analysis at scale: the database community has won the battle
* Some thought that MapReduce might replace traditional database technology as the primary means to perform analysis at scale
* Just about every MapReduce vendor has abandoned this goal
* Hadapt, Impala, Tez, and several others are in a race to see who can add the most traditional database execution technology to Hadoop fastest
* Everyone is going in the direction of cost-based optimizers, traditional database operators, and push-based query execution

* Requests at scale: the database community is losing the battle
* NoSQL systems still have very little traditional database technology inside (despite adding SQL interfaces)
* No race to add DB technology. Why?
* Don’t blame CAP: CAP is only relevant when there’s a network partition
* We never figured out how to do ACID and active replication at scale
* Many new proposals make simplifying assumptions in order to handle scale
* It’s been 30 years. Why can’t we build a distributed database that can handle distributed transactions over actively replicated data at scale?

Beckman abadi-5min-pres