Making Big Data Roar
Data Centers are expensive 
Company                Location               Data Center Cost   Data Center Size (MW)
NSA                    Camp Williams, UT      $2B                133
Apple                  Maiden, NC             $1B                67
Internet Villages      Annandale, Scot.       $1.6B              107
Lockerbie DC           Lockerbie, Scotland    $1.5B              100
Social Security        Baltimore, MD          $400M              27
Next Generation Data   Wales, UK              $300M              20
Facebook               Prineville, OR         $215M              15
WiredTiger Mission 
WiredTiger is rethinking data 
management for modern hardware 
with a focus on multi-core scalability 
and maximizing the value of every 
byte of RAM.
Database/Storage Ecosystem
A New Data Management Engine 
● Architected for modern computer systems 
● Scalable and able to handle big data 
● High throughput, consistent low latency 
● Row-store, column-store, log structured merge 
● ACID transactions, standard isolation levels 
● Checkpoint and fine-grained durability 
● Supporting columns, indices, projections 
● Production quality, fully supported 
● NoSQL, Open Source
Flexible Storage 
● Access methods tailored to workload 
o Row store (read mostly of all columns) 
o Column store (read mostly of some columns) 
o Log-structured merge trees (mostly random writes) 
● Compact storage format 
o RLE, key-prefix, dictionary and static compression 
o Stream compression 
● Adapt workload to storage (RAM, SSD, HDD)
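
To make the write-optimized trade-off of a log-structured merge tree concrete, here is a minimal sketch in Python. It is a toy, not WiredTiger's implementation: the class name ToyLSM, the memtable_limit parameter, and the in-memory "runs" are all assumptions made for illustration. Writes land in a small in-memory table that is flushed as an immutable sorted run, while reads may have to check several runs, which is why LSM favors random writes at some cost to reads.

```python
import bisect


class ToyLSM:
    """Toy log-structured merge tree (hypothetical, for illustration only)."""

    def __init__(self, memtable_limit=4):
        self.memtable = {}          # recent writes, unsorted
        self.runs = []              # immutable sorted runs, newest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # A write only touches the in-memory table: no random I/O.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Write the memtable out as one sorted run (sequential I/O on disk).
        self.runs.insert(0, sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        # A read checks the memtable, then every run from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one; newer values overwrite older ones.
        merged = {}
        for run in reversed(self.runs):
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []


lsm = ToyLSM(memtable_limit=2)
lsm.put("a", 1)
lsm.put("b", 2)   # memtable is full: flushed as the first run
lsm.put("a", 3)   # newer value for "a" shadows the flushed one
lsm.put("c", 4)   # second flush
lsm.compact()     # one run remains, holding the newest value per key
```

Compaction is what keeps read cost bounded: after compact() a lookup consults at most the memtable and a single run.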
Flexible Configuration 
● API offers a simple key/value store, or 
● A complete schema layer 
o Specify data types 
o Map columns to files 
o Automatically maintain indices 
o Queries only read required columns 
o Projections, index-only scans 
● Checkpoint or fine-grained durability
Improved Efficiency 
● Higher CPU Utilization
o Multi-core scalability
o Minimize contention between threads
o Non-locking algorithms
o Hazard pointers
● Lower Power Costs
● Flash Optimized Block Layout
Consistent High Performance 
● In-cache or I/O bound
● Workload Configuration
o Efficient sparse data (column-store)
o Bounded queries and updates (row-store)
o Write-optimized (LSM)
● Data structures for access at RAM speed
Consistent Low Latency 
● Non-locking algorithms
● Multi-versioned data
● Optimistic concurrency control
● Deadlock-free transactions
● I/O shifted to background threads
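
The combination of multi-versioned data and optimistic concurrency control can be sketched in a few lines of Python. This is a single-threaded toy under stated assumptions, not WiredTiger's algorithm: the names VersionedStore, Transaction, and ConflictError are hypothetical, and a real engine would need atomic operations where this sketch uses plain assignments. Readers see a snapshot and never block; writers buffer their changes and are validated only at commit, so a conflict causes an abort rather than a lock wait, which is why deadlock cannot occur.

```python
class ConflictError(Exception):
    """Raised when a transaction loses an optimistic commit race."""


class VersionedStore:
    """Toy multi-versioned key/value store (hypothetical, illustration only)."""

    def __init__(self):
        self._versions = {}   # key -> list of (commit_ts, value), oldest first
        self._clock = 0       # timestamp of the most recent commit

    def begin(self):
        # A transaction reads from the state as of its start time.
        return Transaction(self, self._clock)

    def read(self, key, snapshot_ts):
        # Return the newest version committed at or before the snapshot.
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None

    def commit(self, txn):
        # Validate: abort if any key we wrote was committed after our snapshot.
        for key in txn.writes:
            versions = self._versions.get(key)
            if versions and versions[-1][0] > txn.snapshot_ts:
                raise ConflictError(key)
        self._clock += 1
        for key, value in txn.writes.items():
            self._versions.setdefault(key, []).append((self._clock, value))


class Transaction:
    def __init__(self, store, snapshot_ts):
        self.store = store
        self.snapshot_ts = snapshot_ts
        self.writes = {}      # buffered updates, invisible until commit

    def get(self, key):
        if key in self.writes:
            return self.writes[key]
        return self.store.read(key, self.snapshot_ts)

    def put(self, key, value):
        self.writes[key] = value

    def commit(self):
        self.store.commit(self)


store = VersionedStore()
setup = store.begin()
setup.put("balance", 100)
setup.commit()

reader = store.begin()          # snapshot taken before the update below
writer = store.begin()
writer.put("balance", 150)
writer.commit()

old_view = reader.get("balance")          # still the snapshot value
new_view = store.begin().get("balance")   # a later transaction sees the update
```

Because old versions are kept, the reader above is never delayed by the writer; the price is that a second writer racing on the same key must be prepared to retry after a ConflictError.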
Cost Effective 
Metric                      WiredTiger   InnoDB
iiBench run cost            $6.44        $12.88
Cost per billion inserts*   $20.30       $40.60
● WiredTiger provides a 50% cost savings for the same AWS workload 
● More details on this benchmark are available here.
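
The 50% figure follows directly from the numbers on the slide, and a short calculation makes the check explicit. The sketch below assumes that the lower figure in each row belongs to WiredTiger and the higher to InnoDB, as the savings claim implies; the dictionary layout and function names are made up for illustration. It also derives the run size implied by the two rows, which is not stated on the slide.

```python
import math

# Figures from the slide. Assigning the lower value in each row to
# WiredTiger is an assumption based on the stated 50% savings.
costs = {
    "WiredTiger": {"run_cost": 6.44, "per_billion": 20.30},
    "InnoDB": {"run_cost": 12.88, "per_billion": 40.60},
}


def savings(cheaper, baseline):
    """Fraction of the baseline cost saved by the cheaper option."""
    return 1.0 - cheaper / baseline


def implied_inserts(engine):
    """Number of inserts in one run, derived from the two cost rows."""
    c = costs[engine]
    return c["run_cost"] / c["per_billion"] * 1e9


run_savings = savings(costs["WiredTiger"]["run_cost"],
                      costs["InnoDB"]["run_cost"])
billion_savings = savings(costs["WiredTiger"]["per_billion"],
                          costs["InnoDB"]["per_billion"])

print(f"savings per run: {run_savings:.0%}")
print(f"savings per billion inserts: {billion_savings:.0%}")
print(f"implied inserts per run: {implied_inserts('WiredTiger') / 1e6:.0f} million")
```

Both engines imply the same run size of roughly 317 million inserts, which is consistent with the two columns describing the same workload.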
Customers
Management Team 
Keith Bostic is a founder and architect at WiredTiger. He was a founder of Sleepycat Software 
(acquired by Oracle Corp. in 2006), and one of the architects of Berkeley DB, the most widely used 
embedded data management software in the world. 
Mr. Bostic was one of the architects of the University of California, Berkeley, 2.10BSD and 4BSD releases, 
where he led the 4BSD release Open Source effort. He is the recipient of a USENIX Association 
Lifetime Achievement Award (The Flame), which recognizes singular contributions to the UNIX 
community. 
Dr. Michael Cahill is a founder and architect at WiredTiger. He was an architect of Berkeley DB at 
Sleepycat Software and Oracle Corp., responsible for the design and implementation of multiversion 
concurrency control, as well as SQL interfaces and programming language APIs. Previously, Dr. 
Cahill was CTO at Bullant Technology, which grew tenfold and raised over US$30 million from 
investors including Intel Capital and JP Morgan during his three-year tenure. 
Dr. Cahill’s PhD from the University of Sydney is in the area of transaction processing and 
concurrency control. His work on a new algorithm for implementing serializable isolation received an 
ACM SIGMOD Best Paper award and was added to PostgreSQL 9.1.
Summary and Next Steps 
We’d like to discuss how we could help you 
with your solution. 
Thanks! Questions? info@wiredtiger.com

WiredTiger Overview


Editor's Notes

  • #3: The best number available to estimate the cost of a data center is the number of power supplies: that number determines heating and cooling costs, as well as hardware and software (license units) costs. While the number of CPUs per power supply continues to increase, CPUs are no longer getting faster, and at the data center level we need to look at software efficiencies to gain further scale beyond what the hardware can deliver. For the foreseeable future, multi-core scaling is key to better performance and increased efficiency. Common indexing technology in use today was written for the computer architectures of the early 1990s, so better software efficiency yields huge benefits.
  • #4: WiredTiger is focused on single-node data management in service of high-end applications, improving application scalability and efficiency via software innovation.
  • #5: WiredTiger is entirely focused on single-node resource cost per transaction. WiredTiger does not include data distribution or other horizontal scaling software. WiredTiger is intended for applications running on a single node which require the maximum possible performance from the indexing technology, or as a storage technology for applications supporting their own horizontal scaling solutions.
  • #7: Row-store is a traditional database object, where keys are byte strings and all columns of a row are stored together, best for read-mostly workloads where all columns are equally valuable. Column-store groups columns in storage and only the necessary columns are read to satisfy a query. Log-structured merge trees (LSM) support high-speed random inserts, at the cost of slower reads. WiredTiger supports all three access methods and the access methods can be combined (for example, a sparse, wide table configured with a column-store primary, where indexes are stored in an LSM tree).
    WiredTiger supports a large number of compression algorithms:
    ● RLE: run-length encoding when columns repeat
    ● Key-prefix: Btree key-prefix compression
    ● Dictionary: unique columns only stored once per write block
    ● Static: Huffman encoding
    ● Stream: pluggable stream compression (for example, snappy or zlib); because WiredTiger supports variable-length blocks, stream compression can be applied in all cases, unlike engines where compression must operate in block-sized units.
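
    Two of the compression schemes named in this note, key-prefix and run-length encoding, are simple enough to sketch in Python. The functions below are illustrative only and say nothing about WiredTiger's on-disk format; all function names and the sample keys are assumptions. Key-prefix compression stores, for each sorted key, only the number of leading characters shared with the previous key plus the remaining suffix; RLE stores each repeated value once with a count.

```python
import os
from itertools import groupby


def prefix_compress(sorted_keys):
    """Encode each key as (shared_prefix_length, suffix) relative to the previous key."""
    out = []
    prev = ""
    for key in sorted_keys:
        shared = len(os.path.commonprefix([prev, key]))
        out.append((shared, key[shared:]))
        prev = key
    return out


def prefix_decompress(entries):
    """Rebuild the full keys from (shared_prefix_length, suffix) pairs."""
    keys = []
    prev = ""
    for shared, suffix in entries:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys


def rle_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in runs for _ in range(count)]


keys = ["user:1000", "user:1001", "user:1002", "video:7"]
packed_keys = prefix_compress(keys)
# packed_keys == [(0, "user:1000"), (8, "1"), (8, "2"), (0, "video:7")]

column = ["NY", "NY", "NY", "CA", "CA", "NY"]
packed_column = rle_encode(column)
# packed_column == [("NY", 3), ("CA", 2), ("NY", 1)]
```

    Both schemes pay off only when the data cooperates: prefix compression needs sorted keys with long common prefixes, and RLE needs runs of identical adjacent values.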
  • #9: Unlike other indexing technologies, for example LevelDB and InnoDB, WiredTiger scales linearly as additional cores are added.
  • #10: iiBench is a standard benchmark used to measure MySQL performance. Compared to InnoDB WiredTiger showed consistently better query rates . . .
  • #11: . . . and much more consistent latency as you scale rows in the data-store.
  • #12: The ultimate benefit to the customer is reduced cost. This chart shows the cost of a billion inserts on an Amazon Web Services instance for the popular engine InnoDB versus WiredTiger: WiredTiger returns twice the performance on a typical AWS instance.