The document summarizes 10gen, the company behind the MongoDB NoSQL database. 10gen has over 170 employees and 500+ customers, and has received $73M in funding from top investors. MongoDB is a leading document-oriented database that is scalable, high-performance, and open source. It supports flexible schemas, horizontal scaling, and replication for high availability. Many large organizations rely on MongoDB for its ability to handle high volumes of data and diverse data types, including semi-structured data, and for its support of agile development processes.
2. 10gen Overview
10gen is the company behind MongoDB, the leading NoSQL database
7. Global MongoDB Community
41,000+ Monthly Unique Downloads
24,000+ Online Education Registrants
12,000+ MongoDB User Group Members
10,000+ Annual MongoDB Days Attendees
13. Organizations are becoming frustrated using an RDBMS.
Productivity decreases:
Needed to add new software layers of ORM, Caching, Sharding, Message Queue
Polymorphic, semi-structured and unstructured data not well supported
Cost of database increases:
Vertical, not horizontal, scaling
High cost of SAN
17. MongoDB is a scalable, high-performance NoSQL database.
Open source, written in C++
Document-oriented storage
Based on JSON documents
Schema-less
Full-featured indexes, query language
Replication & High Availability
Auto-sharding
18. Relational Database Challenges
Data Types:
Unstructured data
Semi-structured data
Polymorphic data
Volume of Data:
Petabytes of data
Trillions of records
Tens of millions of queries per second
Agile Development:
Iterative
Short development cycles
New workloads
New Architectures:
Horizontal scaling
Commodity servers
Cloud computing
19. Volume of Data
Petabytes of data
Trillions of records
Millions of queries per second
20. Data Types
Unstructured data
Semi-structured data
Polymorphic data
{
  _id : ObjectId("4c4ba5e5e8aabf3"),
  employee_name: "Dunham, Justin",
  department : "Marketing",
  title : "Product Manager, Web",
  report_up: "Neray, Graham",
  pay_band: "C",
  benefits : [
    { type : "Health",
      plan : "PPO Plus" },
    { type : "Dental",
      plan : "Standard" }
  ]
}
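The document on this slide can be mirrored in a short Python sketch to show what "polymorphic" and "schema-less" mean in practice. Plain dictionaries stand in for MongoDB documents here; no database or driver is involved. The second record and the helpers `find` and `benefit_plans` are hypothetical and exist only for illustration.

```python
# Plain dicts stand in for MongoDB documents; no database or driver is used.
employees = [
    # The employee document shown on the slide (without the _id field).
    {
        "employee_name": "Dunham, Justin",
        "department": "Marketing",
        "title": "Product Manager, Web",
        "report_up": "Neray, Graham",
        "pay_band": "C",
        "benefits": [
            {"type": "Health", "plan": "PPO Plus"},
            {"type": "Dental", "plan": "Standard"},
        ],
    },
    # Hypothetical second record with a different shape:
    # no title, pay band, or benefits, but an extra "agency" field.
    {
        "employee_name": "Example, Contractor",
        "department": "Marketing",
        "agency": "Hypothetical Staffing Ltd",
    },
]


def find(docs, **criteria):
    """Hypothetical helper: documents whose top-level fields equal the criteria."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


def benefit_plans(doc):
    """Return {benefit type: plan}; a document without benefits yields {}."""
    return {b["type"]: b["plan"] for b in doc.get("benefits", [])}


# Both records match the same query even though their fields differ.
marketing = find(employees, department="Marketing")
plans = [benefit_plans(d) for d in marketing]
```

Both records sit in the same list and answer the same query, and the code that reads `benefits` simply tolerates its absence. In a relational schema the second record would typically need nullable columns or a separate table.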
21. Agile Development
Iterative
Short development cycles
New workloads
22. Problem, Why MongoDB, Impact
Problem:
A need to extract value from existing semi-structured data sources (social networks etc.)
A fast-growing customer base required any solution to be easily scalable
Why MongoDB:
Built around scalability, with auto-sharding features
MongoDB deployment architecture prevents any single point of failure
Geospatial indexing out-of-the-box enables location-based service delivery
Impact:
Priority Moments project is a strong success
Subsequent adoption of MongoDB by O2 & Telefonica across a large number of projects
"Selecting MongoDB as our database platform was a no-brainer as the technology offered us the flexibility and scalability that we knew we'd need for Priority Moments."
Andrew Pattinson, Head of Online Delivery
23. Problem, Why MongoDB, Impact
Problem:
RDBMS architecture constrained their ability to absorb upstream contributions from users
New features, competitions needed to log data into user records, requiring schema changes
Why MongoDB:
Flexible data model allows for heterogeneous structure
Rich query language preserves functionality
System updates with zero downtime
Ease of use, allowing a large development team to adopt the technology quickly
Impact:
The Guardian has competitive advantage, through enabling social conversations through the site
Interactive features can be delivered more quickly, which translates to increased revenues
"Relational databases have a sound approach, but that doesn't necessarily match the way we see our data. MongoDB gave us the flexibility to store data in the way that we understand it as opposed to somebody's theoretical view."
Philip Wills, Software Architect
24. New Architectures
Horizontal scaling
Commodity servers
Cloud computing
27. Best Total Cost of Ownership (TCO)
Developer and Ops Savings:
Less code
More productive development
Easier to maintain
Hardware Savings:
Commodity servers
Internal storage (no SAN)
Scale out, not up
Software and Support Savings:
No upfront license; pay for value over time
Cost visibility for usage growth
[Chart label: DB Alternative]
28. Relational Database Challenges
Data Types:
Unstructured data
Semi-structured data
Polymorphic data
Volume of Data:
Petabytes of data
Trillions of records
Tens of millions of queries per second
Agile Development:
Iterative
Short development cycles
New workloads
New Architectures:
Horizontal scaling
Commodity servers
Cloud computing
#5: Note: Growth refers to year-to-date revenue based on our fiscal years for 2011 and 2012, i.e., it compares Feb-Oct 2011 (calendar year) to Feb-Oct 2012 (calendar). These figures are unaudited and subject to change.
#32: A highlight of some key features in 2.4. We'll add more details and more items each month as we work towards a winter release.
Security: SASL is a framework for authentication that helps decouple specific authentication mechanisms from the client/server implementation. This framework will permit working with a variety of authentication mechanisms; initially we'll build in Kerberos. We may add others over time, but the SASL implementation will make it much easier for you to add your own without having to implement a new client. Kerberos is quite common, so we'll build that one in first. With additional authentication, we want to take a few steps to separate out the activities authorized to various users: separate read, read/write, security administration, database-specific (compact, validate, etc.), and server/cluster administration (fsync, log rotate, shutdown, create database, etc.). This is just an initial step in our authorization work.
Hash-based sharding: Apply a hash function to a selected key as the shard key. Evenly spread documents in a sharded cluster. Evenly spread the work associated with queries in a sharded cluster. Will minimize migrations (should only happen when growing a cluster). Note: this is something you can do now, but not automatically.
Geospatial index resolution: Talk about the challenge of specifying some polygon and finding overlap with another polygon in a document; this becomes interesting for location-aware applications and the intelligence community.
Replica set flapping: Avoid electing a new primary due to falsely detecting that the current primary went down. Adding mechanisms to reduce false detections. This is good for heavy load and network issues/blips in a data center.
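The hash-based sharding idea in these notes can be illustrated with a minimal Python sketch. It assumes a simple "hash the key, then take it modulo the shard count" placement, which is chosen here for brevity and is not necessarily how MongoDB assigns hashed keys to shards. The function `shard_for`, the choice of MD5, and the four-shard cluster are all hypothetical.

```python
import hashlib
from collections import Counter


def shard_for(key, num_shards):
    """Map a shard-key value to a shard index by hashing it.

    MD5 is used only as a convenient, stable hash; this is an assumption
    of the sketch, not a statement about MongoDB internals.
    """
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then pick a shard.
    return int.from_bytes(digest[:8], "big") % num_shards


# Monotonically increasing keys (for example, sequential user IDs) would all
# land at one end of a range-partitioned key space. Hashing them first
# spreads the keys across the shards instead.
NUM_SHARDS = 4  # hypothetical cluster size
counts = Counter(shard_for(user_id, NUM_SHARDS) for user_id in range(10_000))

for shard in sorted(counts):
    print(f"shard {shard}: {counts[shard]} documents")
```

With sequential keys the per-shard counts come out close to one another, which is the "evenly spread documents" property the notes describe. One design caveat of plain modulo placement is that changing the number of shards remaps most keys, so a real system would need a different assignment scheme to keep migrations small when the cluster grows.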
#34: OK, so here are the presenter's notes. Your first job is to add your name and other useful details so that your students can contact you afterwards. This is a good time to: introduce yourself; create a seating chart; and get each student to say their name, company, and what they want to learn, then write it on your seating chart.