3. Who am I?
Juan Antonio Roy Couto
• MongoDB Master
• Financial Software Developer
• Email: juanroycouto@gmail.com
• Twitter: @juanroycouto
• LinkedIn: https://www.linkedin.com/in/juanroycouto
• Slideshare: slideshare.net/juanroycouto
• Personal site: http://www.juanroy.es
• Contributor at: http://www.mongodbspain.com
MongoDB Overview
4. Agenda
• Basic Concepts
• Data Modelling
• Installation Types
• First Steps & CRUD
• Data Analytics With The Aggregation Framework
• Indexing
• Replica Set
• Sharded Cluster
• How To Scale Your App
• Python Driver Overview
5. Basic Concepts - Concepts
• High Availability
• Data Safety
• Automatic Failover
• Scalability
• Faster development
• Real-time analytics
• Better strategic decisions
• Reduce costs and time to market
8. Basic Concepts - SQL Schema Design
Tables
Customers
• Customer Key
• First Name
• Last Name
Addresses
• Address Key
• Customer Key
• Street
• Number
• Location
Pets
• Pet Key
• Customer Key
• Type
• Breed
• Name
9. Basic Concepts - MongoDB Schema Design
Customers Collection
Customers Info
• First Name
• Last Name
Addresses
• Street
• Number
• Location
Pets
• Pet 1: Type, Breed, Name
• Pet 2: Type, Breed, Name
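To make the contrast between the two schema designs concrete, here is a minimal Python sketch that takes rows shaped like the three SQL tables and folds them into a single embedded customer document. The sample values, the snake_case field names and the `embed` helper are all made up for illustration; they do not come from the slides.

```python
# Hypothetical sample rows, linked by customer_key as in the SQL design.
customers = [{"customer_key": 1, "first_name": "Ana", "last_name": "Lopez"}]
addresses = [
    {"address_key": 10, "customer_key": 1,
     "street": "Gran Via", "number": 12, "location": "Madrid"},
]
pets = [
    {"pet_key": 100, "customer_key": 1,
     "type": "dog", "breed": "beagle", "name": "Rex"},
    {"pet_key": 101, "customer_key": 1,
     "type": "cat", "breed": "siamese", "name": "Misi"},
]


def without(row, *dropped):
    """Copy a row, leaving out the key columns that embedding makes redundant."""
    return {k: v for k, v in row.items() if k not in dropped}


def embed(customer, addresses, pets):
    """Fold every row that references the customer into one document."""
    key = customer["customer_key"]
    return {
        "first_name": customer["first_name"],
        "last_name": customer["last_name"],
        "addresses": [without(a, "address_key", "customer_key")
                      for a in addresses if a["customer_key"] == key],
        "pets": [without(p, "pet_key", "customer_key")
                 for p in pets if p["customer_key"] == key],
    }


customer_doc = embed(customers[0], addresses, pets)
```

The resulting `customer_doc` carries its addresses and pets inside it, so the foreign keys that tie the SQL tables together are no longer needed.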
11. Data Modelling
1:1 Employee-Resume
• Access frequency
• Document size
• Data atomicity
1:N City-Citizen
• Two collections, linked from N to 1
N:N Books-Authors
• Two collections linked via an array
1:Few Post-Comments
• One collection with embedded data
Limits: 16 MB per document
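The relationship patterns listed above can be sketched with plain Python dictionaries standing in for documents. This is only an illustration: the collection contents, the `city_id` and `author_ids` field names and the two helper functions are assumptions, not something shown on the slide.

```python
# 1:N (City-Citizen): two collections, linked from the N side to the 1 side.
cities = [{"_id": "MAD", "name": "Madrid"}]
citizens = [
    {"_id": 1, "name": "Ana", "city_id": "MAD"},
    {"_id": 2, "name": "Luis", "city_id": "MAD"},
]

# N:N (Books-Authors): two collections, linked via an array of ids.
authors = [{"_id": "a1", "name": "Author One"},
           {"_id": "a2", "name": "Author Two"}]
books = [{"_id": "b1", "title": "Book One", "author_ids": ["a1", "a2"]}]

# 1:Few (Post-Comments): one collection with the comments embedded.
posts = [{"_id": "p1", "title": "Hello",
          "comments": [{"user": "ana", "text": "Nice"}]}]


def citizens_of(city_id):
    """Resolve the 1:N link by filtering the N side on its reference."""
    return [c["name"] for c in citizens if c["city_id"] == city_id]


def authors_of(book_id):
    """Resolve the N:N link by following the book's array of author ids."""
    book = next(b for b in books if b["_id"] == book_id)
    names_by_id = {a["_id"]: a["name"] for a in authors}
    return [names_by_id[i] for i in book["author_ids"]]
```

For example, `citizens_of("MAD")` walks the reference from each citizen to its city, while the comments of a post need no lookup at all because they travel with the post.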
21. Replica Set
• High Availability
• Data Safety
• Automatic Node Recovery
• Read Preference
• Write Concern
Diagram: a replica set with one primary and two secondaries.
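Write concern determines how many replica set members must acknowledge a write before it is reported as successful. The sketch below models only that counting rule in plain Python, with no driver or server involved; the function names are hypothetical, and the model treats every member as a voting, data-bearing node, which is a simplification.

```python
def majority(members):
    """Smallest number of members that is more than half of the set."""
    return members // 2 + 1


def write_acknowledged(acks, w, members):
    """Return True if `acks` acknowledgements satisfy write concern `w`.

    `w` is either an integer or the string "majority".
    """
    required = majority(members) if w == "majority" else int(w)
    return acks >= required


# A three-member set: one primary and two secondaries.
MEMBERS = 3
```

With three members, a write acknowledged only by the primary satisfies `w=1` but not `w="majority"`, which needs two acknowledgements.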
22. Sharded Cluster
• Scale out
• Even data distribution across all of the shards, based on a shard key
• A shard key range belongs to only one shard
• More efficient queries (performance)
Diagram: a cluster with Shard 0, Shard 1 and Shard 2; the key ranges shown are A-I, J-Q and R-Z.
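The idea that each shard key range belongs to exactly one shard can be sketched as a sorted lookup. The example assumes the ranges A-I, J-Q and R-Z map to Shard 0, Shard 1 and Shard 2 in that order, and that the shard key is a last name; both are assumptions made for illustration, as is the `shard_for` helper.

```python
from bisect import bisect_right

# Lower bound of each range: A-I, J-Q, R-Z.
LOWER_BOUNDS = ["A", "J", "R"]
# Assumed range-to-shard order; the diagram does not make it explicit.
SHARDS = ["Shard 0", "Shard 1", "Shard 2"]


def shard_for(last_name):
    """Return the shard whose range contains the key's first letter."""
    first = last_name[:1].upper()
    if not ("A" <= first <= "Z"):
        raise ValueError(f"key outside the A-Z ranges: {last_name!r}")
    # bisect_right counts the lower bounds that are <= first,
    # so subtracting one gives the index of the containing range.
    return SHARDS[bisect_right(LOWER_BOUNDS, first) - 1]
```

Because the ranges do not overlap, every key resolves to a single shard, which is what lets a query on the shard key avoid touching the other shards.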
23. Sharded Cluster - Config Servers
• config database
• Metadata:
  • Cluster shards list
  • Data per shard (chunk ranges)
  • ...
• Replica Set
Diagram: a replica set of three config servers.
24. Sharded Cluster - mongos
• Receives client requests and returns results.
• Reads the metadata and sends the query to the necessary shard or shards.
• Does not store data.
• Keeps a cached version of the metadata.
Diagram: the driver talks to mongos; mongos uses three config servers and routes to Shard 0 through Shard N-1, each shown as a replica set with one primary and two secondaries.
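The routing role described above can be sketched as a toy model: a config object holds the chunk metadata, and a router keeps a cached copy that it uses to pick target shards. The `ConfigServer` and `Router` classes, the integer key ranges and the version-comparison refresh are all assumptions for illustration; the real refresh mechanism is more involved than this.

```python
class ConfigServer:
    """Holds the authoritative chunk metadata and a version counter."""

    def __init__(self, chunks):
        self.chunks = list(chunks)  # (low, high, shard), high is exclusive
        self.version = 1

    def move_chunk(self, index, new_shard):
        """Reassign one chunk to another shard and bump the version."""
        low, high, _ = self.chunks[index]
        self.chunks[index] = (low, high, new_shard)
        self.version += 1


class Router:
    """Stores no data; routes using a cached copy of the metadata."""

    def __init__(self, config):
        self.config = config
        self.cached_version = 0
        self.cache = []

    def _refresh_if_stale(self):
        if self.cached_version != self.config.version:
            self.cache = list(self.config.chunks)
            self.cached_version = self.config.version

    def targets(self, key=None):
        """Shards to query: one for a shard-key lookup, all shards otherwise."""
        self._refresh_if_stale()
        if key is None:
            return sorted({shard for _, _, shard in self.cache})
        return [s for low, high, s in self.cache if low <= key < high]


config = ConfigServer([(0, 100, "shard0"), (100, 200, "shard1")])
router = Router(config)
```

A call such as `router.targets(150)` is a targeted query that hits one shard, while `router.targets()` without a shard key has to be sent to every shard.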
25. How To Scale Your App - Shard Key
• Monotonically increasing
• Easily divisible
• Randomness
• Cardinality
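Why monotonic growth and randomness matter for a shard key can be seen in a small simulation. The sketch assumes a hypothetical four-shard cluster split at fixed points and uses an MD5-based function as a deterministic stand-in for a hashed key; none of these names or numbers come from the slides.

```python
import hashlib
from bisect import bisect_right
from collections import Counter

# Hypothetical cluster: four shards splitting the key space at these points.
SPLIT_POINTS = [250, 500, 750]


def shard_index(value):
    """Index of the shard whose range contains `value`."""
    return bisect_right(SPLIT_POINTS, value)


def hashed(value, space=1000):
    """Deterministic stand-in for a hashed shard key, mapped into [0, space)."""
    digest = hashlib.md5(str(value).encode()).hexdigest()
    return int(digest, 16) % space


# Monotonically increasing keys, for example a counter or a timestamp.
new_ids = range(1000, 1100)
by_range = Counter(shard_index(i) for i in new_ids)
by_hash = Counter(shard_index(hashed(i)) for i in new_ids)
```

Every new id is larger than the last split point, so `by_range` puts all 100 inserts on the final shard, whereas `by_hash` spreads the same inserts over several shards.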
26. How To Scale Your App - Sharding a Collection
Diagram: a client connects to mongos, which sits in front of Shard 0, Shard 1, Shard 2 and Shard 3, with migrations between shards.
27. How To Scale Your App - Pre-Splitting
Useful for storing data directly in the shards (massive data loads).
Avoids bottlenecks.
MongoDB does not need to split or migrate chunks during the load.
After the split, the migration must finish before data loading starts.
Diagram: a cluster with Shard 0, Shard 1 and Shard 2 holding Chunks 1 to 5.
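A pre-split plan can be worked out before any data is loaded. The sketch below computes evenly spaced split points for an integer key space and spreads the resulting chunks over the shards in round-robin order. The key space, the round-robin placement and both function names are assumptions for illustration; the placement shown on the slide may differ.

```python
def split_points(low, high, chunks):
    """Evenly spaced boundaries that cut [low, high) into `chunks` ranges."""
    step = (high - low) / chunks
    return [low + round(step * i) for i in range(1, chunks)]


def assign_round_robin(low, high, chunks, shards):
    """Build the chunk ranges and spread them across the shards in turn."""
    bounds = [low] + split_points(low, high, chunks) + [high]
    return [
        {"chunk": i + 1,
         "range": (bounds[i], bounds[i + 1]),
         "shard": shards[i % len(shards)]}
        for i in range(chunks)
    ]


# Five chunks over three shards, matching the counts in the diagram.
layout = assign_round_robin(0, 1000, 5, ["Shard 0", "Shard 1", "Shard 2"])
```

With the chunks already in place on their shards, a bulk load can write to all shards at once instead of funnelling everything through a single initial chunk.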
28. How To Scale Your App - Tag-Aware Sharding
Tags are used when you want to pin ranges to a specific shard.
shard0: EMEA
shard1: APAC
shard2: LATAM
shard3: NORAM
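Pinning ranges to shards through tags can be sketched as two lookups: from a key to its tag, and from the tag to the shards that carry it. The shard-to-tag table follows the slide; the country codes, the `REGION_TAG` table and the `shards_for_region` helper are hypothetical, and discrete country codes stand in for real shard key ranges.

```python
# Tag assigned to each shard, as listed on the slide.
SHARD_TAGS = {
    "shard0": "EMEA",
    "shard1": "APAC",
    "shard2": "LATAM",
    "shard3": "NORAM",
}

# Hypothetical mapping from a country-code shard key prefix to a tag.
REGION_TAG = {"ES": "EMEA", "JP": "APAC", "BR": "LATAM", "US": "NORAM"}


def shards_for_region(region):
    """Shards allowed to hold documents whose key carries the region's tag."""
    tag = REGION_TAG[region]
    return [shard for shard, t in SHARD_TAGS.items() if t == tag]
```

In this model, documents keyed with "ES" can only live on `shard0`, which is the kind of placement guarantee tags are used for.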