Huelva, 22nd April 2016
Juan Antonio Roy Couto
basics
MongoDB Overview
Twitter Hashtag: #UHUMongoDB
Who am I?
Juan Antonio Roy Couto
• MongoDB Master
• Financial Software Developer
• Email: juanroycouto@gmail.com
• Twitter: @juanroycouto
• LinkedIn: https://www.linkedin.com/in/juanroycouto
• SlideShare: slideshare.net/juanroycouto
• Personal site: http://www.juanroy.es
• Contributor at: http://www.mongodbspain.com
Agenda
• Basic Concepts
• Data Modelling
• Installation Types
• First Steps & CRUD
• Data Analytics With The Aggregation Framework
• Indexing
• Replica Set
• Sharded Cluster
• How To Scale Your App
• Python Driver Overview
Basic Concepts - Concepts
• High Availability
• Data Safety
• Automatic Failover
• Scalability
• Faster development
• Real-time analytics
• Better strategic decisions
• Reduce costs and time to market
Basic Concepts - Products
https://www.mongodb.com/products/overview
• Drivers
• Ops & Cloud Manager
• Compass
• Hadoop & Spark connector
• BI connector
• Pluggable Storage Engine API
Basic Concepts - Characteristics
http://www.mongodbspain.com/en/2014/08/17/mongodb-characteristics-future/
• Open Source General Purpose NoSQL Database
• Document Oriented
• Non-Structured Data
• Schemaless
• Security (Authentication & Authorization)
• Document Validation, etc.
Basic Concepts - SQL Schema Design
Tables
Customers
• Customer Key
• First Name
• Last Name
Addresses
• Address Key
• Customer Key
• Street
• Number
• Location
Pets
• Pet Key
• Customer Key
• Type
• Breed
• Name
Basic Concepts - MongoDB Schema Design
Customers Collection
Customers Info
• First Name
• Last Name
Addresses
• Street
• Number
• Location
Pets
• Type
• Breed
• Name
• Type
• Breed
• Name
Basic Concepts - JSON Document
> db.customers.findOne()
{
    "_id" : ObjectId("54131863041cd2e6181156ba"),
    "first_name" : "Peter",
    "last_name" : "Keil",
    "address" : {
        "street" : "C/Alcalá",
        "number" : 123,
        "location" : "Madrid"
    },
    "pets" : [
        {
            "type" : "Dog",
            "breed" : "Airedale Terrier",
            "name" : "Linda"
        },
        {
            "type" : "Dog",
            "breed" : "Akita",
            "name" : "Bruto"
        }
    ]
}
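
As a rough illustration of how the document above looks on the driver side, the sketch below rebuilds it as a plain Python dictionary and reads the embedded fields. It runs without a server. The `_id` is left out because `ObjectId` comes from the `bson` package, and the helper name `pet_names` is hypothetical.

```python
# A customer document as a plain Python dict (no server needed).
# The address is an embedded document and the pets are an embedded array.
customer = {
    "first_name": "Peter",
    "last_name": "Keil",
    "address": {"street": "C/Alcalá", "number": 123, "location": "Madrid"},
    "pets": [
        {"type": "Dog", "breed": "Airedale Terrier", "name": "Linda"},
        {"type": "Dog", "breed": "Akita", "name": "Bruto"},
    ],
}


def pet_names(doc):
    """Return the names of all pets embedded in a customer document."""
    return [pet["name"] for pet in doc.get("pets", [])]


if __name__ == "__main__":
    print(customer["address"]["location"])
    print(pet_names(customer))
```

Everything about one customer is reachable from a single document, which is the point of the embedded design shown on the schema slide.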
Data Modelling
1:1 Employee-Resume
• Access frequency
• Document size
• Data atomicity
1:N City-Citizen
• Two linked collections, from N to 1
N:N Books-Authors
• Two collections linked via array
1:Few Post-Comments
• One collection with embedded data
Limits: 16MB/doc
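
The three linking styles listed above can be sketched with plain Python dictionaries standing in for documents. All sample data, field names (`city_id`, `author_ids`) and helper functions are hypothetical and chosen only for illustration; a real application would store these in collections.

```python
# 1:Few (Post-Comments): comments are embedded in the post document.
post = {
    "_id": 1,
    "title": "Hello",
    "comments": [{"author": "ana", "text": "Nice"}],
}

# 1:N (City-Citizen): linked from N to 1, so each citizen stores its city's _id.
cities = [{"_id": "huelva", "name": "Huelva"}]
citizens = [
    {"_id": 1, "name": "Juan", "city_id": "huelva"},
    {"_id": 2, "name": "María", "city_id": "huelva"},
]

# N:N (Books-Authors): each book keeps an array of author _ids.
authors = [
    {"_id": "a1", "name": "Author One"},
    {"_id": "a2", "name": "Author Two"},
]
books = [{"_id": "b1", "title": "Some Book", "author_ids": ["a1", "a2"]}]


def citizens_of(city_id):
    """Resolve the 1:N link by filtering citizens on their city reference."""
    return [c["name"] for c in citizens if c["city_id"] == city_id]


def authors_of(book):
    """Resolve the N:N link by looking up each _id in the book's array."""
    by_id = {a["_id"]: a["name"] for a in authors}
    return [by_id[author_id] for author_id in book["author_ids"]]
```

Embedding keeps related data in one read, while references keep documents small, which matters given the 16MB limit per document.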
Installation Types - Standalone
[Diagram: three clients, each with its own driver, and a single MongoDB instance]
Installation Types - Replica Set
[Diagram: three clients, each with its own driver, and a replica set with one primary and two secondaries]
Installation Types - Sharded Cluster
[Diagram: three clients with drivers, three mongos routers, three config servers, and Shard 0 to Shard N-1, each shard a replica set with one primary and two secondaries]
First Steps & CRUD
• Find
• Insert
• Bulk inserts
• Massive Data Load
• Update
• Remove
Data Analytics with the Aggregation Framework
Data Analytics Tools
• Internal
  • Aggregation Framework
  • Map Reduce
• External
  • Spark
  • Hadoop
  • Tableau (BI)
  • ...
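
To give a feel for the Aggregation Framework, here is a minimal sketch of a pipeline that counts pets per customer location, based on the customer document shape shown earlier in the deck. The pipeline is plain data; `pets_per_location` assumes it receives a PyMongo collection (anything with an `aggregate` method), and both function names are hypothetical.

```python
def pets_per_location_pipeline():
    """Build an aggregation pipeline as a list of stage documents."""
    return [
        # One output document per element of the embedded "pets" array.
        {"$unwind": "$pets"},
        # Group by the embedded address location and count the pets.
        {"$group": {"_id": "$address.location", "pets": {"$sum": 1}}},
        # Locations with the most pets first.
        {"$sort": {"pets": -1}},
    ]


def pets_per_location(collection):
    """Run the pipeline; `collection` is assumed to be a PyMongo collection."""
    return list(collection.aggregate(pets_per_location_pipeline()))
```

With a live connection this could be called as `pets_per_location(db.customers)`; each stage feeds its output to the next, much like a Unix pipe.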
Indexing - Types
• _id
• Single
• Compound
• Multikey
• Full Text
• GeoSpatial
• Hashed
Indexing - Properties
• Unique
• Sparse
• TTL
• Partial
Indexing - Improving Your Queries
.explain()
• queryPlanner
• executionStats
• allPlansExecution
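
Several of the index types and properties above can be expressed as arguments to PyMongo's `create_index`. The sketch below keeps the specifications as data and applies them to whatever collection it is given. The field names (`email`, `created_at`, and so on) and the helper `ensure_indexes` are assumptions made for illustration.

```python
# Each entry is (keys, options). Keys are (field, direction) pairs, where
# 1 means ascending and "hashed" requests a hashed index.
INDEX_SPECS = [
    ([("lastname", 1)], {}),                              # single field
    ([("lastname", 1), ("firstname", 1)], {}),            # compound
    ([("pets.name", 1)], {}),                             # multikey (array field)
    ([("city", "hashed")], {}),                           # hashed
    ([("email", 1)], {"unique": True, "sparse": True}),   # unique + sparse
    ([("created_at", 1)], {"expireAfterSeconds": 3600}),  # TTL: expire after 1 hour
    ([("city", 1)],
     {"partialFilterExpression": {"city": {"$exists": True}}}),  # partial
]


def ensure_indexes(collection, specs=INDEX_SPECS):
    """Create every index in `specs`; returns whatever create_index returns."""
    return [collection.create_index(keys, **options) for keys, options in specs]
```

After creating indexes, calling `.explain()` on a `find()` cursor is one way to check whether a query actually uses them.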
Replica Set
• High Availability
• Data Safety
• Automatic Node Recovery
• Read Preference
• Write Concern
[Diagram: replica set with one primary and two secondaries]
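
Read preference and write concern are commonly set through connection string options. The sketch below builds such a URI using the standard `replicaSet`, `readPreference` and `w` options. The host names, the replica set name `rs0` and the helper `replica_set_uri` are hypothetical.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, replica_set, read_preference="primary", w="majority"):
    """Build a MongoDB connection string for a replica set.

    hosts: list of "host:port" strings for the replica set members.
    read_preference: where reads may go (for example "secondaryPreferred").
    w: write concern, i.e. how many members must acknowledge a write.
    """
    options = urlencode(
        {"replicaSet": replica_set, "readPreference": read_preference, "w": w}
    )
    return "mongodb://{}/?{}".format(",".join(hosts), options)


if __name__ == "__main__":
    print(replica_set_uri(
        ["db1.example.net:27017", "db2.example.net:27017", "db3.example.net:27017"],
        "rs0",
        read_preference="secondaryPreferred",
    ))
```

The resulting string can be handed to `MongoClient`; listing several members lets the driver find the current primary after a failover.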
Sharded Cluster
• Scale out
• Even data distribution across all of the shards, based on a shard key
• A shard key range belongs to only one shard
• More efficient queries (performance)
[Diagram: cluster with three shards; Shard 0 holds A-I, Shard 1 holds J-Q, Shard 2 holds R-Z]
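
The rule that a shard key range belongs to only one shard can be sketched as a sorted-boundary lookup. This is a simplified model of the A-I, J-Q, R-Z example, not how the server is implemented, and the function name `shard_for` is hypothetical.

```python
import bisect

# Inclusive lower bound of the range held by each shard: A-I, J-Q, R-Z.
LOWER_BOUNDS = ["A", "J", "R"]
SHARDS = ["Shard 0", "Shard 1", "Shard 2"]


def shard_for(last_name):
    """Return the one shard whose range contains the key's first letter."""
    key = last_name[:1].upper()
    # bisect_right finds how many lower bounds are <= key.
    index = bisect.bisect_right(LOWER_BOUNDS, key) - 1
    if index < 0:
        raise ValueError("key {!r} is below the first range".format(last_name))
    return SHARDS[index]
```

Because each key maps to exactly one range, a query that includes the shard key only has to visit one shard.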
Sharded Cluster - Config Servers
• config database
• Metadata:
  • Cluster shards list
  • Data per shard (chunk ranges)
  • ...
• Replica Set
[Diagram: replica set of three config servers]
Sharded Cluster - mongos
• Receives client requests and returns results.
• Reads the metadata and sends the query to the necessary shard or shards.
• Does not store data.
• Keeps a cached version of the metadata.
[Diagram: driver, mongos, three config servers, and Shard 0 to Shard N-1, each shard a replica set with one primary and two secondaries]
How To Scale Your App - Shard Key
• Monotonically Increasing
• Easily divisible
• Randomness
• Cardinality
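
Why a monotonically increasing shard key is a concern can be shown with a small simulation. With range-based chunks, every new key lands in the last range, while a hashed key spreads writes out. This is a simplified model: the modulo-based `hashed_shard` only stands in for hashed sharding, and all names and boundary values are hypothetical.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4
# Upper bounds of the ranges on shards 0-2; shard 3 holds everything above.
BOUNDARIES = [1000, 2000, 3000]


def range_shard(key, boundaries=BOUNDARIES):
    """Range sharding: pick the first range whose upper bound exceeds the key."""
    for shard, upper in enumerate(boundaries):
        if key < upper:
            return shard
    return len(boundaries)


def hashed_shard(key):
    """Stand-in for hashed sharding: hash the key, then pick a shard."""
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS


def distribution(keys, route):
    """Count how many keys each shard would receive."""
    return Counter(route(key) for key in keys)


# New documents arrive with ever-increasing keys (for example, a counter).
new_keys = range(3000, 4000)
by_range = distribution(new_keys, range_shard)
by_hash = distribution(new_keys, hashed_shard)
```

In this model `by_range` shows a single shard receiving all 1000 inserts, while `by_hash` spreads them over the four shards.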
How To Scale Your App - Sharding a Collection
[Diagram: client, mongos, and Shard 0 to Shard 3, with chunk migrations]
How To Scale Your App - Pre-Splitting
Useful for storing data directly in the shards (massive data loads).
Avoids bottlenecks.
MongoDB does not need to split or migrate chunks during the load.
After the split, the migration must be finished before data loading.
[Diagram: cluster with Shard 0, Shard 1, and Shard 2 holding Chunk 1 to Chunk 5]
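
Pre-splitting comes down to choosing split points before the load starts. The sketch below computes evenly spaced split points for an integer shard key and builds one `split` admin command document per point, using the command's `middle` field. It assumes keys are uniformly distributed; the field name `customer_id` and both helper names are hypothetical, and nothing here talks to a server.

```python
def split_points(low, high, chunks):
    """Return chunks-1 evenly spaced split points inside [low, high)."""
    if chunks < 2:
        return []
    step = (high - low) / chunks
    return [low + round(step * i) for i in range(1, chunks)]


def split_commands(namespace, field, points):
    """Build one `split` admin command document per split point."""
    return [{"split": namespace, "middle": {field: point}} for point in points]


if __name__ == "__main__":
    # Five chunks, as in the diagram above: four split points.
    for command in split_commands("test.customers", "customer_id",
                                  split_points(0, 1000, 5)):
        print(command)
```

Each document could then be sent through the driver's admin command interface, after which the chunks are moved to their target shards before the load begins.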
How To Scale Your App - Tag-Aware Sharding
Tags are used when you want to pin ranges to a specific shard.
• shard0: EMEA
• shard1: APAC
• shard2: LATAM
• shard3: NORAM
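
The pinning described above can be modelled as two lookups: shard key range to tag, then tag to shard. In this sketch the shard-to-tag mapping comes from the slide, while the tag ranges (a shard key that starts with a country code) and the function `shards_for` are assumptions made for illustration.

```python
# Shard tags as shown on the slide.
SHARD_TAGS = {"shard0": "EMEA", "shard1": "APAC", "shard2": "LATAM", "shard3": "NORAM"}

# Hypothetical tag ranges on a shard key that starts with a country code.
# Each tuple is (inclusive lower bound, exclusive upper bound, tag).
TAG_RANGES = [
    ("AU", "AV", "APAC"),
    ("BR", "BS", "LATAM"),
    ("ES", "ET", "EMEA"),
    ("US", "UT", "NORAM"),
]


def shards_for(country_code):
    """Return the shards a key is pinned to, or [] if no tag range covers it."""
    tags = {tag for low, high, tag in TAG_RANGES if low <= country_code < high}
    return sorted(shard for shard, tag in SHARD_TAGS.items() if tag in tags)
```

A key outside every tag range is not pinned, so in this model it returns an empty list, meaning any shard may hold it.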
Python Driver - Overview
script1.py
# Connect to a local mongod and print one customer's first name
from pymongo import MongoClient
connection = MongoClient('localhost', 27017)
db = connection.test
customers = db.customers
# PyMongo uses find_one(), not the shell's findOne()
item = customers.find_one()
print(item['firstname'])
$ python script1.py
Python Driver - CRUD
Operation   PyMongo       Server
Finding     find          find
            find_one      findOne
Inserting   insert_one    insert
            insert_many   bulk
Updating    update_one    update
            update_many   update
            replace_one   update
Deleting    delete_one    remove
            delete_many   remove
Python Driver - CRUD Examples
Insert
pedro = { 'firstname': 'Pedro', 'lastname': 'García' }
maria = { 'firstname': 'María', 'lastname': 'Pérez' }
docs = [ pedro, maria ]
# insert_many takes the list of documents itself, not a list wrapped in a list
customers.insert_many(docs)
Update
# update operators such as $set must be quoted strings in Python
customers.update_one({ '_id': customer_id },
                     { '$set': { 'city': 'Huelva' } })
Remove
customers.delete_one({ '_id': customer_id })
Python Driver - Cursors And Exceptions
from pymongo import MongoClient
connection = MongoClient('localhost', 27017)
db = connection.test
customers = db.customers
query = { 'firstname': 'Juan' }
projection = { 'city': 1, '_id': 0 }
try:
    cursor = customers.find(query, projection)
    # find() is lazy: the query is sent when the cursor is iterated,
    # so the loop belongs inside the try block
    for doc in cursor:
        print(doc['city'])
except Exception as e:
    print('Unexpected error:', type(e), e)
MongoDB Basics
Resources
• Official MongoDB Documentation
  • https://docs.mongodb.org/manual/
• Posts via MongoDB Spain
  • http://www.mongodbspain.com/en/
  • http://www.mongodbspain.com/es/
• Cheat Sheet
  • http://www.mongodbspain.com/es/2014/03/23/mongodb-cheat-sheet-quick-reference/
• The Little MongoDB Book
  • http://openmymind.net/mongodb.pdf
Questions?
Thank you for your attention!
MongoDB Workshop
Huelva, 22nd April 2016
Juan Antonio Roy Couto

MongoDB Workshop Universidad de Huelva
