Best Practices for Migrating
From RDBMS to MongoDB
Sheeri Cabral, Product Manager, Distributed Systems
Safe Harbor Statement
This presentation contains forward-looking statements within the meaning of Section 27A of the
Securities Act of 1933, as amended, and Section 21E of the Securities Exchange Act of 1934, as
amended. Such forward-looking statements are subject to a number of risks, uncertainties, assumptions
and other factors that could cause actual results and the timing of certain events to differ materially from
future results expressed or implied by the forward-looking statements. Factors that could cause or
contribute to such differences include, but are not limited to, those identified in our filings with the Securities
and Exchange Commission. You should not rely upon forward-looking statements as predictions of future
events. Furthermore, such forward-looking statements speak only as of the date of this presentation.
In particular, the development, release, and timing of any features or functionality described for MongoDB
products remains at MongoDB's sole discretion. This information is merely intended to outline our general
product direction and it should not be relied on in making a purchasing decision nor is this a commitment,
promise or legal obligation to deliver any material, code, or functionality. Except as required by law, we
undertake no obligation to update any forward-looking statements to reflect events or circumstances after
the date of such statements.
Agenda
Normalization and MongoDB
Schema Design and Performance
Seamless no-downtime Migration
Q&A
60 minutes
Who am I?
Sysadmin for 4 years
MySQL DBA for 14 years
Master's in Computer Science
RDBMS = Relational Database Management System
Relation = Table
table ~ collection
row ~ document
What problems
does normalization solve?
Duplicate data leads to
data integrity
problems when doing updates
Hard to update a
multi-value data cell
Duplicate data
wastes resources
What problems
does normalization cause?
Joins are expensive
Transactions
(ACID compliance) more difficult
Migrations are not convenient
Data that is accessed together
should be stored together
[Diagram: users and articles collections]
// Get the user object
> user = db.user.findOne({username: "sheeri"});
// Get all the articles linked to the user
// (assumes the user document stores its article _ids in an array field, user.articles)
> myArticles = db.articles.find({_id: {$in: user.articles}})
Model the objects that your
application uses
[Diagram: users and articles, with an extended reference]
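One way to read the extended reference idea, as a minimal sketch: the article keeps the author's _id plus a copy of the few author fields that are read together with the article, so displaying an article needs no second lookup. The field names (`author`, `displayName`, `bio`) and the helper `withExtendedAuthor` are hypothetical and do not come from the slides.

```javascript
// Hypothetical user document; field names are assumptions for illustration.
const user = {
  _id: 1,
  username: "sheeri",
  displayName: "Sheeri Cabral",
  bio: "Product Manager, Distributed Systems",
};

// Build an article that carries an extended reference to its author:
// the _id (for lookups) plus only the fields shown alongside the article.
function withExtendedAuthor(article, author) {
  return {
    ...article,
    author: { _id: author._id, displayName: author.displayName },
  };
}

const article = withExtendedAuthor(
  { _id: 100, title: "Thinking in Documents" },
  user
);
console.log(article.author.displayName);
```

The trade-off is that the copied fields are duplicated data: if `displayName` changes, the copies in articles need updating too, so this fits fields that rarely change.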
[Diagram: Relational vs. MongoDB]
Data that is accessed together
should be stored together
Thinking in Documents
https://www.mongodb.com/blog/post/thinking-documents-part-1
6 Rules of Thumb for MongoDB Schema Design
https://www.mongodb.com/blog/post/6-rules-of-thumb-for-mongodb-schema-design-part-1
[Diagram: indexes, Relational vs. MongoDB]
What about indexes?
simple = single field
compound = multiple fields
multi-key = index for arrays and nested arrays
Unique or non-unique
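The index types listed above could be declared roughly as follows. This is a sketch under assumptions: the field names (`authorId`, `publishedAt`, `tags`, `slug`) and the helper `createIndexes` are hypothetical; the only MongoDB call used is `createIndex(keys, options)` on a collection handle.

```javascript
// Index definitions in the (keys, options) shape that
// db.collection.createIndex(keys, options) accepts.
// All field names here are hypothetical.
const articleIndexes = [
  { keys: { authorId: 1 }, options: {} },                  // simple: single field
  { keys: { authorId: 1, publishedAt: -1 }, options: {} }, // compound: multiple fields
  { keys: { tags: 1 }, options: {} },                      // multi-key when tags holds an array
  { keys: { slug: 1 }, options: { unique: true } },        // unique
];

// Apply each definition to a collection handle, e.g. db.articles in mongosh.
function createIndexes(collection, indexes) {
  return indexes.map(({ keys, options }) => collection.createIndex(keys, options));
}
```

In mongosh this would be called as `createIndexes(db.articles, articleIndexes)`. No special syntax is needed for the multi-key case: MongoDB treats an index as multi-key when the indexed field contains an array.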
What about structure?
schema validation
require fields
specify data types
including enumerated lists
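A validator covering the three points above (required fields, data types, enumerated lists) might look like this sketch. The collection name `articles` and every field name are assumptions for illustration; `$jsonSchema`, `bsonType`, `required`, `properties` and `enum` are the standard schema validation keywords.

```javascript
// Hypothetical validator for an articles collection.
const articleValidator = {
  $jsonSchema: {
    bsonType: "object",
    // require fields
    required: ["title", "authorId", "status"],
    properties: {
      // specify data types
      title: { bsonType: "string" },
      authorId: { bsonType: "objectId" },
      publishedAt: { bsonType: "date" },
      // enumerated list of allowed values
      status: { enum: ["draft", "published", "archived"] },
    },
  },
};

// In mongosh the validator would be attached when creating the collection:
//   db.createCollection("articles", { validator: articleValidator })
```

Fields that are not listed under `properties` are still allowed, so documents keep their flexible shape while the listed fields are checked.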
What about foreign keys?
Do you really need them?
App validates from db lookups
Why validate again?
How does your app handle failures?
What about foreign keys?
embed for parent/child
schema validation and
enum for specific
values
reference
What about transactions?
Atomicity
succeeds or fails completely
Consistency
db from one valid state to another
Isolation
how/when changes are seen by ops
Durability
completion is forever
What about transactions?
MongoDB has transactions
across documents, collections, shards, etc.
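A multi-document transaction in mongosh could be sketched as below, assuming a session obtained with `db.getMongo().startSession()`. The database name `blog`, the collections, the field names and the function `reassignArticle` are all hypothetical. The sketch leaves out retry handling for transient errors, which a production version would need.

```javascript
// Move an article from one user to another so that all three updates
// succeed or fail together. Names are assumptions for illustration.
function reassignArticle(session, articleId, fromUserId, toUserId) {
  const blog = session.getDatabase("blog"); // hypothetical database name
  session.startTransaction();
  try {
    blog.articles.updateOne({ _id: articleId }, { $set: { authorId: toUserId } });
    blog.users.updateOne({ _id: fromUserId }, { $pull: { articles: articleId } });
    blog.users.updateOne({ _id: toUserId }, { $push: { articles: articleId } });
    session.commitTransaction();
    return true;
  } catch (err) {
    // Roll back every change made inside the transaction, then re-raise.
    session.abortTransaction();
    throw err;
  }
}
```

A caller would create the session, run `reassignArticle(session, articleId, fromUserId, toUserId)`, and finish with `session.endSession()`.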
What about transactions?
Lots of transactions?
Rethink your schema
Data that is accessed together
should be stored together
No downtime
seamless migrations
Change strings to dates
Code application to handle strings and dates
New data stored as dates
Update documents one at a time
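The string-to-date steps above can be sketched in application code like this. The field name `publishedAt` and the helpers `asDate` and `migrateDocument` are hypothetical; the idea is only that readers accept both representations while documents are converted one at a time.

```javascript
// Accept either representation while the migration is in progress.
function asDate(value) {
  if (value instanceof Date) return value;
  const parsed = new Date(value);
  if (Number.isNaN(parsed.getTime())) {
    throw new Error(`Unparseable date: ${value}`);
  }
  return parsed;
}

// Convert one document; already-migrated documents pass through unchanged.
function migrateDocument(doc) {
  if (doc.publishedAt instanceof Date) return doc;
  return { ...doc, publishedAt: asDate(doc.publishedAt) };
}

// In mongosh, the remaining string values could be converted one at a time:
//   db.articles.find({ publishedAt: { $type: "string" } }).forEach(doc =>
//     db.articles.updateOne(
//       { _id: doc._id },
//       { $set: { publishedAt: asDate(doc.publishedAt) } }))
```

Because the application already tolerates both types, the conversion can run in the background at whatever pace the cluster can absorb, with no downtime window.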
16MB document size limit
Hot documents
Activity hot spots
Embed = fast access
Large docs use more memory
[Diagram: articles and comments (subset pattern)]
[Diagram: articles and overflow_comments (outlier pattern)]
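The subset and outlier ideas can be sketched with plain arrays standing in for the `comments` and `overflow_comments` collections. Everything here is an assumption for illustration: the field names (`recentComments`, `commentCount`, `hasOverflow`), the limits, and both helper functions.

```javascript
// Subset pattern: every comment goes to the comments collection, and the
// article embeds only the newest `limit` comments for fast access.
function addRecentComment(article, allComments, comment, limit) {
  allComments.push({ ...comment, articleId: article._id });
  const recentComments = [comment, ...article.recentComments].slice(0, limit);
  return { ...article, recentComments, commentCount: article.commentCount + 1 };
}

// Outlier pattern: embed comments until a cap is reached, then spill the
// extras into an overflow collection and flag the article.
function addCommentWithOverflow(article, overflowComments, comment, cap) {
  if (article.comments.length < cap) {
    return { ...article, comments: [...article.comments, comment] };
  }
  overflowComments.push({ ...comment, articleId: article._id });
  return { ...article, hasOverflow: true };
}
```

With the subset version every article looks the same and the embedded comments are duplicates of rows in the full collection. With the outlier version most articles hold all their comments, and only the few flagged with `hasOverflow` need a second read.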
Building a MongoDB schema
Embed if you can (1:few)
Array of references for separate data (1:many)
Reference for unbounded arrays (1:zillion)
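The three relationship sizes might map to document shapes like these. This is a sketch with hypothetical documents and field names (`socialLinks`, `articles`, `pageViews`, `articleId`); arrays stand in for collections so the example runs on its own.

```javascript
// 1:few, embed: a user has a handful of links, stored inside the user.
const user = {
  _id: "u1",
  username: "sheeri",
  socialLinks: [{ site: "blog", url: "https://example.com" }],
  // 1:many, array of references: the user lists the _ids of its articles.
  articles: ["a1", "a2"],
};

const articles = [
  { _id: "a1", title: "Thinking in Documents" },
  { _id: "a2", title: "Schema Patterns" },
  { _id: "a3", title: "Written by someone else" },
];

// 1:zillion, reference from the child: each page view points at its
// article, and the article holds no ever-growing array of views.
const pageViews = [
  { articleId: "a1", at: new Date(0) },
  { articleId: "a1", at: new Date(1000) },
  { articleId: "a2", at: new Date(2000) },
];

// Resolve the 1:many side from the parent's array of references.
function articlesFor(owner, allArticles) {
  return allArticles.filter((a) => owner.articles.includes(a._id));
}

// Resolve the 1:zillion side by querying the children for the parent's _id.
function viewsFor(articleId, views) {
  return views.filter((v) => v.articleId === articleId);
}
```

Which side holds the reference is the design choice: a bounded list can live on the parent, while an unbounded one is safer on the child, where it cannot push the parent toward the document size limit.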
Schema Patterns
Polymorphic: flexible schema
Extended reference: not just _id
Subset: part of data is duplicated by embedding
Outlier: a few documents will overflow
Building with Patterns blog series:
https://www.mongodb.com/blog/post/building-with-patterns-a-summary
From RDBMS to MongoDB
Documents do not need to have identical fields
Data that is accessed together should be stored together
Rethink if you have lots of references or transactions
Credit, Thanks and Links
Asya Kamsky
Evin Roesle
Nick Larew
Aly Cabral
Wikipedia
Thinking in Documents
https://www.mongodb.com/blog/post/thinking-documents-part-1
6 Rules of Thumb for MongoDB Schema Design
https://www.mongodb.com/blog/post/6-rules-of-thumb-for-mongodb-schema-design-part-1
Building with Patterns blog series:
https://www.mongodb.com/blog/post/building-with-patterns-a-summary
Q&A
