Schema Design


by Alex Litvinok




Basic unit of data: the document

What is a document?

• BSON document
• Embedding
• Links across documents

Example

event = {
    _id:   ObjectId("47cc67093475061e3d95369d"),
    name:  "MeetUP #2",
    date:  ISODate("2012-04-05T19:00:00"),
    where: {
        city:    "Minsk",
        address: "Nezavisimosti, 186"
    }
}

RDBMS? @#$.? NoSQL!
Relational DB    Document DB
Database         Database
Table            Collection
Row(s)           Document
Index            Index
Join             Embedding and links
Partition        Shard
Partition key    Shard key

Why?

• Make queries easy and fast
• Facilitate sharding and atomicity

Strategy

• Start with a normalized model
• Embed documents for simplicity and optimization




Normalized? Denormalized?

Normalized schema

Order = {
    _id:   orderId,
    user:  userInfo,
    items: [            // links to documents in the Product collection
        productId1,
        productId2,
        productId3
    ]
}

Product = {
    _id:   productId,
    name:  name,
    price: price,
    desc:  description
}

Diagram:
    Order:   _id, user, items*
    Product: _id, name, price, desc
    * Link to the collection of products

Normalized schema

• Normalized documents are a perfectly acceptable way to use MongoDB.
• Normalized documents provide maximum flexibility.

Links across documents


DBRef
{ $ref : <collname>, $id : <idvalue>[, $db : <dbname>] }

Or simply store the referenced document's _id.
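
A minimal sketch of following both kinds of link on the application side, in plain JavaScript with an in-memory Map standing in for a collection. The collection name, the sample documents, and the `resolveLink` helper are hypothetical and not part of any MongoDB driver; a real application would run a second query by _id where this sketch does a Map lookup.

```javascript
// In-memory stand-in for a "products" collection (hypothetical sample data).
const collections = {
  products: new Map([
    ["p1", { _id: "p1", name: "Ticket", price: 10 }],
    ["p2", { _id: "p2", name: "T-shirt", price: 25 }],
  ]),
};

// An order that links to products in both styles.
const order = {
  _id: "o1",
  items: [
    "p1",                            // plain _id: the collection is implied
    { $ref: "products", $id: "p2" }, // DBRef-shaped link: names its collection
  ],
};

// Resolve one link. A plain _id needs a default collection name;
// a DBRef-shaped object carries the collection name in $ref.
function resolveLink(link, defaultCollection) {
  const isDbRef = typeof link === "object" && link !== null && "$ref" in link;
  const collName = isDbRef ? link.$ref : defaultCollection;
  const id = isDbRef ? link.$id : link;
  const coll = collections[collName];
  const doc = coll ? coll.get(id) : undefined;
  return doc === undefined ? null : doc; // null when the link is dangling
}

const resolvedItems = order.items.map((link) => resolveLink(link, "products"));
console.log(resolvedItems.map((p) => p.name));
```

The plain _id form is shorter but relies on the application knowing which collection to look in; the DBRef form spells that out in the link itself.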

Denormalized schema

Order = {
    _id:   orderId,
    user:  userInfo,
    items: [ {
        _id:   productId1,
        name:  name1,
        price: price1
    }, {
        _id:   productId2,
        name:  name2,
        price: price2
    } ]
}

Diagram:
    Order: _id, user, items
        each item: _id, name, price

Denormalized schema

• Embedded documents are good for fast queries.
• Embedded documents are always available with the parent document.
• Embedded and nested documents are good for storing complex hierarchies.
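
One way to picture the trade-off is a small plain-JavaScript sketch that copies product fields into an order, so the order can be read without a second lookup. The `embedItems` helper and the sample data are hypothetical. Note that the copied name and price become a snapshot that does not follow later changes to the product.

```javascript
// Hypothetical sample data: a product "collection" and a normalized order.
const products = new Map([
  ["p1", { _id: "p1", name: "Ticket", price: 10, desc: "Entry ticket" }],
  ["p2", { _id: "p2", name: "T-shirt", price: 25, desc: "Event T-shirt" }],
]);
const normalizedOrder = { _id: "o1", user: "alex", items: ["p1", "p2"] };

// Build the denormalized form: each linked product id is replaced by an
// embedded copy of the fields the order needs (_id, name, price).
function embedItems(order, productsById) {
  return {
    ...order,
    items: order.items.map((id) => {
      const p = productsById.get(id);
      if (!p) throw new Error("unknown product id: " + id);
      return { _id: p._id, name: p.name, price: p.price };
    }),
  };
}

const denormalizedOrder = embedItems(normalizedOrder, products);

// The total can now be computed from the order document alone.
const total = denormalizedOrder.items.reduce((sum, item) => sum + item.price, 0);
console.log(total); // 35
```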

Embedding documents

{
    title: "Contributors",
    data: [
        { name: "Grover" },
        { name: "James", surname: "Madison" },
        { surname: "Grant" }
    ]
}




..fast queries

Indexes

Basics
> db.collection.ensureIndex({ name: 1 });

Indexing on embedded fields (the dotted key must be quoted)
> db.collection.ensureIndex({ "location.city": 1 });

Compound keys
> db.collection.ensureIndex({ name: 1, age: -1 });

Also indexes..

The _id index
•   Created automatically, except for capped collections
•   Special index that cannot be deleted
•   Enforces uniqueness of its keys

Indexing array elements
•   Each element of the array is indexed

Compound keys
•   Each key has a direction (1 for ascending, -1 for descending)
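
To illustrate what the direction means, here is a sketch in plain JavaScript of the ordering that a compound key such as { name: 1, age: -1 } describes: ascending by name, then descending by age among equal names. The `compareByKeys` helper and the sample documents are hypothetical and only model the ordering; they do not show how MongoDB stores an index.

```javascript
// Build a comparator from a key specification like { name: 1, age: -1 }.
// 1 sorts a field ascending, -1 descending; later keys break ties.
function compareByKeys(spec) {
  const keys = Object.entries(spec);
  return (a, b) => {
    for (const [field, direction] of keys) {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
    }
    return 0;
  };
}

const people = [
  { name: "Bob", age: 30 },
  { name: "Ann", age: 25 },
  { name: "Ann", age: 40 },
];

// Copy before sorting so the original array is left untouched.
const ordered = [...people].sort(compareByKeys({ name: 1, age: -1 }));
console.log(ordered.map((p) => p.name + ":" + p.age));
// [ 'Ann:40', 'Ann:25', 'Bob:30' ]
```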

Again indexes...

Create options
sparse, unique, dropDups, background, v

Geospatial Indexing
> db.places.ensureIndex( { loc : "2d" } )
> db.places.ensureIndex( { loc : "2d" } , { min : -500 , max : 500 } )
> db.places.ensureIndex( { loc : "2d" } , { bits : 26 } )




Analysis and Optimization
Profiler | Explain

Database Profiler

Profiling level
• 0 - off
• 1 - log slow operations (by default, >100 ms is considered slow)
• 2 - log all operations


> db.setProfilingLevel(2);

Database Profiler

Viewing the data: the system.profile collection
> db.system.profile.find()


{ "ts" : "Thu Jan 29 2009 15:19:32 GMT-0500 (EST)",
  "info" : "query test.$cmd ntoreturn:1 reslen:66 nscanned:0 query: { profile: 2 } nreturned:1 bytes:50",
  "millis" : 0 }

Explain
> db.collection.find(  ).explain()
{
    cursor : "BasicCursor",
    indexBounds : [ ],
    nscanned : 57594,
    nscannedObjects : 57594,
    nYields : 2 ,
    n:3,
    millis : 108,
    indexOnly : false,
    isMultiKey : false,
    nChunkSkips : 0
}
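
A rough way to read this output is to compare how many documents were scanned with how many were returned. The sketch below does that arithmetic in plain JavaScript on a copy of the fields above. The `summarizeExplain` helper is hypothetical, and treating "BasicCursor" as a sign that no index was used applies to the legacy explain format shown on this slide.

```javascript
// Fields copied from the explain() output above.
const explainOutput = {
  cursor: "BasicCursor",
  nscanned: 57594,
  n: 3,
  millis: 108,
};

// Summarize how much work the query did per returned document.
function summarizeExplain(e) {
  return {
    usedIndex: e.cursor !== "BasicCursor", // BasicCursor: full collection scan
    scannedPerReturned: e.n > 0 ? e.nscanned / e.n : Infinity,
  };
}

const summary = summarizeExplain(explainOutput);
console.log(summary); // { usedIndex: false, scannedPerReturned: 19198 }
```

Scanning about 19,000 documents for each one returned, with no index in use, suggests the query is a candidate for an index on the queried field.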




From theory to action..

Seating plan

{
    _id: ObjectId,
    event_id: ObjectId,
    seats: {
        A1: 1,
        A2: 1,
        A3: 0,
        ...
        H30: 0
    }
}

Seating plan

{
    _id: {
        event_id: ObjectId,
        seat: "C9"
    },
    updated: new Date(),
    state: "AVAILABLE"
}
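
The two seating-plan models can be related with a short plain-JavaScript sketch that expands the single-document form into one document per seat. The `toSeatDocuments` helper, the string ids, the reading of 1 as available and 0 as taken, and the "TAKEN" state name are all assumptions made for illustration; the slides do not define them.

```javascript
// First model: one document per event, with a flag per seat (sample data).
const plan = {
  _id: "plan1",
  event_id: "event1",
  seats: { A1: 1, A2: 1, A3: 0 },
};

// Second model: one document per seat, keyed by a compound _id.
// Assumption: flag 1 means available, 0 means taken.
function toSeatDocuments(planDoc, now) {
  return Object.entries(planDoc.seats).map(([seat, flag]) => ({
    _id: { event_id: planDoc.event_id, seat },
    updated: now,
    state: flag === 1 ? "AVAILABLE" : "TAKEN",
  }));
}

const seatDocs = toSeatDocuments(plan, new Date());
console.log(seatDocs.length);      // 3
console.log(seatDocs[2]._id.seat); // A3
console.log(seatDocs[2].state);    // TAKEN
```

With one document per seat, a state change touches only a small document, and the compound _id keeps each event and seat pair unique.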

Feed reader



• Users
• Feed
• Entries

Feed reader

Storing users
{
    _id: ObjectId,
    name: username,
    feeds: [
        ObjectId,
        ObjectId,
        ...
    ]
}

Feed reader
Storing feeds
{
    _id: ObjectId,
    url: "http://bbc.com/news/feed",
    name: "BBC News",
    latest: ISODate("2012-01-10T12:30:13Z"),
    entries: [ {
        latest: ISODate("2012-01-10T12:30:13Z"),
        title: "Bomb kills Somali sport officials",
        description: "..."
    } ]
}

Some tips

1. Duplicate data for speed, reference data for integrity
2. Try to fetch data in a single query
3. Design documents to be self-sufficient
4. Override _id when you have your own simple, unique id
5. Don't always use an index

Conclusion


•   Embedded docs are good for fast queries
•   Embedded and nested docs are good for storing hierarchies
•   Normalized docs are perfectly acceptable




Questions?
