20121023 mongodb schema-design

SCHEMA DESIGN WORKSHOP
Jeremy Mikola
  @jmikola
AGENDA
1.   Basic schema design principles for MongoDB
2.   Schema design over an application's lifetime
3.   Common design patterns
4.   Sharding
GOALS
Learn the schema design process in MongoDB
Practice applying common principles via exercises
Understand the implications of sharding
WHAT IS A SCHEMA AND WHY IS IT
          IMPORTANT?
SCHEMA
Map concepts and relationships to data
Set expectations for the data
Minimize overhead of iterative modifications
Ensure compatibility
NORMALIZATION
users         ←   books          →   authors
  username          title              first_name
  first_name        isbn               last_name
  last_name         language
                    created_by
                    author
DENORMALIZATION
users         ←   books
  username          title
  first_name        isbn
  last_name         language
                    created_by
                    author
                      first_name
                      last_name
WHAT IS SCHEMA DESIGN LIKE IN
         MONGODB?
 Schema is defined at the application-level
 Design is part of each phase in its lifetime
 There is no magic formula
MONGODB DOCUMENTS
        Storage in BSON → BSONSpec.org

Scalars                     Rich types
  Doubles                     Objects
  Integers (32 or 64-bit)     Arrays
  UTF-8 strings
  UTC Date, timestamp
  Binary, regex, code
  Object ID
  null
TERMINOLOGY
{
  "mongodb": "relational db",
  "database": "database",
  "collection": "table",
  "document": "row",
  "index": "index",
  "sharding": {
    "shard": "partition",
    "shard key": "partition key"
  }
}
THREE CONSIDERATIONS IN MONGODB
         SCHEMA DESIGN
 1. The data your application needs
 2. Your application's read usage of the data
 3. Your application's write usage of the data
CASE STUDY
LIBRARY WEB APPLICATION
 Different schemas are possible
AUTHOR SCHEMA
{
  "_id": int,
  "first_name": string,
  "last_name": string
}
USER SCHEMA
{
  "_id": int,
  "username": string,
  "password": string
}
BOOK SCHEMA
{
  "_id": int,
  "title": string,
  "slug": string,
  "author": int,
  "available": boolean,
  "isbn": string,
  "pages": int,
  "publisher": {
    "city": string,
    "date": date,
    "name": string
  },
  "subjects": [string, string],
  "language": string,
  "reviews": [
    { "user": int, "text": string },
    { "user": int, "text": string }
  ]
}
EXAMPLE DOCUMENTS
AUTHOR DOCUMENT
> db.authors.findOne()
{
  _id: 1,
  first_name: "F. Scott",
  last_name: "Fitzgerald"
}
USER DOCUMENT
> db.users.findOne()
{
  _id: 1,
  username: "emily@10gen.com",
  password: "slsjfk4odk84k209dlkdj90009283d"
}
BOOK DOCUMENT
> db.books.findOne()
{
  _id: 1,
  title: "The Great Gatsby",
  slug: "9781857150193-the-great-gatsby",
  author: 1,
  available: true,
  isbn: "9781857150193",
  pages: 176,
  publisher: {
    name: "Everyman's Library",
    date: ISODate("1991-09-19T00:00:00Z"),
    city: "London"
  },
  subjects: ["Love stories", "1920s", "Jazz Age"],
  language: "English",
  reviews: [
    { user: 1, text: "One of the best…" },
    { user: 2, text: "It's hard to…" }
  ]
}
EMBEDDED OBJECTS
      AKA EMBEDDED OR SUB-DOCUMENTS
  What advantages do they have?
   When should they be used?
EMBEDDED OBJECTS
> db.books.findOne()
{
  _id: 1,
  title: "The Great Gatsby",
  slug: "9781857150193-the-great-gatsby",
  author: 1,
  available: true,
  isbn: "9781857150193",
  pages: 176,
  publisher: {
    name: "Everyman's Library",
    date: ISODate("1991-09-19T00:00:00Z"),
    city: "London"
  },
  subjects: ["Love stories", "1920s", "Jazz Age"],
  language: "English",
  reviews: [
    { user: 1, text: "One of the best…" },
    { user: 2, text: "It's hard to…" }
  ]
}
EMBEDDED OBJECTS
Great for read performance
One seek to load the entire document
One round trip to the database
Writes can be slow if constantly adding to objects
LINKED DOCUMENTS
What advantages does this approach have?
      When should they be used?
LINKED DOCUMENTS
> db.books.findOne()
{
  _id: 1,
  title: "The Great Gatsby",
  slug: "9781857150193-the-great-gatsby",
  author: 1,
  available: true,
  isbn: "9781857150193",
  pages: 176,
  publisher: {
    publisher_name: "Everyman's Library",
    date: ISODate("1991-09-19T00:00:00Z"),
    publisher_city: "London"
  },
  subjects: ["Love stories", "1920s", "Jazz Age"],
  language: "English",
  reviews: [
    { user: 1, text: "One of the best…" },
    { user: 2, text: "It's hard to…" }
  ]
}
LINKED DOCUMENTS
More, smaller documents
Can make queries by ID very simple
Accessing linked document data requires extra read
What effect does this have on the system?
DATA, RAM AND DISK
ARRAYS
When should they be used?
ARRAY OF SCALARS
> db.books.findOne()
{
  _id: 1,
  title: "The Great Gatsby",
  slug: "9781857150193-the-great-gatsby",
  author: 1,
  available: true,
  isbn: "9781857150193",
  pages: 176,
  publisher: {
    name: "Everyman's Library",
    date: ISODate("1991-09-19T00:00:00Z"),
    city: "London"
  },
  subjects: ["Love stories", "1920s", "Jazz Age"],
  language: "English",
  reviews: [
    { user: 1, text: "One of the best…" },
    { user: 2, text: "It's hard to…" }
  ]
}
ARRAY OF OBJECTS
> db.books.findOne()
{
  _id: 1,
  title: "The Great Gatsby",
  slug: "9781857150193-the-great-gatsby",
  author: 1,
  available: true,
  isbn: "9781857150193",
  pages: 176,
  publisher: {
    name: "Everyman's Library",
    date: ISODate("1991-09-19T00:00:00Z"),
    city: "London"
  },
  subjects: ["Love stories", "1920s", "Jazz Age"],
  language: "English",
  reviews: [
    { user: 1, text: "One of the best…" },
    { user: 2, text: "It's hard to…" }
  ]
}
EXERCISE #1
 Design a schema for users and their book reviews

Users                     Reviews
  username (string)         text (string)
  email (string)            rating (integer)
                            created_at (date)
            Usernames are immutable
EXERCISE #1: SOLUTION A
     Reviews may be queried by user or book
// db.users (one document per user)
{
  _id: ObjectId("…"),
  username: "bob",
  email: "bob@example.com"
}

// db.reviews (one document per review)
{
  _id: ObjectId("…"),
  user: ObjectId("…"),
  book: ObjectId("…"),
  rating: 5,
  text: "This book is excellent!",
  created_at: ISODate("2012-10-10T21:14:07.096Z")
}
EXERCISE #1: SOLUTION B
       Optimized to retrieve reviews by user
// db.users (one document per user with all reviews)
{
  _id: ObjectId("…"),
  username: "bob",
  email: "bob@example.com",
  reviews: [
    {
      book: ObjectId("…"),
      rating: 5,
      text: "This book is excellent!",
      created_at: ISODate("2012-10-10T21:14:07.096Z")
    }
  ]
}
EXERCISE #1: SOLUTION C
      Optimized to retrieve reviews by book
// db.users (one document per user)
{
  _id: ObjectId("…"),
  username: "bob",
  email: "bob@example.com"
}

// db.books (one document per book with all reviews)
{
  _id: ObjectId("…"),
  // Other book fields…
  reviews: [
    {
      user: ObjectId("…"),
      rating: 5,
      text: "This book is excellent!",
      created_at: ISODate("2012-10-10T21:14:07.096Z")
    }
  ]
}
SCHEMA DESIGN OVER AN APPLICATION'S
             LIFETIME
           Development
           Production
           Iterative Modifications
DEVELOPMENT PHASE
    Basic CRUD functionality
CREATE (CRUD)
 author = {
   _id: 2,
   first_name: "Arthur",
   last_name: "Miller"
 };

 db.authors.insert(author);

 The _id field is unique and automatically indexed
 MongoDB will generate an ObjectId if not provided
READ (CRUD)
> db.authors.find({ "last_name": "Miller" })
{
  _id: 2,
  first_name: "Arthur",
  last_name: "Miller"
}
READS AND INDEXING
    Examine the query after creating an index.
> db.books.ensureIndex({ "slug": 1 })

> db.books.find({ "slug": "the-great-gatsby" }).explain()
{
  "cursor": "BtreeCursor slug_1",
  "isMultiKey": false,
  "n": 1,
  "nscannedObjects": 1,
  "nscanned": 1,
  "scanAndOrder": false,
  "indexOnly": false,
  "nYields": 0,
  "nChunkSkips": 0,
  "millis": 0,
  // Other fields follow…
}
MULTI-KEY INDEXES
          Index all values in an array field.
 > db.books.ensureIndex({ "subjects": 1 });
INDEXING EMBEDDED FIELDS
         Index an embedded object's field.
  
 > db.books.ensureIndex({ "publisher.name": 1 })
QUERY OPERATORS
Conditional operators
  $gt, $gte, $lt, $lte, $ne, $all, $in, $nin, $size,
  $and, $or, $nor, $mod, $type, $exists
Regular expressions
Value in an array
  $elemMatch
Cursor methods and modifiers
  count(), limit(), skip(), snapshot(), sort(),
  batchSize(), explain(), hint()
UPDATE (CRUD)
 review = {
   user: 1,
   text: "I did NOT like this book."
 };

 db.books.update(
   { _id: 1 },
   { $push: { reviews: review } }
 );
ATOMIC MODIFIERS
Update specific fields within a document
           $set, $unset
           $push, $pushAll
           $addToSet, $pop
           $pull, $pullAll
           $rename
           $bit
DELETE (CRUD)
 > db.books.remove({ _id: 1 })
PRODUCTION PHASE
Evolve schema to meet the application's read and write
                     patterns
READ USAGE
      Finding books by an author's first name
 authors = db.authors.find({ first_name: /^f.*/i }, { _id: 1 });

 authorIds = authors.map(function(x) { return x._id; });

 db.books.find({ author: { $in: authorIds } });
READ USAGE
"Cache" the author name in an embedded document
> db.books.findOne()
{
  _id: 1,
  title: "The Great Gatsby",
  author: {
    first_name: "F. Scott",
    last_name: "Fitzgerald"
  }
  // Other fields follow…
}

              Queries are now one step
 > db.books.find({ "author.first_name": /^f.*/i })
WRITE USAGE
              Users can review a book
review = {
  user: 1,
  text: "I thought this book was great!",
  rating: 5
};

 > db.books.update(
  { _id: 3 },
  { $push: { reviews: review } }
);




 Document size limit (16MB)
 Storage fragmentation after many updates/deletes
EXERCISE #2
Display the 10 most recent reviews by a user
Make efficient use of memory and disk seeks
EXERCISE #2: SOLUTION
      Store users' reviews in monthly buckets
// db.reviews (one document per user per month)
{
  _id: "bob-201210",
  reviews: [
    {
      _id: ObjectId("…"),
      rating: 5,
      text: "This book is excellent!",
      created_at: ISODate("2012-10-10T21:14:07.096Z")
    },
    {
      _id: ObjectId("…"),
      rating: 2,
      text: "I didn't really enjoy this book.",
      created_at: ISODate("2012-10-11T20:12:50.594Z")
    }
  ]
}
EXERCISE #2: SOLUTION
  Adding a new review to the appropriate bucket
myReview = {
  _id: ObjectId("…"),
  rating: 3,
  text: "An average read.",
  created_at: ISODate("2012-10-13T12:26:11.502Z")
};

> db.reviews.update(
  { _id: "bob-201210" },
  { $push: { reviews: myReview } }
);
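The bucket key can be derived from the username and the review's timestamp. The following is a minimal sketch in plain JavaScript, assuming the "username-YYYYMM" key format shown in the bucket document; the helper name reviewBucketId is hypothetical and not part of the slides.

```javascript
// Hypothetical helper (not from the slides): derive the monthly bucket _id
// for a user's review, assuming the "username-YYYYMM" key format.
function reviewBucketId(username, createdAt) {
  const year = createdAt.getUTCFullYear();
  // getUTCMonth() is zero-based, so add 1 and left-pad to two digits.
  const month = String(createdAt.getUTCMonth() + 1).padStart(2, "0");
  return username + "-" + year + month;
}

const bucketId = reviewBucketId("bob", new Date("2012-10-13T12:26:11.502Z"));
console.log(bucketId);
```

Because the year and month are fixed-width, sorting one user's bucket keys in descending order visits the newest month first.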
EXERCISE #2: SOLUTION
   Display the 10 most recent reviews by a user
cursor = db.reviews.find(
  { _id: /^bob-/ },
  { reviews: { $slice: -10 } }
).sort({ _id: -1 });

num = 0;

while (cursor.hasNext() && num < 10) {
  doc = cursor.next();

  // $push appends, so the newest reviews sit at the end of each bucket
  for (var i = doc.reviews.length - 1; i >= 0 && num < 10; --i, ++num) {
    printjson(doc.reviews[i]);
  }
}
EXERCISE #2: SOLUTION
                  Deleting a review
db.reviews.update(
  { _id: "bob-201210" },
  { $pull: { reviews: { _id: ObjectId("…") } } }
);
ITERATIVE
MODIFICATIONS
 Schema design is evolutionary
ALLOW USERS TO BROWSE BY BOOK
            SUBJECT
> db.subjects.findOne()
{
  _id: 1,
  name: "American Literature",
  sub_category: {
    name: "1920s",
    sub_category: { name: "Jazz Age" }
  }
}




   How can you search this collection?
   Be aware of document size limitations
   Benefit from hierarchy being in same document
TREE STRUCTURES
> db.subjects.find()
{ _id: "American Literature" }

{
  _id: "1920s",
  ancestors: ["American Literature"],
  parent: "American Literature"
}

{
  _id: "Jazz Age",
  ancestors: ["American Literature", "1920s"],
  parent: "1920s"
}

{
  _id: "Jazz Age in New York",
  ancestors: ["American Literature", "1920s", "Jazz Age"],
  parent: "Jazz Age"
}
TREE STRUCTURES
       Find sub-categories of a given subject
> db.subjects.find({ ancestors: "1920s" })
{
  _id: "Jazz Age",
  ancestors: ["American Literature", "1920s"],
  parent: "1920s"
}

{
  _id: "Jazz Age in New York",
  ancestors: ["American Literature", "1920s", "Jazz Age"],
  parent: "Jazz Age"
}
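The ancestors array is redundant with the parent links, so the application has to keep it in sync when subjects are added or moved. The following is a rough sketch in plain JavaScript over an in-memory copy of the slide's data; the names parents, ancestorsOf and descendantsOf are hypothetical and stand in for whatever the application would use.

```javascript
// Hypothetical in-memory copy of the parent links from the slide.
const parents = {
  "American Literature": null,
  "1920s": "American Literature",
  "Jazz Age": "1920s",
  "Jazz Age in New York": "Jazz Age",
};

// Walk parent links up to the root; returns ancestors ordered root-first,
// which is the order stored in each document's "ancestors" array.
function ancestorsOf(id) {
  const chain = [];
  for (let p = parents[id]; p != null; p = parents[p]) {
    chain.unshift(p);
  }
  return chain;
}

// In-memory equivalent of find({ ancestors: subject }).
function descendantsOf(subject) {
  return Object.keys(parents).filter((id) => ancestorsOf(id).includes(subject));
}

console.log(ancestorsOf("Jazz Age in New York"));
console.log(descendantsOf("1920s"));
```

Storing the precomputed array trades extra writes for a single indexed query when browsing sub-categories.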
EXERCISE #3
Allow users to borrow library books
   User sends a loan request
   Library approves or not
   Requests time out after seven days
Approval process is asynchronous
Requests may be prioritized
EXERCISE #3: SOLUTION
           Need to maintain order and state
           Ensure that updates are atomic
// Create a new loan request
> db.loans.insert({
  _id: { borrower: "bob", book: ObjectId("…") },
  pending: false,
  approved: false,
  priority: 1
});

// Find the highest priority request and mark as pending approval
request = db.loans.findAndModify({
  query: { pending: false },
  sort: { priority: -1 },
  update: { $set: { pending: true, started: new ISODate() } },
  new: true
});
EXERCISE #3: SOLUTION
           Updated and added fields
           Modified document was returned
{
  _id: { borrower: "bob", book: ObjectId("…") },
  pending: true,
  approved: false,
  priority: 1,
  started: ISODate("2012-10-11T22:09:42.542Z")
}
EXERCISE #3: SOLUTION
// Library approves the loan request
> db.loans.update(
  { _id: { borrower: "bob", book: ObjectId("…") } },
  { $set: { pending: false, approved: true } }
);
EXERCISE #3: SOLUTION
// Request times out after seven days
limit = new Date();
limit.setDate(limit.getDate() - 7);

> db.loans.update(
  { pending: true, started: { $lt: limit } },
  { $set: { pending: false, approved: false } }
);
EXERCISE #4
   Allow users to recommend books
Users can recommend each book only once
Display a book's current recommendations
EXERCISE #4: SOLUTION
// db.recommendations (one document per user per book)
> db.recommendations.insert({
  book: ObjectId("…"),
  user: ObjectId("…")
});

// Unique index ensures users can't recommend twice
> db.recommendations.ensureIndex(
  { book: 1, user: 1 },
  { unique: true }
);

// Count the number of recommendations for a book
> db.recommendations.count({ book: ObjectId("…") });
EXERCISE #4: SOLUTION
        Indexes in MongoDB do not store counts
        Counts are computed via index scans
        Denormalize totals on books
> db.books.update(
  { _id: ObjectId("…") },
  { $inc: { recommendations: 1 } }
);
COMMON DESIGN
  PATTERNS
ONE-TO-ONE
       RELATIONSHIP
Let's pretend that authors only write one book.
LINKING
  Either side, or both, can track the relationship.
> db.books.findOne()
{
  _id: 1,
  title: "The Great Gatsby",
  slug: "9781857150193-the-great-gatsby",
  author: 1,
  // Other fields follow…
}

> db.authors.findOne({ _id: 1 })
{
  _id: 1,
  first_name: "F. Scott",
  last_name: "Fitzgerald",
  book: 1
}
EMBEDDED OBJECT
> db.books.findOne()
{
  _id: 1,
  title: "The Great Gatsby",
  slug: "9781857150193-the-great-gatsby",
  author: {
    first_name: "F. Scott",
    last_name: "Fitzgerald"
  }
  // Other fields follow…
}
ONE-TO-MANY
     RELATIONSHIP
In reality, authors may write multiple books.
ARRAY OF ID'S
       The "one" side tracks the relationship.
> db.authors.findOne()
{
  _id: 1,
  first_name: "F. Scott",
  last_name: "Fitzgerald",
  books: [1, 3, 20]
}




     Flexible and space-efficient
     Additional query needed for non-ID lookups
SINGLE FIELD WITH ID
      The "many" side tracks the relationship.
> db.books.find({ author: 1 })
{
  _id: 1,
  title: "The Great Gatsby",
  slug: "9781857150193-the-great-gatsby",
  author: 1,
  // Other fields follow…
}

{
  _id: 3,
  title: "This Side of Paradise",
  slug: "9780679447238-this-side-of-paradise",
  author: 1,
  // Other fields follow…
}
ARRAY OF OBJECTS
> db.authors.findOne()
{
  _id: 1,
  first_name: "F. Scott",
  last_name: "Fitzgerald",
  books: [
    { _id: 1, title: "The Great Gatsby" },
    { _id: 3, title: "This Side of Paradise" }
  ]
  // Other fields follow…
}

 Use the $slice operator to return a subset of books
MANY-TO-MANY
 RELATIONSHIP
Some books may also have co-authors.
ARRAY OF ID'S ON BOTH SIDES
> db.books.findOne()
{
  _id: 1,
  title: "The Great Gatsby",
  authors: [1, 5]
  // Other fields follow…
}

> db.authors.findOne()
{
  _id: 1,
  first_name: "F. Scott",
  last_name: "Fitzgerald",
  books: [1, 3, 20]
}
ARRAY OF ID'S ON BOTH SIDES
       Query for all books by a given author
> db.books.find({ authors: 1 });

       Query for all authors of a given book
> db.authors.find({ books: 1 });
ARRAY OF ID'S ON ONE SIDE
> db.books.findOne()
{
  _id: 1,
  title: "The Great Gatsby",
  authors: [1, 5]
  // Other fields follow…
}

> db.authors.find({ _id: { $in: [1, 5] } })
{
  _id: 1,
  first_name: "F. Scott",
  last_name: "Fitzgerald"
}

{
  _id: 5,
  first_name: "Unknown",
  last_name: "Co-author"
}
ARRAY OF ID'S ON ONE SIDE
        Query for all books by a given author
 > db.books.find({ authors: 1 });

        Query for all authors of a given book
book = db.books.findOne(
  { title: "The Great Gatsby" },
  { authors: 1 }
);

db.authors.find({ _id: { $in: book.authors } });
EXERCISE #5
Tracking time series data
  Graph recommendations per unit of time
  Count by: day, hour, minute
EXERCISE #5: SOLUTION A
// db.rec_ts (time series buckets, hour and minute sub-docs)
> db.rec_ts.insert({
  book: ObjectId("…"),
  day: ISODate("2012-10-11T00:00:00.000Z"),
  total: 0,
  hour:   { "0": 0, "1": 0, /* … */ "23": 0 },
  minute: { "0": 0, "1": 0, /* … */ "1439": 0 }
});

// Record a recommendation created one minute before midnight
> db.rec_ts.update(
  { book: ObjectId("…"), day: ISODate("2012-10-11T00:00:00.000Z") },
  { $inc: { total: 1, "hour.23": 1, "minute.1439": 1 } }
);
BSON STORAGE
               Sequence of key/value pairs
               Not a hash map
               Optimized to scan quickly

                        minute
                     [0][1]…[1439]


What is the cost of updating the minute before midnight?
BSON STORAGE
   We can skip sub-documents

     hour0            …     hour23
   [0][1]…[59]              [1380]…[1439]


How could this change the schema?
EXERCISE #5: SOLUTION B
// db.rec_ts (time series buckets, each hour a sub-doc)
> db.rec_ts.insert({
  book: ObjectId("…"),
  day: ISODate("2012-10-11T00:00:00.000Z"),
  total: 148,
  hour: {
    "0": { total: 7, "0": 0, /* … */ "59": 2 },
    "1": { total: 3, "60": 1, /* … */ "119": 0 },
    // Other hours…
    "23": { total: 12, "1380": 0, /* … */ "1439": 3 }
  }
});

// Record a recommendation created one minute before midnight
> db.rec_ts.update(
  { book: ObjectId("…"), day: ISODate("2012-10-11T00:00:00.000Z") },
  { $inc: { total: 1, "hour.23.total": 1, "hour.23.1439": 1 } }
);
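The dotted field paths in the $inc document follow directly from the event's timestamp. The following is a minimal sketch in plain JavaScript, assuming UTC timestamps and the hour/minute-of-day layout of Solution B; the helper name recommendationInc is hypothetical.

```javascript
// Hypothetical helper: build the $inc document for Solution B's layout,
// where each hour sub-document holds a total plus minute-of-day counters.
function recommendationInc(createdAt) {
  const hour = createdAt.getUTCHours();
  // Minute of day ranges from 0 to 1439.
  const minuteOfDay = hour * 60 + createdAt.getUTCMinutes();
  const inc = { total: 1 };
  inc["hour." + hour + ".total"] = 1;
  inc["hour." + hour + "." + minuteOfDay] = 1;
  return { $inc: inc };
}

console.log(JSON.stringify(recommendationInc(new Date("2012-10-11T23:59:30Z"))));
```

With this layout an update to the last minute of the day only has to scan the 24 hour keys and then the 60 minute keys inside hour 23, rather than all 1440 minute keys.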
SINGLE-COLLECTION INHERITANCE
  Take advantage of MongoDB's features
 Documents need not all have the same fields
 Sparsely index only present fields
SCHEMA FLEXIBILITY
> db.books.findOne()
{
  _id: 47,
  title: "The Wizard Chase",
  type: "series",
  series_title: "The Wizard's Trilogy",
  volume: 2
  // Other fields follow…
}

       Find all books that are part of a series
> db.books.find({ type: "series" });

> db.books.find({ series_title: { $exists: true } });

> db.books.find({ volume: { $gt: 0 } });
INDEX ONLY PRESENT FIELDS
Documents without these fields will not be indexed.
> db.books.ensureIndex({ series_title: 1 }, { sparse: true })

> db.books.ensureIndex({ volume: 1 }, { sparse: true })
EXERCISE #6
Users can recommend at most 10 books
EXERCISE #6: SOLUTION
// db.user_recs (track user's remaining and given recommendations)
> db.user_recs.insert({
  _id: "bob",
  remaining: 8,
  books: [3, 10]
});

// Record a recommendation if possible
> db.user_recs.update(
  { _id: "bob", remaining: { $gt: 0 }, books: { $ne: 4 } },
  { $inc: { remaining: -1 }, $push: { books: 4 } }
);
EXERCISE #6: SOLUTION
  One less unassigned recommendation remaining
  Newly-recommended book is now linked
> db.user_recs.findOne()
{
  _id: "bob",
  remaining: 7,
  books: [3, 10, 4]
}
EXERCISE #7
Statistic buckets
  Each book has a listing page in our application
  Record referring website domains for each book
  Count each domain independently
EXERCISE #7: SOLUTION A
> db.book_refs.findOne()
{
  book: 1,
  referrers: [
    { domain: "google.com", count: 4 },
    { domain: "yahoo.com", count: 1 }
  ]
}

> db.book_refs.update(
  { book: 1, "referrers.domain": "google.com" },
  { $inc: { "referrers.$.count": 1 } }
);
EXERCISE #7: SOLUTION A
Update the position of the first matched element.
> db.book_refs.update(
  { book: 1, "referrers.domain": "google.com" },
  { $inc: { "referrers.$.count": 1 } }
);

> db.book_refs.findOne()
{
  book: 1,
  referrers: [
    { domain: "google.com", count: 5 },
    { domain: "yahoo.com", count: 1 }
  ]
}




      What if a new referring website is used?
EXERCISE #7: SOLUTION B
> db.book_refs.findOne()
{
  book: 1,
  referrers: {
    "google_com": 5,
    "yahoo_com": 1
  }
}

> db.book_refs.update(
  { book: 1 },
  { $inc: { "referrers.bing_com": 1 } },
  true
);




    Replace dots with underscores for key names
    Increment to add a new referring website
    Upsert in case this is the book's first referrer
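The key-name substitution can be done in the application before issuing the update. The following is a small sketch in plain JavaScript, assuming domains contain no characters other than dots that need escaping; the helper name referrerInc is hypothetical.

```javascript
// Hypothetical helper: turn a referring domain into a field-safe key and
// build the $inc document used in Solution B.
function referrerInc(domain) {
  // Dots would be read as path separators in a field name, so replace them.
  const key = domain.replace(/\./g, "_");
  const inc = {};
  inc["referrers." + key] = 1;
  return { $inc: inc };
}

console.log(JSON.stringify(referrerInc("bing.com")));
```

Note that the mapping is lossy: a domain that already contains an underscore cannot be told apart from one with a dot in the same position.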
SHARDING
SHARDING
Ad-hoc partitioning
Consistent hashing
  Amazon DynamoDB
Range based partitioning
  Google BigTable
  Yahoo! PNUTS
  MongoDB
SHARDING IN MONGODB
Automated management
Range based partitioning
Convert to sharded system with no downtime
Fully consistent
SHARDING A COLLECTION
> db.runCommand({ addshard: "shard1.example.com" });

> db.runCommand({
  shardCollection: "library.books",
  key: { _id: 1 }
});




             Keys range from −∞ to +∞
             Ranges are stored as chunks
SHARDING DATA BY CHUNKS
> db.books.save({ _id: 35, title: "Call of the Wild" });
> db.books.save({ _id: 40, title: "Tropic of Cancer" });
> db.books.save({ _id: 45, title: "The Jungle" });
> db.books.save({ _id: 50, title: "Of Mice and Men" });

                                        [−∞, 40)
                   [−∞, 40)
  [−∞, +∞)    →                   →     [40, 50)
                   [40, +∞)
                                        [50, +∞)

  Ranges are split into chunks as data is inserted
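Routing a document to a chunk amounts to finding the half-open range that contains its shard key. The following is a simplified sketch in plain JavaScript, assuming numeric keys and an ascending list of split points; the function name chunkFor is hypothetical and this is not how mongos is actually implemented.

```javascript
// Hypothetical sketch: split points 40 and 50 define the chunks
// [-Infinity, 40), [40, 50) and [50, +Infinity).
function chunkFor(key, splitPoints) {
  let lower = -Infinity;
  // Assumes splitPoints is sorted in ascending order.
  for (const point of splitPoints) {
    if (key < point) {
      return [lower, point];
    }
    lower = point;
  }
  return [lower, Infinity];
}

console.log(chunkFor(45, [40, 50]));
```

The lower bound of each range is inclusive and the upper bound exclusive, so a key equal to a split point belongs to the chunk that starts there.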
ADDING NEW SHARDS
      shard1
      [−∞, 40)
      [40, 50)
      [50, 60)
      [60, +∞)
ADDING NEW SHARDS
 > db.runCommand({ addshard: "shard2.example.com" });

                shard1         shard2
                [−∞, 40)
                               [40, 50)
                [50, 60)
                               [60, +∞)

      Chunks are migrated to balance shards
ADDING NEW SHARDS
 > db.runCommand({ addshard: "shard3.example.com" });

        shard1          shard2          shard3
        [−∞, 40)
                        [40, 50)
                                        [50, 60)
                        [60, +∞)
SHARDING COMPONENTS
     mongos
     Config servers
     Shards
       mongod
       Replica sets
SHARDED WRITES
Inserts
   Shard key required
   Routed
Updates and removes
   Shard key optional
   May be routed or scattered
SHARDED READS
Queries
  By shard key: routed
  Without shard key: scatter/gather
Sorted queries
  By shard key: routed in order
  Without shard key: distributed merge sort
EXERCISE #8
    Users can upload images for books

                 images
                image_id: ???
                data: binary

The collection will be sharded by image_id.
       What should image_id be?
EXERCISE #8: SOLUTIONS
What's the best shard key for our use case?
         Auto-increment (ObjectId): right-balanced access
         MD5 of data: random access
         Time (e.g. month) and MD5: segmented access
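The third option can be illustrated with a key that prefixes the content hash with the upload month. The following is a sketch for Node.js, assuming a "YYYYMM-md5hex" string format chosen here purely for illustration; the helper names md5Hex and segmentedImageId are hypothetical.

```javascript
const nodeCrypto = require("crypto");

// Hex-encoded MD5 digest of the image data (string or Buffer).
function md5Hex(data) {
  return nodeCrypto.createHash("md5").update(data).digest("hex");
}

// Hypothetical key format: upload month followed by the content hash.
// Writes cluster within the current month's key range but are spread
// across it by the hash.
function segmentedImageId(data, uploadedAt) {
  const month =
    String(uploadedAt.getUTCFullYear()) +
    String(uploadedAt.getUTCMonth() + 1).padStart(2, "0");
  return month + "-" + md5Hex(data);
}

console.log(segmentedImageId("hello", new Date("2012-10-23T00:00:00Z")));
```

A plain MD5 key would spread reads and writes over the whole key space, while a purely time-based key would direct all new writes to the last chunk.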
SUMMARY
Schema design is different in MongoDB.
Basic data design principles apply.
It's about your application.
It's about your data and how it's used.
It's about the entire lifetime of your application.
THANKS!
 QUESTIONS?
1 von 117

Recomendados

はじめてのMongoDB von
はじめてのMongoDBはじめてのMongoDB
はじめてのMongoDBTakahiro Inoue
16.1K views72 Folien
Potential Friend Finder von
Potential Friend FinderPotential Friend Finder
Potential Friend FinderRichard Schneeman
1.1K views59 Folien
MongoDB - Introduction von
MongoDB - IntroductionMongoDB - Introduction
MongoDB - IntroductionVagmi Mudumbai
1K views36 Folien
Normas apa y derechos de autor piktochart backup data (1) von
Normas apa y derechos de autor   piktochart backup data (1)Normas apa y derechos de autor   piktochart backup data (1)
Normas apa y derechos de autor piktochart backup data (1)000409123
289 views4 Folien
Code Tops Comments von
Code Tops CommentsCode Tops Comments
Code Tops CommentsMr Giap
217 views3 Folien
Kevin milla arbieto informatica piktochart backup data von
Kevin milla arbieto informatica   piktochart backup dataKevin milla arbieto informatica   piktochart backup data
Kevin milla arbieto informatica piktochart backup dataKevin Miguel Milla
271 views4 Folien

Más contenido relacionado

Was ist angesagt?

Forking Oryx at Intalio von
Forking Oryx at IntalioForking Oryx at Intalio
Forking Oryx at IntalioAntoine Toulme
547 views23 Folien
deepjs - tools for better programming von
deepjs - tools for better programmingdeepjs - tools for better programming
deepjs - tools for better programmingnomocas
552 views26 Folien
Elastic search 검색 von
Elastic search 검색Elastic search 검색
Elastic search 검색HyeonSeok Choi
1.9K views23 Folien
CSS: A Slippery Slope to the Backend von
CSS: A Slippery Slope to the BackendCSS: A Slippery Slope to the Backend
CSS: A Slippery Slope to the BackendFITC
1.4K views45 Folien
NoSQL & MongoDB von
NoSQL & MongoDBNoSQL & MongoDB
NoSQL & MongoDBShuai Liu
1.7K views35 Folien
Css selectors von
Css selectorsCss selectors
Css selectorsDan Stewart
1.5K views21 Folien

Was ist angesagt?(20)

deepjs - tools for better programming von nomocas
deepjs - tools for better programmingdeepjs - tools for better programming
deepjs - tools for better programming
nomocas552 views
CSS: A Slippery Slope to the Backend von FITC
CSS: A Slippery Slope to the BackendCSS: A Slippery Slope to the Backend
CSS: A Slippery Slope to the Backend
FITC1.4K views
NoSQL & MongoDB von Shuai Liu
NoSQL & MongoDBNoSQL & MongoDB
NoSQL & MongoDB
Shuai Liu1.7K views
Great BigTable and my toys von mseki
Great BigTable and my toysGreat BigTable and my toys
Great BigTable and my toys
mseki921 views
Interactive Visualization With Bokeh (SF Python Meetup) von Peter Wang
Interactive Visualization With Bokeh (SF Python Meetup)Interactive Visualization With Bokeh (SF Python Meetup)
Interactive Visualization With Bokeh (SF Python Meetup)
Peter Wang12.7K views
Bokeh Tutorial - PyData @ Strata San Jose 2015 von Peter Wang
Bokeh Tutorial - PyData @ Strata San Jose 2015Bokeh Tutorial - PyData @ Strata San Jose 2015
Bokeh Tutorial - PyData @ Strata San Jose 2015
Peter Wang4.2K views
Schema design von christkv
Schema designSchema design
Schema design
christkv731 views
MongoDB .local Munich 2019: Tips and Tricks++ for Querying and Indexing MongoDB von MongoDB
MongoDB .local Munich 2019: Tips and Tricks++ for Querying and Indexing MongoDBMongoDB .local Munich 2019: Tips and Tricks++ for Querying and Indexing MongoDB
MongoDB .local Munich 2019: Tips and Tricks++ for Querying and Indexing MongoDB
MongoDB442 views
Inside MongoDB: the Internals of an Open-Source Database von Mike Dirolf
Inside MongoDB: the Internals of an Open-Source DatabaseInside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source Database
Mike Dirolf52.5K views
MongoDB .local Munich 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pip... von MongoDB
MongoDB .local Munich 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pip...MongoDB .local Munich 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pip...
MongoDB .local Munich 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pip...
MongoDB262 views
ELK Stack - Turn boring logfiles into sexy dashboard von Georg Sorst
ELK Stack - Turn boring logfiles into sexy dashboardELK Stack - Turn boring logfiles into sexy dashboard
ELK Stack - Turn boring logfiles into sexy dashboard
Georg Sorst5.4K views

20121023 mongodb schema-design

  • 2. AGENDA 1. Basic schema design principles for MongoDB 2. Schema design over an application's lifetime 3. Common design patterns 4. Sharding
  • 3. GOALS Learn the schema design process in MongoDB Practice applying common principles via exercises Understand the implications of sharding
  • 4. WHAT IS A SCHEMA AND WHY IS IT IMPORTANT?
  • 5. SCHEMA Map concepts and relationships to data Set expectations for the data Minimize overhead of iterative modifications Ensure compatibility
  • 6. NORMALIZATION users ← books → authors
        users:   username, first_name, last_name
        books:   title, isbn, language, created_by, author
        authors: first_name, last_name
  • 7. DENORMALIZATION users ← books
        users: username, first_name, last_name
        books: title, isbn, language, created_by, author { first_name, last_name }
  • 8. WHAT IS SCHEMA DESIGN LIKE IN MONGODB? Schema is defined at the application-level Design is part of each phase in its lifetime There is no magic formula
  • 9. MONGODB DOCUMENTS Storage in BSON → BSONSpec.org
        Scalars: Doubles; Integers (32 or 64-bit); UTF-8 strings; UTC Date, timestamp; Binary, regex, code; Object ID; null
        Rich types: Objects; Arrays
  • 10. TERMINOLOGY
        {
          "mongodb": "relational db",
          "database": "database",
          "collection": "table",
          "document": "row",
          "index": "index",
          "sharding": {
            "shard": "partition",
            "shard key": "partition key"
          }
        }
  • 11. THREE CONSIDERATIONS IN MONGODB SCHEMA DESIGN 1. The data your application needs 2. Your application's read usage of the data 3. Your application's write usage of the data
  • 12. CASE STUDY LIBRARY WEB APPLICATION Different schemas are possible
  • 13. AUTHOR SCHEMA
        {
          "_id": int,
          "first_name": string,
          "last_name": string
        }
  • 14. USER SCHEMA
        {
          "_id": int,
          "username": string,
          "password": string
        }
  • 15. BOOK SCHEMA
        {
          "_id": int,
          "title": string,
          "slug": string,
          "author": int,
          "available": boolean,
          "isbn": string,
          "pages": int,
          "publisher": {
            "city": string,
            "date": date,
            "name": string
          },
          "subjects": [ string, string ],
          "language": string,
          "reviews": [
            { "user": int, "text": string },
            { "user": int, "text": string }
          ]
        }
  • 17. AUTHOR DOCUMENT
        > db.authors.findOne()
        {
          _id: 1,
          first_name: "F. Scott",
          last_name: "Fitzgerald"
        }
  • 18. USER DOCUMENT
        > db.users.findOne()
        {
          _id: 1,
          username: "emily@10gen.com",
          password: "slsjfk4odk84k209dlkdj90009283d"
        }
  • 19. BOOK DOCUMENT
        > db.books.findOne()
        {
          _id: 1,
          title: "The Great Gatsby",
          slug: "9781857150193-the-great-gatsby",
          author: 1,
          available: true,
          isbn: "9781857150193",
          pages: 176,
          publisher: {
            name: "Everyman's Library",
            date: ISODate("1991-09-19T00:00:00Z"),
            city: "London"
          },
          subjects: ["Love stories", "1920s", "Jazz Age"],
          language: "English",
          reviews: [
            { user: 1, text: "One of the best…" },
            { user: 2, text: "It's hard to…" }
          ]
        }
  • 20. EMBEDDED OBJECTS AKA EMBEDDED OR SUB-DOCUMENTS What advantages do they have? When should they be used?
  • 21. EMBEDDED OBJECTS
        > db.books.findOne()
        {
          _id: 1,
          title: "The Great Gatsby",
          slug: "9781857150193-the-great-gatsby",
          author: 1,
          available: true,
          isbn: "9781857150193",
          pages: 176,
          publisher: {
            name: "Everyman's Library",
            date: ISODate("1991-09-19T00:00:00Z"),
            city: "London"
          },
          subjects: ["Love stories", "1920s", "Jazz Age"],
          language: "English",
          reviews: [
            { user: 1, text: "One of the best…" },
            { user: 2, text: "It's hard to…" }
          ]
        }
  • 22. EMBEDDED OBJECTS Great for read performance One seek to load the entire document One round trip to the database Writes can be slow if constantly adding to objects
  • 23. LINKED DOCUMENTS What advantages does this approach have? When should they be used?
  • 24. LINKED DOCUMENTS
        > db.books.findOne()
        {
          _id: 1,
          title: "The Great Gatsby",
          slug: "9781857150193-the-great-gatsby",
          author: 1,
          available: true,
          isbn: "9781857150193",
          pages: 176,
          publisher: {
            publisher_name: "Everyman's Library",
            date: ISODate("1991-09-19T00:00:00Z"),
            publisher_city: "London"
          },
          subjects: ["Love stories", "1920s", "Jazz Age"],
          language: "English",
          reviews: [
            { user: 1, text: "One of the best…" },
            { user: 2, text: "It's hard to…" }
          ]
        }
  • 25. LINKED DOCUMENTS More, smaller documents Can make queries by ID very simple Accessing linked document data requires extra read What effect does this have on the system?
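The extra read that linking costs can be illustrated without a database. The following is a minimal sketch in plain JavaScript, not the MongoDB API: the Maps stand in for collections, and `readDoc`, `readCount`, and the two helper functions are hypothetical names that only count lookups for each layout.

```javascript
// In-memory stand-ins for collections (hypothetical; not the MongoDB API).
const authors = new Map([[1, { _id: 1, first_name: "F. Scott", last_name: "Fitzgerald" }]]);
const booksLinked = new Map([[1, { _id: 1, title: "The Great Gatsby", author: 1 }]]);
const booksEmbedded = new Map([
  [1, { _id: 1, title: "The Great Gatsby", author: { first_name: "F. Scott", last_name: "Fitzgerald" } }],
]);

let readCount = 0;

// Every call models one read against the "database".
function readDoc(collection, id) {
  readCount += 1;
  return collection.get(id);
}

// Embedded layout: the author arrives with the book in a single read.
function authorLastNameEmbedded(bookId) {
  return readDoc(booksEmbedded, bookId).author.last_name;
}

// Linked layout: read the book first, then follow the stored author ID.
function authorLastNameLinked(bookId) {
  const book = readDoc(booksLinked, bookId);
  return readDoc(authors, book.author).last_name;
}

readCount = 0;
const embeddedName = authorLastNameEmbedded(1);
const embeddedReads = readCount;

readCount = 0;
const linkedName = authorLastNameLinked(1);
const linkedReads = readCount;

console.log(embeddedName, embeddedReads, linkedName, linkedReads);
```

Both layouts return the same name; the linked one simply needs a second lookup, which is the trade-off to weigh against smaller documents.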
  • 32. ARRAY OF SCALARS
        > db.books.findOne()
        {
          _id: 1,
          title: "The Great Gatsby",
          slug: "9781857150193-the-great-gatsby",
          author: 1,
          available: true,
          isbn: "9781857150193",
          pages: 176,
          publisher: {
            name: "Everyman's Library",
            date: ISODate("1991-09-19T00:00:00Z"),
            city: "London"
          },
          subjects: ["Love stories", "1920s", "Jazz Age"],
          language: "English",
          reviews: [
            { user: 1, text: "One of the best…" },
            { user: 2, text: "It's hard to…" }
          ]
        }
  • 33. ARRAY OF OBJECTS
        > db.books.findOne()
        {
          _id: 1,
          title: "The Great Gatsby",
          slug: "9781857150193-the-great-gatsby",
          author: 1,
          available: true,
          isbn: "9781857150193",
          pages: 176,
          publisher: {
            name: "Everyman's Library",
            date: ISODate("1991-09-19T00:00:00Z"),
            city: "London"
          },
          subjects: ["Love stories", "1920s", "Jazz Age"],
          language: "English",
          reviews: [
            { user: 1, text: "One of the best…" },
            { user: 2, text: "It's hard to…" }
          ]
        }
  • 34. EXERCISE #1 Design a schema for users and their book reviews Users Reviews username (string) text (string) email (string) rating (integer) created_at (date) Usernames are immutable
  • 35. EXERCISE #1: SOLUTION A Reviews may be queried by user or book
        // db.users (one document per user)
        {
          _id: ObjectId("…"),
          username: "bob",
          email: "bob@example.com"
        }
        // db.reviews (one document per review)
        {
          _id: ObjectId("…"),
          user: ObjectId("…"),
          book: ObjectId("…"),
          rating: 5,
          text: "This book is excellent!",
          created_at: ISODate("2012-10-10T21:14:07.096Z")
        }
  • 36. EXERCISE #1: SOLUTION B Optimized to retrieve reviews by user
        // db.users (one document per user with all reviews)
        {
          _id: ObjectId("…"),
          username: "bob",
          email: "bob@example.com",
          reviews: [
            {
              book: ObjectId("…"),
              rating: 5,
              text: "This book is excellent!",
              created_at: ISODate("2012-10-10T21:14:07.096Z")
            }
          ]
        }
  • 37. EXERCISE #1: SOLUTION C Optimized to retrieve reviews by book
        // db.users (one document per user)
        {
          _id: ObjectId("…"),
          username: "bob",
          email: "bob@example.com"
        }
        // db.books (one document per book with all reviews)
        {
          _id: ObjectId("…"),
          // Other book fields…
          reviews: [
            {
              user: ObjectId("…"),
              rating: 5,
              text: "This book is excellent!",
              created_at: ISODate("2012-10-10T21:14:07.096Z")
            }
          ]
        }
  • 38. SCHEMA DESIGN OVER AN APPLICATION'S LIFETIME Development Production Iterative Modifications
  • 39. DEVELOPMENT PHASE Basic CRUD functionality
  • 42. READS AND INDEXING Examine the query after creating an index.
        > db.books.ensureIndex({ "slug": 1 })
        > db.books.find({ "slug": "the-great-gatsby" }).explain()
        {
          "cursor": "BtreeCursor slug_1",
          "isMultiKey": false,
          "n": 1,
          "nscannedObjects": 1,
          "nscanned": 1,
          "scanAndOrder": false,
          "indexOnly": false,
          "nYields": 0,
          "nChunkSkips": 0,
          "millis": 0,
          // Other fields follow…
        }
  • 43. MULTI-KEY INDEXES Index all values in an array field.
        > db.books.ensureIndex({ "subjects": 1 });
  • 44. INDEXING EMBEDDED FIELDS Index an embedded object's field.
        > db.books.ensureIndex({ "publisher.name": 1 })
  • 45. QUERY OPERATORS
        Conditional operators: $gt, $gte, $lt, $lte, $ne, $all, $in, $nin, $size, $and, $or, $nor, $mod, $type, $exists
        Regular expressions
        Value in an array: $elemMatch
        Cursor methods and modifiers: count(), limit(), skip(), snapshot(), sort(), batchSize(), explain(), hint()
  • 47. ATOMIC MODIFIERS Update specific fields within a document
        $set, $unset
        $push, $pushAll
        $addToSet, $pop
        $pull, $pullAll
        $rename
        $bit
  • 49. PRODUCTION PHASE Evolve schema to meet the application's read and write patterns
  • 50. READ USAGE Finding books by an author's first name
        authors = db.authors.find({ first_name: /^f.*/i }, { _id: 1 });
        authorIds = authors.map(function(x) { return x._id; });
        db.books.find({ author: { $in: authorIds } });
  • 51. READ USAGE "Cache" the author name in an embedded document
        > db.books.findOne()
        {
          _id: 1,
          title: "The Great Gatsby",
          author: {
            first_name: "F. Scott",
            last_name: "Fitzgerald"
          }
          // Other fields follow…
        }
        Queries are now one step
        > db.books.find({ "author.first_name": /^f.*/i })
  • 52. WRITE USAGE Users can review a book
        review = {
          user: 1,
          text: "I thought this book was great!",
          rating: 5
        };
        > db.books.update(
          { _id: 3 },
          { $push: { reviews: review } }
        );
        Document size limit (16MB)
        Storage fragmentation after many updates/deletes
  • 53. EXERCISE #2 Display the 10 most recent reviews by a user Make efficient use of memory and disk seeks
  • 54. EXERCISE #2: SOLUTION Store users' reviews in monthly buckets
        // db.reviews (one document per user per month)
        {
          _id: "bob-2012-10",
          reviews: [
            {
              _id: ObjectId("…"),
              rating: 5,
              text: "This book is excellent!",
              created_at: ISODate("2012-10-10T21:14:07.096Z")
            },
            {
              _id: ObjectId("…"),
              rating: 2,
              text: "I didn't really enjoy this book.",
              created_at: ISODate("2012-10-11T20:12:50.594Z")
            }
          ]
        }
  • 55. EXERCISE #2: SOLUTION Adding a new review to the appropriate bucket
        myReview = {
          _id: ObjectId("…"),
          rating: 3,
          text: "An average read.",
          created_at: ISODate("2012-10-13T12:26:11.502Z")
        };
        > db.reviews.update(
          { _id: "bob-2012-10" },
          { $push: { reviews: myReview } }
        );
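Choosing the "appropriate bucket" means deriving the bucket `_id` from the username and the review date. A minimal sketch of that step in plain JavaScript follows; `bucketId` is a hypothetical helper, and the `username-YYYY-MM` format and the use of UTC are assumptions based on the `_id` used in the update.

```javascript
// Build the bucket _id for a user and a review date, e.g. "bob-2012-10".
// UTC is used so the bucket does not depend on the machine's time zone (an assumption).
function bucketId(username, createdAt) {
  const year = createdAt.getUTCFullYear();
  // getUTCMonth() is zero-based; pad to two digits so IDs sort chronologically.
  const month = String(createdAt.getUTCMonth() + 1).padStart(2, "0");
  return `${username}-${year}-${month}`;
}

const id = bucketId("bob", new Date("2012-10-13T12:26:11.502Z"));
console.log(id);
```

Zero-padding the month keeps string order and chronological order the same for one user, which matters when buckets are sorted by `_id`.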
  • 56. EXERCISE #2: SOLUTION Display the 10 most recent reviews by a user
        cursor = db.reviews.find(
          { _id: /^bob-/ },
          { reviews: { $slice: 10 } }
        ).sort({ _id: -1 });
        num = 0;
        while (cursor.hasNext() && num < 10) {
          doc = cursor.next();
          for (var i = 0; i < doc.reviews.length && num < 10; ++i, ++num) {
            printjson(doc.reviews[i]);
          }
        }
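The bucket walk can be tried out in memory. The sketch below is plain JavaScript with hypothetical sample data and a hypothetical `mostRecent` helper. It assumes reviews were appended with `$push`, so each bucket's array is oldest-first and has to be read from the end to get the newest entries; under that assumption a real query would presumably want the tail of each array (a negative `$slice`).

```javascript
// Buckets as they might come back from the query, newest month first.
// Assumption: each reviews array is oldest-first because entries were appended.
const buckets = [
  { _id: "bob-2012-10", reviews: [{ rating: 5, day: 10 }, { rating: 2, day: 11 }, { rating: 3, day: 13 }] },
  { _id: "bob-2012-09", reviews: [{ rating: 4, day: 2 }, { rating: 1, day: 28 }] },
];

// Collect up to `limit` reviews, newest first, stopping as soon as enough are found.
function mostRecent(sortedBuckets, limit) {
  const result = [];
  for (const bucket of sortedBuckets) {
    // Walk each bucket backwards because its newest review is at the end.
    for (let i = bucket.reviews.length - 1; i >= 0 && result.length < limit; i--) {
      result.push(bucket.reviews[i]);
    }
    if (result.length >= limit) break;
  }
  return result;
}

const latest = mostRecent(buckets, 4);
console.log(latest.map((r) => r.day));
```

Because the walk stops once the limit is reached, older buckets are only touched when the newest one does not hold enough reviews.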
  • 57. EXERCISE #2: SOLUTION Deleting a review
        cursor = db.reviews.update(
          { _id: "bob-2012-10" },
          { $pull: { reviews: { _id: ObjectId("…") } } }
        );
  • 59. ALLOW USERS TO BROWSE BY BOOK SUBJECT
        > db.subjects.findOne()
        {
          _id: 1,
          name: "American Literature",
          sub_category: {
            name: "1920s",
            sub_category: { name: "Jazz Age" }
          }
        }
        How can you search this collection?
        Be aware of document size limitations
        Benefit from hierarchy being in same document
  • 60. TREE STRUCTURES
        > db.subjects.find()
        { _id: "American Literature" }
        { _id: "1920s",
          ancestors: ["American Literature"],
          parent: "American Literature"
        }
        { _id: "Jazz Age",
          ancestors: ["American Literature", "1920s"],
          parent: "1920s"
        }
        { _id: "Jazz Age in New York",
          ancestors: ["American Literature", "1920s", "Jazz Age"],
          parent: "Jazz Age"
        }
  • 61. TREE STRUCTURES Find sub-categories of a given subject
        > db.subjects.find({ ancestors: "1920s" })
        {
          _id: "Jazz Age",
          ancestors: ["American Literature", "1920s"],
          parent: "1920s"
        }
        {
          _id: "Jazz Age in New York",
          ancestors: ["American Literature", "1920s", "Jazz Age"],
          parent: "Jazz Age"
        }
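The ancestors array has to be maintained whenever a subject is added. One way this might be done is sketched below in plain JavaScript; `makeSubject` is a hypothetical helper, and the final `filter` is only an in-memory stand-in for the `ancestors` query.

```javascript
// Build a subject document from its parent, following the ancestors/parent layout.
// A root subject has no parent and therefore no ancestors or parent field.
function makeSubject(id, parentDoc) {
  if (!parentDoc) return { _id: id };
  const inherited = parentDoc.ancestors || [];
  return { _id: id, ancestors: [...inherited, parentDoc._id], parent: parentDoc._id };
}

const root = makeSubject("American Literature");
const twenties = makeSubject("1920s", root);
const jazzAge = makeSubject("Jazz Age", twenties);

// In-memory equivalent of matching documents whose ancestors contain "1920s".
const subjects = [root, twenties, jazzAge];
const descendants = subjects.filter((s) => (s.ancestors || []).includes("1920s"));
console.log(descendants.map((s) => s._id));
```

Copying the parent's ancestors into each child is what lets a single query return an entire subtree, at the cost of rewriting descendants if a subject is ever moved.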
  • 62. EXERCISE #3 Allow users to borrow library books User sends a loan request Library approves or not Requests time out after seven days Approval process is asynchronous Requests may be prioritized
  • 63. EXERCISE #3: SOLUTION Need to maintain order and state. Ensure that updates are atomic.
        // Create a new loan request
        > db.loans.insert({
          _id: { borrower: "bob", book: ObjectId("…") },
          pending: false,
          approved: false,
          priority: 1
        });
        // Find the highest priority request and mark as pending approval
        request = db.loans.findAndModify({
          query: { pending: false },
          sort: { priority: -1 },
          update: { $set: { pending: true, started: new ISODate() } },
          new: true
        });
  • 64. EXERCISE #3: SOLUTION Updated and added fields. Modified document was returned.
        {
          _id: { borrower: "bob", book: ObjectId("…") },
          pending: true,
          approved: false,
          priority: 1,
          started: ISODate("2012-10-11T22:09:42.542Z")
        }
  • 65. EXERCISE #3: SOLUTION
        // Library approves the loan request
        > db.loans.update(
          { _id: { borrower: "bob", book: ObjectId("…") } },
          { $set: { pending: false, approved: true } }
        );
  • 66. EXERCISE #3: SOLUTION
        // Request times out after seven days
        limit = new Date();
        limit.setDate(limit.getDate() - 7);
        > db.loans.update(
          { pending: true, started: { $lt: limit } },
          { $set: { pending: false, approved: false } }
        );
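The seven-day timeout rule can be checked in isolation. This is a minimal in-memory sketch in plain JavaScript, not the MongoDB API: `expireStale`, `SEVEN_DAYS_MS`, and the sample loans are hypothetical, and a fixed 7 × 24 hours is used rather than calendar days.

```javascript
const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// Return the loans with every pending request older than seven days marked
// as no longer pending and not approved; other requests are left untouched.
function expireStale(loans, now) {
  const limit = new Date(now.getTime() - SEVEN_DAYS_MS);
  return loans.map((loan) =>
    loan.pending && loan.started < limit
      ? { ...loan, pending: false, approved: false }
      : loan
  );
}

const now = new Date("2012-10-23T00:00:00Z");
const loans = [
  { _id: "old", pending: true, approved: false, started: new Date("2012-10-11T22:09:42.542Z") },
  { _id: "fresh", pending: true, approved: false, started: new Date("2012-10-20T08:00:00Z") },
];
const after = expireStale(loans, now);
console.log(after.map((loan) => `${loan._id}:${loan.pending}`));
```

Only requests that are both pending and older than the cutoff change state, mirroring the two conditions in the update's query.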
  • 67. EXERCISE #4 Allow users to recommend books Users can recommend each book only once Display a book's current recommendations
  • 68. EXERCISE #4: SOLUTION
        // db.recommendations (one document per user per book)
        > db.recommendations.insert({
          book: ObjectId("…"),
          user: ObjectId("…")
        });
        // Unique index ensures users can't recommend twice
        > db.recommendations.ensureIndex(
          { book: 1, user: 1 },
          { unique: true }
        );
        // Count the number of recommendations for a book
        > db.recommendations.count({ book: ObjectId("…") });
  • 69. EXERCISE #4: SOLUTION MongoDB indexes do not store counts. Counts are computed via index scans. Denormalize totals on books.
        > db.books.update(
          { _id: ObjectId("…") },
          { $inc: { recommendations: 1 } }
        );
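The interplay between the uniqueness rule and the denormalized total can be sketched in memory. The code below is plain JavaScript with hypothetical names: a Set of `book|user` keys plays the role of the unique index, a Map holds the per-book totals, and the counter is only incremented when a new pair is actually recorded.

```javascript
// In-memory stand-ins (hypothetical): a set of "book|user" keys plays the role
// of the unique index, and a Map holds the denormalized totals per book.
const seen = new Set();
const totals = new Map();

// Record a recommendation; returns false when the pair already exists,
// in which case the counter is left alone.
function recommend(bookId, userId) {
  const key = `${bookId}|${userId}`;
  if (seen.has(key)) return false;
  seen.add(key);
  totals.set(bookId, (totals.get(bookId) || 0) + 1);
  return true;
}

const first = recommend("book-1", "bob");
const repeat = recommend("book-1", "bob");
const other = recommend("book-1", "amy");
console.log(first, repeat, other, totals.get("book-1"));
```

Reading the total is then a single lookup on the book rather than a scan over every recommendation.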
  • 70. COMMON DESIGN PATTERNS
  • 71. ONE-TO-ONE RELATIONSHIP Let's pretend that authors only write one book.
  • 72. LINKING Either side, or both, can track the relationship.
        > db.books.findOne()
        {
          _id: 1,
          title: "The Great Gatsby",
          slug: "9781857150193-the-great-gatsby",
          author: 1,
          // Other fields follow…
        }
        > db.authors.findOne({ _id: 1 })
        {
          _id: 1,
          first_name: "F. Scott",
          last_name: "Fitzgerald",
          book: 1
        }
  • 73. EMBEDDED OBJECT
        > db.books.findOne()
        {
          _id: 1,
          title: "The Great Gatsby",
          slug: "9781857150193-the-great-gatsby",
          author: {
            first_name: "F. Scott",
            last_name: "Fitzgerald"
          }
          // Other fields follow…
        }
  • 74. ONE-TO-MANY RELATIONSHIP In reality, authors may write multiple books.
  • 75. ARRAY OF IDS The "one" side tracks the relationship. > db.authors.findOne() { _id: 1, first_name: "F. Scott", last_name: "Fitzgerald", books: [1, 3, 20] } Flexible and space-efficient. Additional query needed for non-ID lookups.
  • 76. SINGLE FIELD WITH ID The "many" side tracks the relationship. > db.books.find({ author: 1 }) { _id: 1, title: "The Great Gatsby", slug: "9781857150193-the-great-gatsby", author: 1 /* Other fields follow… */ } { _id: 3, title: "This Side of Paradise", slug: "9780679447238-this-side-of-paradise", author: 1 /* Other fields follow… */ }
  • 77. ARRAY OF OBJECTS > db.authors.findOne() { _id: 1, first_name: "F. Scott", last_name: "Fitzgerald", books: [ { _id: 1, title: "The Great Gatsby" }, { _id: 3, title: "This Side of Paradise" } ] /* Other fields follow… */ } Use the $slice operator to return a subset of books.
  • 78. MANY-TO-MANY RELATIONSHIP Some books may also have co-authors.
  • 79. ARRAY OF IDS ON BOTH SIDES > db.books.findOne() { _id: 1, title: "The Great Gatsby", authors: [1, 5] /* Other fields follow… */ } > db.authors.findOne() { _id: 1, first_name: "F. Scott", last_name: "Fitzgerald", books: [1, 3, 20] }
  • 80. ARRAY OF IDS ON BOTH SIDES Query for all books by a given author: > db.books.find({ authors: 1 }); Query for all authors of a given book: > db.authors.find({ books: 1 });
  • 81. ARRAY OF IDS ON ONE SIDE > db.books.findOne() { _id: 1, title: "The Great Gatsby", authors: [1, 5] /* Other fields follow… */ } > db.authors.find({ _id: { $in: [1, 5] } }) { _id: 1, first_name: "F. Scott", last_name: "Fitzgerald" } { _id: 5, first_name: "Unknown", last_name: "Co-author" }
  • 82. ARRAY OF IDS ON ONE SIDE Query for all books by a given author: > db.books.find({ authors: 1 }); Query for all authors of a given book: book = db.books.findOne({ title: "The Great Gatsby" }, { authors: 1 }); db.authors.find({ _id: { $in: book.authors } });
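When only the book side stores the author IDs, finding a book's authors takes two steps. The sketch below mimics both queries over in-memory arrays; the sample documents follow the slides, except that the `authors` array for "This Side of Paradise" is an assumption made for illustration, and the function names are hypothetical.

```javascript
// Sample data mirroring the slides; kept in memory for illustration.
const books = [
  { _id: 1, title: "The Great Gatsby", authors: [1, 5] },
  { _id: 3, title: "This Side of Paradise", authors: [1] }, // authors assumed
];
const authors = [
  { _id: 1, first_name: "F. Scott", last_name: "Fitzgerald" },
  { _id: 5, first_name: "Unknown", last_name: "Co-author" },
];

// Counterpart of find({ authors: id }): match when the array contains the value.
function booksByAuthor(authorId) {
  return books.filter((b) => b.authors.includes(authorId));
}

// Counterpart of the two-step lookup: fetch the book, then authors whose _id is in its array.
function authorsOfBook(title) {
  const book = books.find((b) => b.title === title);
  if (!book) return [];
  return authors.filter((a) => book.authors.includes(a._id));
}

console.log(booksByAuthor(1).map((b) => b._id));
console.log(authorsOfBook("The Great Gatsby").map((a) => a.last_name));
```

The second function needs the book document first, which is the extra round trip that the one-sided design trades for not maintaining two arrays.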
  • 83. EXERCISE #5 Tracking time series data. Graph recommendations per unit of time. Count by: day, hour, minute.
  • 84. EXERCISE #5: SOLUTION A db.rec_ts (time series buckets, hour and minute sub-docs): > db.rec_ts.insert({ book: ObjectId("…"), day: ISODate("2012-10-11T00:00:00.000Z"), total: 0, hour: { "0": 0, "1": 0, /* … */ "23": 0 }, minute: { "0": 0, "1": 0, /* … */ "1439": 0 } }); Record a recommendation created one minute before midnight: > db.rec_ts.update({ book: ObjectId("…"), day: ISODate("2012-10-11T00:00:00.000Z") }, { $inc: { total: 1, "hour.23": 1, "minute.1439": 1 } });
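The field names in the `$inc` document depend on when the recommendation was made. As a sketch, assuming UTC day buckets and a hypothetical helper named `buildFlatIncrement`, the selector and update for Solution A could be derived from a timestamp like this:

```javascript
// Hypothetical helper (not from the slides): builds the bucket selector and
// $inc document for Solution A from a timestamp, using UTC throughout.
function buildFlatIncrement(bookId, when) {
  // Midnight UTC of the same day identifies the daily bucket.
  const day = new Date(Date.UTC(when.getUTCFullYear(), when.getUTCMonth(), when.getUTCDate()));
  const hour = when.getUTCHours();
  const minuteOfDay = hour * 60 + when.getUTCMinutes();
  return {
    query: { book: bookId, day: day },
    update: { $inc: { total: 1, ["hour." + hour]: 1, ["minute." + minuteOfDay]: 1 } },
  };
}

const op = buildFlatIncrement("book1", new Date("2012-10-11T23:59:30.000Z"));
console.log(Object.keys(op.update.$inc)); // keys: total, hour.23, minute.1439
```

The `"book1"` identifier is a placeholder for the elided ObjectId on the slide.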
  • 85. BSON STORAGE Sequence of key/value pairs. Not a hash map. Optimized to scan quickly. minute: [0] [1] … [1439] What is the cost of updating the minute before midnight?
  • 86. BSON STORAGE We can skip sub-documents. hour0: [0] [1] … [59] … hour23: [1380] … [1439] How could this change the schema?
  • 87. EXERCISE #5: SOLUTION B db.rec_ts (time series buckets, each hour a sub-doc): > db.rec_ts.insert({ book: ObjectId("…"), day: ISODate("2012-10-11T00:00:00.000Z"), total: 148, hour: { "0": { total: 7, "0": 0, /* … */ "59": 2 }, "1": { total: 3, "60": 1, /* … */ "119": 0 }, /* Other hours… */ "23": { total: 12, "1380": 0, /* … */ "1439": 3 } } }); Record a recommendation created one minute before midnight: > db.rec_ts.update({ book: ObjectId("…"), day: ISODate("2012-10-11T00:00:00.000Z") }, { $inc: { total: 1, "hour.23.total": 1, "hour.23.1439": 1 } });
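The difference between the two layouts can be estimated with a deliberately simplified model: treat each sub-document as an ordered list of keys and count how many keys are visited before the target is reached. This sketch is not a measurement of BSON itself, only of that model; the function name is hypothetical.

```javascript
// Counts how many keys are visited before `target` when keys are scanned in order.
function keysScanned(orderedKeys, target) {
  const position = orderedKeys.indexOf(target);
  return position === -1 ? orderedKeys.length : position + 1;
}

const minuteOfDay = 1439; // the minute before midnight
const hour = Math.floor(minuteOfDay / 60);

// Solution A: one flat sub-document with 1440 minute keys.
const flatKeys = Array.from({ length: 1440 }, (_, i) => String(i));
const flatCost = keysScanned(flatKeys, String(minuteOfDay));

// Solution B: 24 hour keys, then "total" plus 60 minute keys inside the hour.
const hourKeys = Array.from({ length: 24 }, (_, i) => String(i));
const minuteKeys = ["total"].concat(
  Array.from({ length: 60 }, (_, i) => String(hour * 60 + i))
);
const nestedCost =
  keysScanned(hourKeys, String(hour)) + keysScanned(minuteKeys, String(minuteOfDay));

console.log(flatCost, nestedCost);
```

Under this model the worst case drops from 1440 visited keys to 85, because whole hour sub-documents are skipped rather than walked.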
  • 88. SINGLE-COLLECTION INHERITANCE Take advantage of MongoDB's features. Documents need not all have the same fields. Sparsely index only present fields.
  • 89. SCHEMA FLEXIBILITY > db.books.findOne() { _id: 47, title: "The Wizard Chase", type: "series", series_title: "The Wizard's Trilogy", volume: 2 /* Other fields follow… */ } Find all books that are part of a series: > db.books.find({ type: "series" }); > db.books.find({ series_title: { $exists: true } }); > db.books.find({ volume: { $gt: 0 } });
  • 90. INDEX ONLY PRESENT FIELDS Documents without these fields will not be indexed. > db.books.ensureIndex({ series_title: 1 }, { sparse: true }) > db.books.ensureIndex({ volume: 1 }, { sparse: true })
  • 91. EXERCISE #6 Users can recommend at most 10 books
  • 93. EXERCISE #6: SOLUTION One fewer unassigned recommendation remaining. Newly recommended book is now linked. > db.user_recs.findOne() { _id: "bob", remaining: 7, books: [3, 10, 4] }
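The invariant behind this document shape (a `remaining` counter that goes down as `books` grows) can be sketched in memory. This is an illustration under assumptions, not the deck's update statement: the helper names are hypothetical, and in MongoDB the check and the modification would need to happen in a single atomic update rather than as separate steps.

```javascript
// Hypothetical in-memory version of a user_recs document.
function newUserRecs(userId, limit) {
  return { _id: userId, remaining: limit, books: [] };
}

// Links a book only if the user has recommendations left and has not already picked it.
function recommendBook(userRecs, bookId) {
  if (userRecs.remaining <= 0 || userRecs.books.includes(bookId)) {
    return false;
  }
  userRecs.remaining -= 1;
  userRecs.books.push(bookId);
  return true;
}

const bob = newUserRecs("bob", 10);
[3, 10, 4].forEach((bookId) => recommendBook(bob, bookId));
console.log(bob.remaining, bob.books);
```

After three recommendations the sketch ends in the same state as the document on the slide: seven remaining and three linked books.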
  • 94. EXERCISE #7 Statistic buckets. Each book has a listing page in our application. Record referring website domains for each book. Count each domain independently.
  • 95. EXERCISE #7: SOLUTION A > db.book_refs.findOne() { book: 1, referrers: [ { domain: "google.com", count: 4 }, { domain: "yahoo.com", count: 1 } ] } > db.book_refs.update({ book: 1, "referrers.domain": "google.com" }, { $inc: { "referrers.$.count": 1 } });
  • 96. EXERCISE #7: SOLUTION A Update the position of the first matched element. > db.book_refs.update({ book: 1, "referrers.domain": "google.com" }, { $inc: { "referrers.$.count": 1 } }); > db.book_refs.findOne() { book: 1, referrers: [ { domain: "google.com", count: 5 }, { domain: "yahoo.com", count: 1 } ] } What if a new referring website is used?
  • 97. EXERCISE #7: SOLUTION B > db.book_refs.findOne() { book: 1, referrers: { "google_com": 5, "yahoo_com": 1 } } > db.book_refs.update({ book: 1 }, { $inc: { "referrers.bing_com": 1 } }, true); Replace dots with underscores for key names. Increment to add a new referring website. Upsert in case this is the book's first referrer.
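Turning a domain into a usable key name is a one-line transformation. A minimal sketch, assuming a hypothetical helper named `buildReferrerIncrement`, that produces the `$inc` document for any referring domain:

```javascript
// Hypothetical helper (not from the slides): builds the $inc document for one domain.
function buildReferrerIncrement(domain) {
  // Dots would be read as path separators in the field name, so swap them out.
  const key = domain.replace(/\./g, "_");
  return { $inc: { ["referrers." + key]: 1 } };
}

console.log(JSON.stringify(buildReferrerIncrement("bing.com")));
// {"$inc":{"referrers.bing_com":1}}
```

One consequence worth keeping in mind is that this mapping is not reversible: a domain that already contains an underscore would share a key with its dotted counterpart.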
  • 99. SHARDING Ad-hoc partitioning. Consistent hashing: Amazon DynamoDB. Range-based partitioning: Google BigTable, Yahoo! PNUTS, MongoDB.
  • 100. SHARDING IN MONGODB Automated management. Range-based partitioning. Convert to sharded system with no downtime. Fully consistent.
  • 102. SHARDING DATA BY CHUNKS > db.books.save({ _id: 35, title: "Call of the Wild" }); > db.books.save({ _id: 40, title: "Tropic of Cancer" }); > db.books.save({ _id: 45, title: "The Jungle" }); > db.books.save({ _id: 50, title: "Of Mice and Men" }); [−∞, +∞) → [−∞, 40) [40, +∞) → [−∞, 40) [40, 50) [50, +∞) Ranges are split into chunks as data is inserted.
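The range notation on this slide can be modeled directly. The following sketch represents chunks as half-open ranges and splits them at explicit points; the split points 40 and 50 are taken from the slide, while the function names and the decision to split by hand are assumptions (how MongoDB itself chooses split points is not covered here).

```javascript
// Each chunk covers [min, max); the whole key space starts as one chunk.
function initialChunks() {
  return [{ min: -Infinity, max: Infinity }];
}

// Splits the chunk containing `point` into [min, point) and [point, max).
function splitAt(chunks, point) {
  return chunks.flatMap((c) =>
    point > c.min && point < c.max
      ? [{ min: c.min, max: point }, { min: point, max: c.max }]
      : [c]
  );
}

// Finds the chunk that owns a given shard key value.
function chunkFor(chunks, key) {
  return chunks.find((c) => key >= c.min && key < c.max);
}

let chunks = initialChunks();
chunks = splitAt(chunks, 40);
chunks = splitAt(chunks, 50);
console.log(chunks.length, chunkFor(chunks, 45));
```

After both splits, the document with `_id: 45` falls in the middle chunk, and `_id: 35` stays in the first.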
  • 103. ADDING NEW SHARDS Shards: shard1. Chunks: [−∞, 40) [40, 50) [50, 60) [60, +∞)
  • 104. ADDING NEW SHARDS > db.runCommand({ addshard: "shard2.example.com" }); Shards: shard1, shard2. Chunks: [−∞, 40) [40, 50) [50, 60) [60, +∞) Chunks are migrated to balance shards.
  • 105. ADDING NEW SHARDS > db.runCommand({ addshard: "shard3.example.com" }); Shards: shard1, shard2, shard3. Chunks: [−∞, 40) [40, 50) [50, 60) [60, +∞)
  • 108. SHARDING COMPONENTS mongos. Config servers. Shards: mongod, replica sets.
  • 109. SHARDED WRITES Inserts: shard key required; routed. Updates and removes: shard key optional; may be routed or scattered.
  • 110. SHARDED READS Queries: by shard key, routed; without shard key, scatter/gather. Sorted queries: by shard key, routed in order; without shard key, distributed merge sort.
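The routed versus scatter/gather distinction comes down to whether the query carries the shard key. A rough sketch of that decision follows, assuming a hypothetical chunk map (the assignment of chunks to shards is invented for illustration) and handling only equality matches on the shard key:

```javascript
// Hypothetical chunk map: each chunk range [min, max) lives on one shard.
const chunkMap = [
  { min: -Infinity, max: 40, shard: "shard1" },
  { min: 40, max: 50, shard: "shard2" },
  { min: 50, max: Infinity, shard: "shard3" },
];

// Returns the shards a query must visit; `shardKey` is the sharded field name.
// Only equality on the shard key is handled in this sketch.
function targetShards(query, shardKey) {
  if (Object.prototype.hasOwnProperty.call(query, shardKey)) {
    const value = query[shardKey];
    const owner = chunkMap.find((c) => value >= c.min && value < c.max);
    return [owner.shard]; // routed to a single shard
  }
  // No shard key in the query: every shard has to be asked.
  return [...new Set(chunkMap.map((c) => c.shard))];
}

console.log(targetShards({ _id: 45 }, "_id"));
console.log(targetShards({ title: "The Jungle" }, "_id"));
```

The first query touches one shard; the second has no shard key and fans out to all three.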
  • 111. EXERCISE #8 Users can upload images for books. images: { image_id: ???, data: binary } The collection will be sharded by image_id. What should image_id be?
  • 112. EXERCISE #8: SOLUTIONS What's the best shard key for our use case? Auto-increment (ObjectId). MD5 of data. Time (e.g. month) and MD5.
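Two of the candidate keys can be generated with Node's built-in `crypto` module. The sketch below is one possible encoding, not the deck's: the function names, the `YYYY-MM` prefix format, and the sample upload date are assumptions. The commonly cited trade-off is that a pure hash spreads inserts evenly but gives up locality, while a time prefix keeps recent images near each other.

```javascript
const crypto = require("crypto");

// Candidate: MD5 of the image data.
function md5Key(data) {
  return crypto.createHash("md5").update(data).digest("hex");
}

// Candidate: month prefix plus MD5, so images from the same month sort together
// while keys within a month are still spread out by the hash.
function monthAndMd5Key(uploadedAt, data) {
  const month = uploadedAt.toISOString().slice(0, 7); // "YYYY-MM" in UTC
  return month + "-" + md5Key(data);
}

console.log(md5Key("hello"));
console.log(monthAndMd5Key(new Date("2012-10-23T12:00:00Z"), "hello"));
```

Because the key is derived from the content, uploading identical image data twice yields the same `image_id` under the pure-MD5 scheme.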
  • 116. SUMMARY Schema design is different in MongoDB. Basic data design principles apply. It's about your application. It's about your data and how it's used. It's about the entire lifetime of your application.