SlideShare ist ein Scribd-Unternehmen logo
1 von 51
Downloaden Sie, um offline zu lesen
Indexing and Query Optimization
                                  Paul Pedersen




Monday, October 15, 12
What’s in store

      • What are indexes?

      • Picking the right indexes.

      • Creating indexes in MongoDB

      • Troubleshooting




Monday, October 15, 12
Indexes are the single biggest
         tunable performance factor
                in MongoDB.




Monday, October 15, 12
Absent or suboptimal indexes are
      the most common avoidable
    MongoDB performance problem.




Monday, October 15, 12
So what problem do indexes solve?




Monday, October 15, 12
Monday, October 15, 12
How do you find a chicken recipe?


      • An unindexed cookbook might be quite a
          page turner.

      • Probably not what you want, though.




Monday, October 15, 12
I know, I’ll use an index!




Monday, October 15, 12
Monday, October 15, 12
Let’s imagine a simple index
                         ingredient         page

                          aardvark           790
                             ...              ...

                            beef      190, 191, 205, ...

                             ...              ...

                          chicken     182, 199, 200, ...

                          chorizo          497, ...

                             ...              ...

                          zucchini      673, 986, ...




Monday, October 15, 12
How do you find a quick chicken recipe?




Monday, October 15, 12
Let’s imagine a compound index
                         ingredient   cooking      page
                                        time
                             ...        ...         ...

                          chicken     15 min     182, 200

                          chicken     25 min       199

                          chicken     30 min    289,316,320

                          chicken     45 min     290, 291,
                                                   354
                             ...        ...         ...




Monday, October 15, 12
Consider the ordering of index keys




  Aardvark, 20 min Chicken, 15 min                Zuchinni, 45 min
                         Chicken, 25 min
                            Chicken, 30 min
                                Chicken, 45 min




Monday, October 15, 12
How about a low-calorie chicken recipe?




Monday, October 15, 12
Let’s imagine a 2nd compound index

                         ingredient   calories    page


                             ...         ...        ...

                          chicken       250      199, 316


                          chicken       300      289,291


                          chicken       425        320


                             ...         ...        ...




Monday, October 15, 12
How about a quick, low-calorie recipe?




Monday, October 15, 12
Let’s imagine a last compound index
                         calories   cooking time      page

                            ...          ...            ...
                           250        25 min           199

                           250        30 min           316

                           300        25 min           289

                           300        45 min           291
                           425        30 min           320

                            ...          ...            ...

          How do you find dishes from 250 to 300 calories that cook from
                               30 to 40 minutes?



Monday, October 15, 12
Consider the ordering of index keys


              250 cal,      250 cal,     300 cal,     300 cal,     425 cal,
              25 min        30 min       25 min       45 min       30 min




          How do you find dishes from 250 to 300 calories that cook from
                               30 to 40 minutes?

                  4 index entries will be scanned, but only 1 will match!




Monday, October 15, 12
Range queries using an index on A, B
      • A is a range 

      • A is constant, B is a range 

      • A is constant, order by B 

      • A is range, B is constant/range   

      • B is constant/range, A unspecified 


Monday, October 15, 12
It’s really that straightforward.




Monday, October 15, 12
B-Trees             (Bayer & McCreight ’72)




Monday, October 15, 12
B-Trees             (Bayer & McCreight ’72)




                                           13




Monday, October 15, 12
B-Trees             (Bayer & McCreight ’72)




                                           13

            Queries, Inserts, Deletes: O(log n)


Monday, October 15, 12
All this is relevant to MongoDB.
      • MongoDB’s indexes are B-Trees, which are
          designed for range queries.

      • Generally, the best index for your queries is
          going to be a compound index.

      • Every additional index slows down inserts &
          removes, and may slow updates.




Monday, October 15, 12
On to MongoDB!




Monday, October 15, 12
Declaring Indexes
      • db.foo.ensureIndex( { username : 1 } )




Monday, October 15, 12
Declaring Indexes
      • db.foo.ensureIndex( { username : 1 } )


      • db.foo.ensureIndex( { username : 1, created_at : -1 } )




Monday, October 15, 12
And managing them....
         > db.system.indexes.find() //db.foo.getIndexes()

           { "v" : 1, "key" : { "_id" : 1 }, "ns" : "test.foo", "name" : "_id_" }
           { "v" : 1, "key" : { "username" : 1 }, "ns" : "test.foo", "name" : "username_1" }




Monday, October 15, 12
And managing them....
         > db.system.indexes.find() //db.foo.getIndexes()

           { "v" : 1, "key" : { "_id" : 1 }, "ns" : "test.foo", "name" : "_id_" }
           { "v" : 1, "key" : { "username" : 1 }, "ns" : "test.foo", "name" : "username_1" }



         > db.foo.dropIndex( { username : 1} )

            { "nIndexesWas" : 2 , "ok" : 1 }




Monday, October 15, 12
Key info about MongoDB’s indexes
      • A collection may have at most 64 indexes.




Monday, October 15, 12
Key info about MongoDB’s indexes
      • A collection may have at most 64 indexes.

      • “_id” index is automatic
                  (except capped collections before 2.2)




Monday, October 15, 12
Key info about MongoDB’s indexes
      • A collection may have at most 64 indexes.

      • “_id” index is automatic
                  (except capped collections before 2.2)


      • All queries can use just 1 index
                  (except $or queries).




Monday, October 15, 12
Key info about MongoDB’s indexes
      • A collection may have at most 64 indexes.

      • “_id” index is automatic
                  (except capped collections before 2.2)


      • All queries can use just 1 index
                  (except $or queries).


      • The maximum index key size is 1024 bytes.



Monday, October 15, 12
Indexes get used where you’d expect
           • db.foo.find({x : 42})
           • db.foo.find({x : {$in : [42,52]}})
           • db.foo.find({x : {$lt : 42})
           • update, findAndModify that select on x,
           • count, distinct,
           • $match in aggregation
           • left-anchored regexp, e.g. /^Kev/




Monday, October 15, 12
But indexes aren’t always helpful
      • Most negations: $not, $nin, $ne


      • Some corner cases: $mod, $where


      • Matching most regular expressions, e.g. /a/
          or /foo/i




Monday, October 15, 12
Advanced Options




Monday, October 15, 12
Arrays: the powerful “multiKey” index
           { title : “Chicken Noodle Soup”,
             ingredients : [“chicken”, “noodles”] }

           > db.foo.ensureIndex( { ingredients : 1 } )

                         ingredients   page
                           chicken      42
                             ...         ...
                          noodles       42
                             ...         ...




Monday, October 15, 12
Unique Indexes
     • db.foo.ensureIndex( { email : 1 } , {unique : true} )



         > db.foo.insert({email : “matulef@10gen.com”})
         > db.foo.insert({email : “matulef@10gen.com”})
            E11000 duplicate key error ...




Monday, October 15, 12
Sparse Indexes
     • db.foo.ensureIndex( { email : 1 } , {sparse : true} )


                  No index entries for docs without “email” field




Monday, October 15, 12
Geospatial Indexes
         { name: "10gen Office",
           lat_long: [ 52.5184, 13.387 ] }

         > db.foo.ensureIndex( { lat_long : “2d” } )

          > db.locations.find( { lat_long: {$near: [52.53, 13.4] } } )




Monday, October 15, 12
Troubleshooting




Monday, October 15, 12
The Query Optimizer
      • For each “type” of query, mongoDB
          periodically tries all useful indexes.

      • Aborts as soon as one plan wins.

      • Winning plan is temporarily cached.




Monday, October 15, 12
Which plan wins? Explain!
      > db.foo.find( { t: { $lt : 40 } } ).explain( )
      {
        "cursor" : "BtreeCursor t_1" ,
        "n" : 42,
        “nscannedObjects: 42
        "nscanned" : 42,
         ...
        "millis" : 0,
         ...
      }




Monday, October 15, 12
Which plan wins? Explain!
      > db.foo.find( { t: { $lt : 40 } } ).explain( )
      {
        "cursor" : "BtreeCursor t_1" ,
        "n" : 42,                          Pay attention to the
        “nscannedObjects: 42
        "nscanned" : 42,                    ratio n/nscanned!
         ...
        "millis" : 0,
         ...
      }




Monday, October 15, 12
Think you know better? Give us a hint
      > db.foo.find( { t: { $lt : 40 } } ).hint( { _id : 1} )




Monday, October 15, 12
Recording slow queries
      > db.setProfilingLevel( n , slowms=100ms )

      n=0 profiler off
      n=1 record queries longer than slowms
      n=2 record all queries

      > db.system.profile.find()




Monday, October 15, 12
Operational Tips




Monday, October 15, 12
Background index builds
         db.foo.ensureIndex( { user : 1 } , { background : true } )

         Caveats:
          • still resource-intensive
          • will build in foreground on secondaries




Monday, October 15, 12
Minimizing impact on Replica Sets
          for (s in secondaries)
              s.restartAsStandalone()
              s.buildIndex()
              s.restartAsReplSetMember()
              s.waitForCatchup()

          p.stepDown()
          p.restartAsStandalone()
          p.buildIndex()
          p.restartAsReplSetMember()




Monday, October 15, 12
Absent or suboptimal indexes are
     the most common avoidable
   MongoDB performance problem...

    ...so take some time and get your
               indexes right!


Monday, October 15, 12
Thanks!




Monday, October 15, 12

Weitere ähnliche Inhalte

Andere mochten auch

Query Execution Time and Query Optimization.
Query Execution Time and Query Optimization.Query Execution Time and Query Optimization.
Query Execution Time and Query Optimization.Radhe Krishna Rajan
 
Indexing and Query Optimization
Indexing and Query OptimizationIndexing and Query Optimization
Indexing and Query OptimizationMongoDB
 
CS 542 -- Query Optimization
CS 542 -- Query OptimizationCS 542 -- Query Optimization
CS 542 -- Query OptimizationJ Singh
 
Hashing and Hash Tables
Hashing and Hash TablesHashing and Hash Tables
Hashing and Hash Tablesadil raja
 
Concept of hashing
Concept of hashingConcept of hashing
Concept of hashingRafi Dar
 
Query Optimization
Query OptimizationQuery Optimization
Query Optimizationrohitsalunke
 
12. Indexing and Hashing in DBMS
12. Indexing and Hashing in DBMS12. Indexing and Hashing in DBMS
12. Indexing and Hashing in DBMSkoolkampus
 
SQL: Query optimization in practice
SQL: Query optimization in practiceSQL: Query optimization in practice
SQL: Query optimization in practiceJano Suchal
 
Indexing and hashing
Indexing and hashingIndexing and hashing
Indexing and hashingJeet Poria
 
Query optimization
Query optimizationQuery optimization
Query optimizationdixitdavey
 
SQL Joins and Query Optimization
SQL Joins and Query OptimizationSQL Joins and Query Optimization
SQL Joins and Query OptimizationBrian Gallagher
 

Andere mochten auch (14)

Query Execution Time and Query Optimization.
Query Execution Time and Query Optimization.Query Execution Time and Query Optimization.
Query Execution Time and Query Optimization.
 
Indexing and Query Optimization
Indexing and Query OptimizationIndexing and Query Optimization
Indexing and Query Optimization
 
Hash Function
Hash FunctionHash Function
Hash Function
 
CS 542 -- Query Optimization
CS 542 -- Query OptimizationCS 542 -- Query Optimization
CS 542 -- Query Optimization
 
Hash tables
Hash tablesHash tables
Hash tables
 
Hashing and Hash Tables
Hashing and Hash TablesHashing and Hash Tables
Hashing and Hash Tables
 
Concept of hashing
Concept of hashingConcept of hashing
Concept of hashing
 
Query Optimization
Query OptimizationQuery Optimization
Query Optimization
 
12. Indexing and Hashing in DBMS
12. Indexing and Hashing in DBMS12. Indexing and Hashing in DBMS
12. Indexing and Hashing in DBMS
 
SQL: Query optimization in practice
SQL: Query optimization in practiceSQL: Query optimization in practice
SQL: Query optimization in practice
 
Indexing and hashing
Indexing and hashingIndexing and hashing
Indexing and hashing
 
Query optimization
Query optimizationQuery optimization
Query optimization
 
SQL Joins and Query Optimization
SQL Joins and Query OptimizationSQL Joins and Query Optimization
SQL Joins and Query Optimization
 
Hashing
HashingHashing
Hashing
 

Mehr von Insight Technology, Inc.

グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?Insight Technology, Inc.
 
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~Insight Technology, Inc.
 
事例を通じて機械学習とは何かを説明する
事例を通じて機械学習とは何かを説明する事例を通じて機械学習とは何かを説明する
事例を通じて機械学習とは何かを説明するInsight Technology, Inc.
 
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーンInsight Technology, Inc.
 
MBAAで覚えるDBREの大事なおしごと
MBAAで覚えるDBREの大事なおしごとMBAAで覚えるDBREの大事なおしごと
MBAAで覚えるDBREの大事なおしごとInsight Technology, Inc.
 
グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?Insight Technology, Inc.
 
DBREから始めるデータベースプラットフォーム
DBREから始めるデータベースプラットフォームDBREから始めるデータベースプラットフォーム
DBREから始めるデータベースプラットフォームInsight Technology, Inc.
 
SQL Server エンジニアのためのコンテナ入門
SQL Server エンジニアのためのコンテナ入門SQL Server エンジニアのためのコンテナ入門
SQL Server エンジニアのためのコンテナ入門Insight Technology, Inc.
 
db tech showcase2019オープニングセッション @ 森田 俊哉
db tech showcase2019オープニングセッション @ 森田 俊哉 db tech showcase2019オープニングセッション @ 森田 俊哉
db tech showcase2019オープニングセッション @ 森田 俊哉 Insight Technology, Inc.
 
db tech showcase2019 オープニングセッション @ 石川 雅也
db tech showcase2019 オープニングセッション @ 石川 雅也db tech showcase2019 オープニングセッション @ 石川 雅也
db tech showcase2019 オープニングセッション @ 石川 雅也Insight Technology, Inc.
 
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー Insight Technology, Inc.
 
難しいアプリケーション移行、手軽に試してみませんか?
難しいアプリケーション移行、手軽に試してみませんか?難しいアプリケーション移行、手軽に試してみませんか?
難しいアプリケーション移行、手軽に試してみませんか?Insight Technology, Inc.
 
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介Attunityのソリューションと異種データベース・クラウド移行事例のご紹介
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介Insight Technology, Inc.
 
そのデータベース、クラウドで使ってみませんか?
そのデータベース、クラウドで使ってみませんか?そのデータベース、クラウドで使ってみませんか?
そのデータベース、クラウドで使ってみませんか?Insight Technology, Inc.
 
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...Insight Technology, Inc.
 
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。 複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。 Insight Technology, Inc.
 
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...Insight Technology, Inc.
 
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]Insight Technology, Inc.
 

Mehr von Insight Technology, Inc. (20)

グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?
 
Docker and the Oracle Database
Docker and the Oracle DatabaseDocker and the Oracle Database
Docker and the Oracle Database
 
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
 
事例を通じて機械学習とは何かを説明する
事例を通じて機械学習とは何かを説明する事例を通じて機械学習とは何かを説明する
事例を通じて機械学習とは何かを説明する
 
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン
 
MBAAで覚えるDBREの大事なおしごと
MBAAで覚えるDBREの大事なおしごとMBAAで覚えるDBREの大事なおしごと
MBAAで覚えるDBREの大事なおしごと
 
グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?
 
DBREから始めるデータベースプラットフォーム
DBREから始めるデータベースプラットフォームDBREから始めるデータベースプラットフォーム
DBREから始めるデータベースプラットフォーム
 
SQL Server エンジニアのためのコンテナ入門
SQL Server エンジニアのためのコンテナ入門SQL Server エンジニアのためのコンテナ入門
SQL Server エンジニアのためのコンテナ入門
 
Lunch & Learn, AWS NoSQL Services
Lunch & Learn, AWS NoSQL ServicesLunch & Learn, AWS NoSQL Services
Lunch & Learn, AWS NoSQL Services
 
db tech showcase2019オープニングセッション @ 森田 俊哉
db tech showcase2019オープニングセッション @ 森田 俊哉 db tech showcase2019オープニングセッション @ 森田 俊哉
db tech showcase2019オープニングセッション @ 森田 俊哉
 
db tech showcase2019 オープニングセッション @ 石川 雅也
db tech showcase2019 オープニングセッション @ 石川 雅也db tech showcase2019 オープニングセッション @ 石川 雅也
db tech showcase2019 オープニングセッション @ 石川 雅也
 
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー
 
難しいアプリケーション移行、手軽に試してみませんか?
難しいアプリケーション移行、手軽に試してみませんか?難しいアプリケーション移行、手軽に試してみませんか?
難しいアプリケーション移行、手軽に試してみませんか?
 
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介Attunityのソリューションと異種データベース・クラウド移行事例のご紹介
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介
 
そのデータベース、クラウドで使ってみませんか?
そのデータベース、クラウドで使ってみませんか?そのデータベース、クラウドで使ってみませんか?
そのデータベース、クラウドで使ってみませんか?
 
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...
 
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。 複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。
 
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...
 
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]
 

Kürzlich hochgeladen

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 

Kürzlich hochgeladen (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 

A17 indexing and query optimization by paul pederson

  • 1. Indexing and Query Optimization Paul Pedersen Monday, October 15, 12
  • 2. What’s in store • What are indexes? • Picking the right indexes. • Creating indexes in MongoDB • Troubleshooting Monday, October 15, 12
  • 3. Indexes are the single biggest tunable performance factor in MongoDB. Monday, October 15, 12
  • 4. Absent or suboptimal indexes are the most common avoidable MongoDB performance problem. Monday, October 15, 12
  • 5. So what problem do indexes solve? Monday, October 15, 12
  • 7. How do you find a chicken recipe? • An unindexed cookbook might be quite a page turner. • Probably not what you want, though. Monday, October 15, 12
  • 8. I know, I’ll use an index! Monday, October 15, 12
  • 10. Let’s imagine a simple index ingredient page aardvark 790 ... ... beef 190, 191, 205, ... ... ... chicken 182, 199, 200, ... chorizo 497, ... ... ... zucchini 673, 986, ... Monday, October 15, 12
  • 11. How do you find a quick chicken recipe? Monday, October 15, 12
  • 12. Let’s imagine a compound index ingredient cooking page time ... ... ... chicken 15 min 182, 200 chicken 25 min 199 chicken 30 min 289,316,320 chicken 45 min 290, 291, 354 ... ... ... Monday, October 15, 12
  • 13. Consider the ordering of index keys Aardvark, 20 min Chicken, 15 min Zuchinni, 45 min Chicken, 25 min Chicken, 30 min Chicken, 45 min Monday, October 15, 12
  • 14. How about a low-calorie chicken recipe? Monday, October 15, 12
  • 15. Let’s imagine a 2nd compound index ingredient calories page ... ... ... chicken 250 199, 316 chicken 300 289,291 chicken 425 320 ... ... ... Monday, October 15, 12
  • 16. How about a quick, low-calorie recipe? Monday, October 15, 12
  • 17. Let’s imagine a last compound index calories cooking time page ... ... ... 250 25 min 199 250 30 min 316 300 25 min 289 300 45 min 291 425 30 min 320 ... ... ... How do you find dishes from 250 to 300 calories that cook from 30 to 40 minutes? Monday, October 15, 12
  • 18. Consider the ordering of index keys 250 cal, 250 cal, 300 cal, 300 cal, 425 cal, 25 min 30 min 25 min 45 min 30 min How do you find dishes from 250 to 300 calories that cook from 30 to 40 minutes? 4 index entries will be scanned, but only 1 will match! Monday, October 15, 12
  • 19. Range queries using an index on A, B • A is a range  • A is constant, B is a range  • A is constant, order by B  • A is range, B is constant/range  • B is constant/range, A unspecified  Monday, October 15, 12
  • 20. It’s really that straightforward. Monday, October 15, 12
  • 21. B-Trees (Bayer & McCreight ’72) Monday, October 15, 12
  • 22. B-Trees (Bayer & McCreight ’72) 13 Monday, October 15, 12
  • 23. B-Trees (Bayer & McCreight ’72) 13 Queries, Inserts, Deletes: O(log n) Monday, October 15, 12
  • 24. All this is relevant to MongoDB. • MongoDB’s indexes are B-Trees, which are designed for range queries. • Generally, the best index for your queries is going to be a compound index. • Every additional index slows down inserts & removes, and may slow updates. Monday, October 15, 12
  • 25. On to MongoDB! Monday, October 15, 12
  • 26. Declaring Indexes • db.foo.ensureIndex( { username : 1 } ) Monday, October 15, 12
  • 27. Declaring Indexes • db.foo.ensureIndex( { username : 1 } ) • db.foo.ensureIndex( { username : 1, created_at : -1 } ) Monday, October 15, 12
  • 28. And managing them.... > db.system.indexes.find() //db.foo.getIndexes() { "v" : 1, "key" : { "_id" : 1 }, "ns" : "test.foo", "name" : "_id_" } { "v" : 1, "key" : { "username" : 1 }, "ns" : "test.foo", "name" : "username_1" } Monday, October 15, 12
  • 29. And managing them.... > db.system.indexes.find() //db.foo.getIndexes() { "v" : 1, "key" : { "_id" : 1 }, "ns" : "test.foo", "name" : "_id_" } { "v" : 1, "key" : { "username" : 1 }, "ns" : "test.foo", "name" : "username_1" } > db.foo.dropIndex( { username : 1} ) { "nIndexesWas" : 2 , "ok" : 1 } Monday, October 15, 12
  • 30. Key info about MongoDB’s indexes • A collection may have at most 64 indexes. Monday, October 15, 12
  • 31. Key info about MongoDB’s indexes • A collection may have at most 64 indexes. • “_id” index is automatic (except capped collections before 2.2) Monday, October 15, 12
  • 32. Key info about MongoDB’s indexes • A collection may have at most 64 indexes. • “_id” index is automatic (except capped collections before 2.2) • All queries can use just 1 index (except $or queries). Monday, October 15, 12
  • 33. Key info about MongoDB’s indexes • A collection may have at most 64 indexes. • “_id” index is automatic (except capped collections before 2.2) • All queries can use just 1 index (except $or queries). • The maximum index key size is 1024 bytes. Monday, October 15, 12
  • 34. Indexes get used where you’d expect • db.foo.find({x : 42}) • db.foo.find({x : {$in : [42,52]}}) • db.foo.find({x : {$lt : 42}) • update, findAndModify that select on x, • count, distinct, • $match in aggregation • left-anchored regexp, e.g. /^Kev/ Monday, October 15, 12
  • 35. But indexes aren’t always helpful • Most negations: $not, $nin, $ne • Some corner cases: $mod, $where • Matching most regular expressions, e.g. /a/ or /foo/i Monday, October 15, 12
  • 37. Arrays: the powerful “multiKey” index { title : “Chicken Noodle Soup”, ingredients : [“chicken”, “noodles”] } > db.foo.ensureIndex( { ingredients : 1 } ) ingredients page chicken 42 ... ... noodles 42 ... ... Monday, October 15, 12
  • 38. Unique Indexes • db.foo.ensureIndex( { email : 1 } , {unique : true} ) > db.foo.insert({email : “matulef@10gen.com”}) > db.foo.insert({email : “matulef@10gen.com”}) E11000 duplicate key error ... Monday, October 15, 12
  • 39. Sparse Indexes • db.foo.ensureIndex( { email : 1 } , {sparse : true} ) No index entries for docs without “email” field Monday, October 15, 12
  • 40. Geospatial Indexes { name: "10gen Office", lat_long: [ 52.5184, 13.387 ] } > db.foo.ensureIndex( { lat_long : “2d” } ) > db.locations.find( { lat_long: {$near: [52.53, 13.4] } } ) Monday, October 15, 12
  • 42. The Query Optimizer • For each “type” of query, mongoDB periodically tries all useful indexes. • Aborts as soon as one plan wins. • Winning plan is temporarily cached. Monday, October 15, 12
  • 43. Which plan wins? Explain! > db.foo.find( { t: { $lt : 40 } } ).explain( ) { "cursor" : "BtreeCursor t_1" , "n" : 42, “nscannedObjects: 42 "nscanned" : 42, ... "millis" : 0, ... } Monday, October 15, 12
  • 44. Which plan wins? Explain! > db.foo.find( { t: { $lt : 40 } } ).explain( ) { "cursor" : "BtreeCursor t_1" , "n" : 42, Pay attention to the “nscannedObjects: 42 "nscanned" : 42, ratio n/nscanned! ... "millis" : 0, ... } Monday, October 15, 12
  • 45. Think you know better? Give us a hint > db.foo.find( { t: { $lt : 40 } } ).hint( { _id : 1} ) Monday, October 15, 12
  • 46. Recording slow queries > db.setProfilingLevel( n , slowms=100ms ) n=0 profiler off n=1 record queries longer than slowms n=2 record all queries > db.system.profile.find() Monday, October 15, 12
  • 48. Background index builds db.foo.ensureIndex( { user : 1 } , { background : true } ) Caveats: • still resource-intensive • will build in foreground on secondaries Monday, October 15, 12
  • 49. Minimizing impact on Replica Sets for (s in secondaries) s.restartAsStandalone() s.buildIndex() s.restartAsReplSetMember() s.waitForCatchup() p.stepDown() p.restartAsStandalone() p.buildIndex() p.restartAsReplSetMember() Monday, October 15, 12
  • 50. Absent or suboptimal indexes are the most common avoidable MongoDB performance problem... ...so take some time and get your indexes right! Monday, October 15, 12