Diese Präsentation wurde erfolgreich gemeldet.
Die SlideShare-Präsentation wird heruntergeladen. ×

Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Anzeige
Anzeige
Anzeige
Anzeige
Anzeige
Anzeige
Anzeige
Anzeige
Anzeige
Anzeige
Anzeige
Anzeige
Wird geladen in …3
×

Hier ansehen

1 von 50 Anzeige

Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Herunterladen, um offline zu lesen

This is the fourth webinar of a Back to Basics series that will introduce you to the MongoDB database. This webinar will introduce you to the aggregation framework.

This is the fourth webinar of a Back to Basics series that will introduce you to the MongoDB database. This webinar will introduce you to the aggregation framework.

Anzeige
Anzeige

Weitere Verwandte Inhalte

Diashows für Sie (20)

Anzeige

Ähnlich wie Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes (20)

Weitere von MongoDB (20)

Anzeige

Aktuellste (20)

Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

  1. 1. MongoDBEurope2016 Old Billingsgate, London 15th November Use my code JD20 for 20% off tickets mongodb.com/europe
  2. 2. Back to Basics 2016 : Webinar 4 Advanced Indexing – Text and Geospatial Indexes Joe Drumgoole Director of Developer Advocacy, EMEA @jdrumgoole V1.1
  3. 3. 3 Recap • Webinar 1 – Introduction to NoSQL – The different types of NoSQL databases – What kind of database is MongoDB? A document database. • Webinar 2 – My First Application – Creating databases and collections – CRUD operations – Indexes and Explain • Webinar 3 – Schema Design – Dynamic schema – Embedding approaches – Examples
  4. 4. 4 Indexing • An efficient way to look up data by its value • Avoids table scans 1 2 3 4 5 6 7
  5. 5. 5 Traditional Databases Use Btrees • … and so does MongoDB
  6. 6. 6 Queries, Inserts, Deletes O(Log(n) Time
  7. 7. 7 Creating a Simple Index db.coll.createIndex( { fieldName : <Direction> } ) Database Name Collection Name Command Field Name to be indexed Ascending : 1 Descending : -1
  8. 8. 8 Two Other Kinds of Indexes • Full Text Index – Allows searching inside the text of a field ( Lucene, Solr and Elastic Search) • Geospatial Index – Allows searching by location (e.g. people near me) • These indexes do not use Btrees
  9. 9. 9 Full Text Indexes • An “inverted index” on all the words inside a single field (only one text index per collection) { “comment” : “I think your blog post is very interesting and informative. I hope you will post more info like this in the future” } >> db.posts.createIndex( { “comments” : “text” } ) MongoDB Enterprise > db.posts.find( { $text: { $search : "info" }} ) { "_id" : ObjectId(“…"), "comment" : "I think your blog post is very interesting and informative. I hope you will post more info like this in the future" } MongoDB Enterprise >
  10. 10. 10 Results MongoDB Enterprise > db.posts.getIndexes() ... { "v" : 1, "key" : { "_fts" : "text", "_ftsx" : 1 }, "name" : "comment_text", "ns" : "test.posts", "weights" : { "comment" : 1 }, "default_language" : "english", "language_override" : "language", "textIndexVersion" : 3 }
  11. 11. 11 Dropping Text Indexes • We drop text indexes by name rather than shape db.posts.getIndexes() { "v" : 1, "key" : { "_fts" : "text", "_ftsx" : 1 }, "name" : "comment_text_text", "ns" : "test.posts", "weights" : { "comment" : 5, "tags" : 10 }, "default_language" : "english", "language_override" : "language", "textIndexVersion" : 3 }
  12. 12. 12 Hence MongoDB Enterprise > db.posts.dropIndex( "comment_text_tags_text" ) { "nIndexesWas" : 2, "ok" : 1 } MongoDB Enterprise > • You can give an index an explict name to make this easier MongoDB Enterprise > db.posts.createIndex( { "comments" : "text", "tags" : "text" }, { "name" : "text_index" } ) { "createdCollectionAutomatically" : false, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "ok" : 1 }
  13. 13. 13 On The Server I INDEX [conn275] build index on: test.posts properties: { v: 1, key: { _fts: "text", _ftsx: 1 }, name: "comment_text", ns: "test.posts", weights: { comment: 1 }, default_language: "english", language_override: "language", textIndexVersion: 3 }} I INDEX [conn275] building index using bulk method I INDEX [conn275] build index done. scanned 3 total records. 0 secs
  14. 14. 14 More Detailed Example >> db.posts.insert( { "comment" : "Red yellow orange green" } ) >> db.posts.insert( { "comment" : "Pink purple blue" } ) >> db.posts.insert( { "comment" : "Red Pink" } ) >> db.posts.find( { "$text" : { "$search" : "Red" }} ) { "_id" : ObjectId(“…”), "comment" : "Red yellow orange green" } { "_id" : ObjectId( »…"), "comment" : "Red Pink" } >> db.posts.find( { "$text" : { "$search" : "Red Green" }} ) { "_id" : ObjectId(« …"), "comment" : "Red Pink" } { "_id" : ObjectId(« …"), "comment" : "Red yellow orange green" } >> db.posts.find( { "$text" : { "$search" : "red" }} ) # <- Case Insensitve { "_id" : ObjectId(“…"), "comment" : "Red yellow orange green" } { "_id" : ObjectId(«…”), "comment" : "Red Pink" } >>
  15. 15. 15 Using Weights • We can assign different weights to different fields in the text index • E.g. I want to favour tags over comments in searching • So I increase the weight for the the tags field >> db.blog.createIndex( { comment: "text", tags : "text” }, { weights: { comment: 5, tags : 10 }} ) • Now searches will favour tags
  16. 16. 16 $textscore • Weights impact $textscore: >> db.posts.find( { "$text" : { "$search" : "Red" }}, { score: { $meta: "textScore" }} ).sort( { score: { $meta: "textScore" } } ) { "_id" : …, "comment" : "hello", "tags" : "Red green orange", "score" : 6.666666666666666 } { "_id" : …, "comment" : "Red Pink", "score" : 3.75 } { "_id" : …, "comment" : "Red yellow orange green", "score" : 3.125 } >>
  17. 17. 17 Other Parameters • Language : Pick the language you want to search in e.g. – $language : Spanish • Support case sensitive searching – $caseSensitive : True (default false) • Support accented characters (diacritic sensitive search e.g. café is distinguished from cafe ) – $diacriticSensitive : True (default false)
  18. 18. Geospatial Indexes
  19. 19. 19 Geospatial Indexes • MongoDB supports 2D Sphere indexes • Allows a user to represent location on the earth (which is a sphere) • Coordinates are stored in GeoJSON format • The Geospatial index supports subset of the GeoJSON operations • The index is based on a QuadTree representation • Index is based on WGS 84 standard
  20. 20. 20 Coordinates • Coordinates are represented as longitude, latitude • longitude – Measured from Greenwich meridian in London (0 degrees) locations east (up to 180 degrees) – For locations west we specify as negative • Latitude – Measured from equator north and south (0 to 90 north, 0 to -90 south) • Coordinates in MongoDB are stored on Longitude/Latitude order • Coordinates in Google are stored in Latitude/Longitude order
  21. 21. 21 2DSphere Versions • Three versions of 2dSphere index in MongoDB • Version 1 : Up to MongoDB 2.4 • Version 2 : From MongoDB 2.6 onwards • Version 3 : From MongoDB 3.2 onwards • We will only be talking about Version 3 in this webinar
  22. 22. 22 Creating a 2dSphere Index db.collection.createIndex ( { <location field> : "2dsphere" } ) • Location field must be coordinate or GeoJSON data
  23. 23. 23 Example >> db.test.createIndex( { loc : "2dsphere" } ) { "createdCollectionAutomatically" : false, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "ok" : 1 }
  24. 24. 24 Output >> db.test.getIndexes() [ { "v" : 1, "key" : { "loc" : "2dsphere" }, "name" : "loc_2dsphere", "ns" : "geo.test", "2dsphereIndexVersion" : 3 } ] >>
  25. 25. 25 Use a Simple Dataset to investigate Geo Queries • Lets search for restaurants in Manhattan • Using two candidate collections – https://raw.githubusercontent.com/mongodb/docs-assets/geospatial/neighborhoods.json – https://raw.githubusercontent.com/mongodb/docs-assets/geospatial/restaurants.json • Import them into MongoDB – mongoimport –c neighborhoods –d geo neighborhoods.json – mongoimport –c restaurants –d geo restaurants.json
  26. 26. 26 Neighborhood Document MongoDB Enterprise > db.neighborhoods.findOne() { "_id" : ObjectId("55cb9c666c522cafdb053a1a"), "geometry" : { "coordinates" : [ [ [ -73.94193078816193, 40.70072523469547 ], ... [ -73.94409591260093, 40.69897295461309 ], ] "type" : "Polygon" }, "name" : "Bedford" }
  27. 27. 27 Restaurant Document MongoDB Enterprise > db.restaurants.findOne() { "_id" : ObjectId("55cba2476c522cafdb053adf"), "location" : { "coordinates" : [ -73.98241999999999, 40.579505 ], "type" : "Point" }, "name" : "Riviera Caterer" } MongoDB Enterprise > You can type this into google maps but remember to reverse the coordinate order
  28. 28. 28 Add Indexes MongoDB Enterprise > db.restaurants.createIndex({ location: "2dsphere" }) { "createdCollectionAutomatically" : false, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "ok" : 1 } MongoDB Enterprise > db.neighborhoods.createIndex({ geometry: "2dsphere" }) { "createdCollectionAutomatically" : false, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "ok" : 1 } MongoDB Enterprise >
  29. 29. 29 Use $geoIntersects to find our Neighborhood • Assume we are at -73.93414657, 40.82302903 • What neighborhood are we in? Use $geoIntersects db.neighborhoods.findOne({ geometry: { $geoIntersects: { $geometry: { type: "Point", coordinates: [ -73.93414657, 40.82302903 ]}}}})
  30. 30. 30 Results { "geometry" : { ”coordinates" : [ [ -73.9338307684026, 40.81959665747723 ], ... [ -73.93383000695911, 40.81949109558767 ] ] "type" : "Polygon" }, "name" : "Central Harlem North-Polo Grounds" }
  31. 31. 31 Find All Restaurants within 0.35 km db.restaurants.find({ location: { $geoWithin: { $centerSphere: [ [ -73.93414657, 40.82302903 ], 5 / 6,378.1 ] } } }) Distance in km Divide by radius of earth to convert to radians
  32. 32. 32 Results – (Projected) { "name" : "Gotham Stadium Tennis Center Cafe" } { "name" : "Chuck E. Cheese'S" } { "name" : "Red Star Chinese Restaurant" } { "name" : "Tia Melli'S Latin Kitchen" } { "name" : "Domino'S Pizza" } • Without projection { "_id" : ObjectId("55cba2476c522cafdb0550aa"), "location" : { "coordinates" : [ -73.93795159999999, 40.823376 ], "type" : "Point" }, "name" : "Domino'S Pizza" }
  33. 33. 33 Summary of Operators • $geoIntersect: Find areas or points that overlap or are adjacent • $geoWithin: Find areas on points that lie within a specific area • $geoNear: Returns locations in order from nearest to furthest away
  34. 34. 34 Summary • Text Indexes : Full text searching of all the text items in a collection • Geospatial Indexes : Search by location, by intersection or by distance from a point
  35. 35. 35 Q & A
  36. 36. 37 • This is slide content
  37. 37. 41
  38. 38. 42
  39. 39. LOREM IPSUM LOREM IPSUM LOREM IPSUM LOREM IPSUM Sollicitudin VenenatisLOREM IPSUM LOREM IPSUM LOREM IPSUM LOREM IPSUM Graphic Element Examples
  40. 40. Porta Ultricies Commodo Porta Graph Examples 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 Category 1 Category 2 Category 3 Category 4 Series 1 Series 2
  41. 41. 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 Category 1 Category 2 Category 3 Category 4 Series 1 Series 2
  42. 42. { _id : ObjectId("4c4ba5e5e8aabf3"), employee_name: "Dunham, Justin", department : "Marketing", title : "Product Manager, Web", report_up: "Neray, Graham", pay_band: “C", benefits : [ { type : "Health", plan : "PPO Plus" }, { type : "Dental", plan : "Standard" } ] } Code/Highlight Example
  43. 43. Aggregation Framework Agility Backup Big Data Briefcase Buildings Business Intelligence Camera Cash Register Catalog Chat Checkmark Checkmark Cloud Commercial Contract Computer Content Continuous Development Credit Card Customer Success
  44. 44. Data Center Data Variety Data Velocity Data Volume Data Warehouse Database Dialogue Directory Documents Downloads Drivers Dynamic Schema EDW Integration Faster Time to Market File Transfer Flexible Gear Hadoop Health Check High Availability Horizontal Scaling Integrating into Infrastructure Internet of Things Iterative Development
  45. 45. Life Preserver Line Graph Lock Log Data Lower Cost Magnifying Glass Man Mobile Phone Meter Monitoring Music New Apps New Data Types Online Open Source Parachute Personalization Pin Platform Certification Product Catalog Puzzle Pieces RDBMS Realtime Analytics Rich Querying
  46. 46. Life Preserver RSS Scalability Scale Secondary Indexing Steering Wheel Stopwatch Text Search Tick Data Training Transmission Tower Trophy Woman World

Hinweis der Redaktion

  • Who I am, how long have I been at MongoDB.
  • Each item in a Btree node points to a sub-tree containing elements below its key value. Insertions require a read before a write. Writes that split nodes are expensive.
  • Effectively the depth of the tree.
  • Production release numbering.
  • Visit Map to show location.
  • Show Riviera on Google Maps.

×