45. #2 find by location
Businesses in San Francisco, CA
48. // find all within state
db.businesses.find({
"location.state._id": ObjectId("4ce82937961552247900000f")
})
// find all within city
db.businesses.find({
"location.city._id": ObjectId("4ce82aa0d3dfaa10f8004a95")
})
// find all within zip
db.businesses.find({
"location.zip._id": ObjectId("4ce82b5ed3dfaa116b0026f0")
})
find businesses by state/city/zip
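The three queries above differ only in which embedded location level they match. A minimal sketch of a helper that builds the filter, assuming each business embeds `location.state`, `location.city`, and `location.zip` subdocuments with their own `_id` (the shape the dotted paths imply). `locationFilter` is a hypothetical name, and ids are passed as plain strings so the snippet runs without a MongoDB driver:

```javascript
// Hypothetical helper: builds the filter used in the queries above.
// Assumes businesses embed location.state / location.city / location.zip,
// each with its own _id, as the dotted paths in the slides imply.
const LOCATION_LEVELS = ["state", "city", "zip"];

function locationFilter(level, id) {
  if (!LOCATION_LEVELS.includes(level)) {
    throw new Error("unknown location level: " + level);
  }
  // Computed key yields e.g. {"location.city._id": id}
  return { ["location." + level + "._id"]: id };
}

// In the mongo shell the id would be wrapped in ObjectId(...);
// a plain string stands in for it here.
const cityFilter = locationFilter("city", "4ce82aa0d3dfaa10f8004a95");
console.log(JSON.stringify(cityFilter));
```

In the shell, the result would be passed straight to `db.businesses.find(...)`.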
50. #3 find by category
Businesses in the Auto Repair category
51. // find by category id
db.businesses.find({
"categories._id": ObjectId("4ce82e50d3dfaa16360004f2")
})
// the index
db.businesses.createIndex({
"categories._id":1
})
businesses by category
52. #4 - find by category + location
Businesses in the Plumbing category in Chicago, IL
53. // find by city id and category id
db.businesses.find({
"location.city._id": ObjectId("4ce82aa0d3dfaa10f8004a95"),
"categories._id": ObjectId("4ce82e50d3dfaa16360004f2")
})
businesses by category + city
54. // city id
{"location.city._id":1}
~ or ~
// category id
{"categories._id":1}
answer: both suck; each one narrows the query by only one of the two fields
we need a compound index
which index should we use?
55. db.businesses.createIndex({
"location.city._id" : 1, "categories._id" : 1
})
~ or ~
db.businesses.createIndex({
"categories._id" : 1, "location.city._id" : 1
})
35,000 cities & 2,500 categories
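answer: cities → categories (city is the more selective field, and an index that leads with city can also serve city-only queries)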
create one for zip codes and categories too!
which order?
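The zip code index mentioned above follows the same ordering: location field first, category second. A small sketch of the two key specifications as plain objects; `locationCategoryIndex` is a hypothetical helper name, and the key order of the returned object is what determines the compound index order:

```javascript
// Hypothetical helper: location field first, category second,
// following the slide's "cities → categories" ordering.
function locationCategoryIndex(level) {
  return { ["location." + level + "._id"]: 1, "categories._id": 1 };
}

const cityCategoryIndex = locationCategoryIndex("city");
const zipCategoryIndex = locationCategoryIndex("zip");

// In the mongo shell, each spec would be passed to
// db.businesses.createIndex(...).
console.log(JSON.stringify(cityCategoryIndex));
console.log(JSON.stringify(zipCategoryIndex));
```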
56. {"location.city._id" : 1}
{"location.city._id" : 1, "categories._id" : 1}
answer: yes; the compound index also serves queries on city id alone, so drop the single-field index
db.businesses.dropIndex("location.city._id_1")
don’t we have 2 indexes on city id?
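The reason the single-field index can go is the prefix rule: a compound index can serve queries on its leading fields. A minimal sketch of that check on plain key specifications; `isIndexPrefix` is a hypothetical helper, and it ignores index options such as unique or sparse, which could make a single-field index worth keeping:

```javascript
// Returns true when every key of `candidate` appears, in the same order and
// direction, as the leading keys of `index`.
function isIndexPrefix(candidate, index) {
  const candidateKeys = Object.keys(candidate);
  const indexKeys = Object.keys(index);
  if (candidateKeys.length > indexKeys.length) return false;
  return candidateKeys.every(
    (key, i) => key === indexKeys[i] && candidate[key] === index[key]
  );
}

const cityOnly = { "location.city._id": 1 };
const cityThenCategory = { "location.city._id": 1, "categories._id": 1 };
const categoryThenCity = { "categories._id": 1, "location.city._id": 1 };

console.log(isIndexPrefix(cityOnly, cityThenCategory)); // true: the single-field index is redundant
console.log(isIndexPrefix(cityOnly, categoryThenCity)); // false: city is not the leading field
```

This is also why the order chosen on the previous slide matters: with categories first, the city-only index would still be needed.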
57. #5 - find by keyword
“something awesome” in Boulder, CO
59. me: we’re switching from postgres+solr to mongo
kyle: oh wow, you can replace solr with mongo?
me: with some creativity
kyle: seems like it’d still be hard to get just right
me: it works well
kyle: gotcha
chat with Kyle Banker
64. • xml files listing every unique url (~24M urls)
• 50,000 urls per file, about 500 files
• urls are generated from live data
• http://companyx.com/sitemaps/1.xml
sitemaps
65. >> "hello!".hash % 6 #=> 5
>> "/ny/new-york/c/apartments".hash % 6 #=> 5
returns an integer from 0 up to, but not including, the number specified
partition by consistent hash
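A sketch of the same idea in JavaScript. One caveat: in Ruby 1.9 and later, `String#hash` is seeded per process, so its value is not stable across runs. This sketch therefore swaps in 32-bit FNV-1a, a simple deterministic hash. That is a substitution, not the function used in the slides, so the partition numbers will not match the ones shown above. `fnv1a32` and `partitionFor` are hypothetical names:

```javascript
// FNV-1a (32-bit): a simple deterministic string hash. This is a stand-in
// for Ruby's String#hash, not the function used in the slides.
// It hashes UTF-16 code units, which is fine for ASCII url paths.
function fnv1a32(text) {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime, wrapping at 32 bits
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Maps a url to a partition number in 0 .. partitionCount - 1.
function partitionFor(url, partitionCount) {
  return fnv1a32(url) % partitionCount;
}

const partition = partitionFor("/ny/new-york/c/apartments", 6);
console.log(partition);
```

The same url always lands in the same partition, which is what lets each sitemap file be regenerated independently.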
66. 1. map each url in the site to a partition
2. reduce each partition to a single document containing all urls in that partition
3. save to a permanent collection
map/reduce
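The three steps can be sketched in memory as below. This is plain JavaScript rather than MongoDB's map/reduce API, the hash is an FNV-1a stand-in (an assumption, not the deck's hash), and the sample urls and the `{ _id, urls }` document shape are hypothetical:

```javascript
const PARTITION_COUNT = 6; // matches the "% 6" on the earlier slide

// Deterministic 32-bit FNV-1a hash, used here as a stand-in hash function.
function stableHash(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

// Step 1 (map): emit a [partition, url] pair for each url.
function mapUrl(url) {
  return [stableHash(url) % PARTITION_COUNT, url];
}

// Step 2 (reduce): collapse all pairs that share a partition into one document.
function reducePartitions(pairs) {
  const byPartition = new Map();
  for (const [partition, url] of pairs) {
    if (!byPartition.has(partition)) {
      byPartition.set(partition, { _id: partition, urls: [] });
    }
    byPartition.get(partition).urls.push(url);
  }
  return [...byPartition.values()];
}

// Hypothetical sample urls; the real ones are generated from live data.
const urls = [
  "/ny/new-york/c/apartments",
  "/il/chicago/c/plumbing",
  "/co/boulder/c/auto-repair",
];

const sitemapDocs = reducePartitions(urls.map(mapUrl));

// Step 3 (save) would write sitemapDocs to a permanent collection.
console.log(JSON.stringify(sitemapDocs, null, 2));
```

Each resulting document corresponds to one sitemap file, so a request for a given file only needs to read a single document.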