13. Datafeeds
{
ProductID : 2309,
Title : "Elephant Leash",
Brand : "Acme",
Price : 49.99,
Breadcrumbs : [ "Pets", "Exotic", "Accessories" ],
Description : "Horton will love this stylish and functional leash, and you won't violate any local statutes when you walk around with the Acme Elephant Leash!"
}
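A minimal sketch of why a feed record like this suits a document store: items with different attributes can sit side by side, and the breadcrumb array drives the browse structure. This is plain in-memory JavaScript, not the deck's actual code; the second product, its VolumeMl field, and the browse() helper are hypothetical.

```javascript
// Two feed items with different attributes; no shared schema is declared.
// The second item and its extra field are hypothetical, for illustration only.
const feed = [
  {
    ProductID: 2309,
    Title: "Elephant Leash",
    Brand: "Acme",
    Price: 49.99,
    Breadcrumbs: ["Pets", "Exotic", "Accessories"],
  },
  {
    ProductID: 2310,
    Title: "Elephant Shampoo",
    Brand: "Acme",
    Price: 12.5,
    Breadcrumbs: ["Pets", "Exotic", "Grooming"],
    VolumeMl: 5000, // present only on this item; no table alter needed
  },
];

// Return the items whose breadcrumb trail starts with the given path.
function browse(items, path) {
  return items.filter((item) =>
    path.every((segment, i) => item.Breadcrumbs[i] === segment)
  );
}

const exotic = browse(feed, ["Pets", "Exotic"]);
const grooming = browse(feed, ["Pets", "Exotic", "Grooming"]);
console.log(exotic.map((p) => p.Title), grooming.map((p) => p.Title));
```

In a real deployment the filter would be a query against the collection rather than an array scan; the point here is only the shape of the data.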
32. Gotchas
Prototype to Production: ensureIndex() is cheap
ext3 -- banished from the land
oplog size for replication
{number_of_times_the_user_clicked : 1}
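The last gotcha is about attribute labels: field names are stored with every document, so a long name is paid for once per record. A rough sketch of that arithmetic follows, counting only the bytes of the field names themselves (not full BSON sizes). The short name clicks, the keyBytes() helper, and the collection size are assumptions for illustration.

```javascript
// Sum the UTF-8 byte length of every top-level field name in a document.
// This counts names only; it is not a BSON size calculation.
function keyBytes(doc) {
  return Object.keys(doc).reduce(
    (total, key) => total + Buffer.byteLength(key, "utf8"),
    0
  );
}

const verbose = { number_of_times_the_user_clicked: 1 };
const terse = { clicks: 1 }; // hypothetical shortened label

const savedPerDoc = keyBytes(verbose) - keyBytes(terse);
const docCount = 1e8; // assumed collection size, for illustration only
const savedMegabytes = (savedPerDoc * docCount) / (1024 * 1024);

console.log(
  `${savedPerDoc} bytes per document, about ${savedMegabytes.toFixed(0)} MB across ${docCount} documents`
);
```

Whether the saving justifies less readable names is a judgment call; some teams keep readable names in code and map them to short labels at the storage layer.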
33. AFTER PARTY @SLATE
SPECIAL THANKS TO GILT FOR SPONSORING
54 WEST 21st STREET
Editor's Notes
Shopping search engine; crawl the web using AI to aggregate; add data feeds; in-memory search; web front-end
relationship with founders, opportunity. Eliot: final project together; I was playing with QT, he wrote a DB and network protocol. Dwight wrote the adserver, code I became highly familiar with on the adserver team.
Largest, write-only
highly utilized
perfect for document oriented architecture, same format as we use to eventually index
browse structure for consumers and SEO, daily updates, live access, cached in front-end
historical note: DoubleClick's imageserver. A no-brainer to convert the backend to avoid maintenance overhead
Prototype: schema extensible, no need for table alters, as in visit table; JSON instead of ORM; joins can be ugly and unpredictable
Many-to-many joins are missing, but you might not miss them. Storage is cheap, although it has consequences for replication; correct for typos with testing
denormalization, but storage is cheap
date functions missing
with document, can key on alerting, no hunting for last_good_count
it’s easy to roll out code without indices; ext3 is just terrible; big data, 10% of empty too much, custom oplog size, too small; some people using false-ORM to minify attribute labels