2. Why San Francisco?
Learn how others operate at scale
Learn what problems others have
Learn whether their solutions apply to us
Learn whether their problems apply to us
3. Why San Francisco?
Silicon Valley based companies:
- Google - Pinterest
- Facebook - Quora
- Twitter - tons of others...
- Netflix
4. NoSQL: Past, Present, Future
Eric Brewer – author of CAP theorem
CP vs. AP: the choice only has to be made on a time-out (failure)
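A minimal sketch of the time-out point, assuming a toy replica that must ask a peer before answering. The `Replica` class, its `mode` setting and `PeerTimeout` are hypothetical names for illustration and do not come from the talk.

```python
class PeerTimeout(Exception):
    """Raised when the other replica does not answer in time."""


class Replica:
    def __init__(self, mode, local_value, peer_reachable=True):
        self.mode = mode                  # "CP" or "AP" (hypothetical setting)
        self.local_value = local_value
        self.peer_reachable = peer_reachable

    def _ask_peer(self):
        # Stand-in for a network call with a deadline.
        if not self.peer_reachable:
            raise PeerTimeout()
        return self.local_value

    def read(self):
        try:
            # No failure: consistent and available at the same time.
            return self._ask_peer()
        except PeerTimeout:
            if self.mode == "CP":
                # Consistency first: refuse to answer.
                raise
            # Availability first: answer with possibly stale local data.
            return self.local_value
```

While the peer answers, both modes behave the same; they differ only once the time-out fires.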
6. Real-time web
node.js – de facto standard for real-time web
open a connection per user and keep it open
web sockets are great, but use fallbacks
- many mobile devices don't support web sockets
- long polling, forever frame (infinite iframe), etc.
more companies moving to SPDY protocol
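The fallback idea can be sketched as a simple preference list: try web sockets first, then degrade. The transport names and the `pick_transport` helper below are assumptions for illustration, not an API of node.js or of any real-time library.

```python
# Preference order: best transport first (illustrative names).
TRANSPORTS = ["websocket", "long-polling", "forever-frame"]


def pick_transport(client_supports):
    """Return the most preferred transport the client can use."""
    for transport in TRANSPORTS:
        if transport in client_supports:
            return transport
    raise ValueError("no common transport")
```

A desktop browser reporting all three would get `websocket`, while a mobile client without web socket support would fall back to `long-polling`.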
8. Quora on mobile
first: iPhone app
- mobile app is like old app shipped on CD
- hybrid application
- native code for controls and navigation
- HTML for viewing Q&A from the site
- separate mobile optimized HTML layout of the web page
9. Quora on mobile
second: Android app
- created clone of iPhone app - failed!
- UI natural on iPhone is alien on Android
- bought Android devices and learned their philosophy
- used new Google Android UI design guidelines
- created a new app with a native Android look & feel
- users in India pay per MB, so traffic had to be optimized
- the same optimizations were applied to the iPhone app and the web page
10. Quora on mobile
mobile first experience
- mobile has unique requirements
- if you're good on mobile, you're good anywhere
- don't reuse the phone app on tablets; create a separate app or use the web
12. Continuous delivery
Jesse Robbins, co-founder of Opscode (the company behind Chef)
infrastructure as code
- full stack automation
- datacenter API (for provisioning VMs, etc.)
- infrastructure is a product and app is a customer
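One way to picture "infrastructure as code" is a declared desired state that a tool converges to through a provisioning API. The sketch below is a toy model under that assumption; `plan` and `apply` are hypothetical helpers and not Chef's API.

```python
def plan(desired, actual):
    """Return the actions needed to make `actual` match `desired`."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name, spec))
        elif actual[name] != spec:
            actions.append(("update", name, spec))
    for name in sorted(actual.keys() - desired.keys()):
        actions.append(("delete", name, actual[name]))
    return actions


def apply(actions, actual):
    """Apply the actions to a copy of `actual` and return the new state."""
    state = dict(actual)
    for op, name, spec in actions:
        if op == "delete":
            state.pop(name)
        else:
            state[name] = spec
    return state
```

Running `plan` again after `apply` yields no actions, which is the idempotence that makes full-stack automation safe to repeat.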
13. Continuous delivery
application as services
- service orientation
- software resiliency
- deep instrumentation
dev / ops as teams
- service owners
- shared metrics / monitoring
- continuous integration / deployment
15. Release engineering at Facebook
Chuck Rossi – release engineering manager
deployment process
- teams do not deploy to production by themselves
- IRC is used for communication during deployment
- if a team member is not connected to IRC, their release is skipped
- BitTorrent for deployments
- powerful app monitoring and profiling (instrumentation)
16. Release engineering at Facebook
deployment process
- ability to release on subset of servers
- very powerful feature flag mechanism: by IP, gender, age, …
- karma points for developers with down-vote button
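A feature flag gated on user attributes can be sketched as a rule checked per request. This is an illustration only: the rule keys (`networks`, `genders`, `min_age`, `max_age`) and the `flag_enabled` helper are assumptions, not Facebook's actual mechanism.

```python
import ipaddress


def flag_enabled(rule, user):
    """Return True if `user` matches every condition present in `rule`."""
    if "networks" in rule:
        ip = ipaddress.ip_address(user["ip"])
        if not any(ip in ipaddress.ip_network(net) for net in rule["networks"]):
            return False
    if "genders" in rule and user["gender"] not in rule["genders"]:
        return False
    if "min_age" in rule and user["age"] < rule["min_age"]:
        return False
    if "max_age" in rule and user["age"] > rule["max_age"]:
        return False
    return True
```

An empty rule enables the feature for everyone; each added key narrows the audience, so a feature can start on an internal network and widen step by step.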
facebook.com
- continuously deployed internally
- employees always access latest facebook.com
- easy to report bug from the internal facebook.com
18. Scaling Pinterest
everything in Amazon cloud
before
- had every possible ‘hot’ technology including MySQL,
Cassandra, Mongo, Redis, Memcached, Membase, Elastic
Search – FAIL
- keep it simple, major re-architecting in late 2011
21. Scaling Pinterest
schemaless DB design
- no foreign keys
- no joins
- denormalized data (id + JSON data)
- users, user_has_boards, boards, board_has_pins, pins
- read slaves
- heavy use of cache for speed & better consistency
thinking of moving to their own DC
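The "id + JSON data" layout with mapping tables instead of joins can be sketched as follows. SQLite is used here only as a stand-in so the example runs anywhere; the `put`, `get` and `pins_on_board` helpers and the sample rows are assumptions for illustration.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
# Object tables hold only an id and a JSON blob; no foreign keys.
db.execute("CREATE TABLE boards (id INTEGER PRIMARY KEY, data TEXT)")
db.execute("CREATE TABLE pins (id INTEGER PRIMARY KEY, data TEXT)")
# Mapping table replaces the join.
db.execute("CREATE TABLE board_has_pins (board_id INTEGER, pin_id INTEGER)")


def put(table, obj_id, obj):
    # `table` is always a fixed name from this sketch, never user input.
    db.execute(f"INSERT INTO {table} (id, data) VALUES (?, ?)",
               (obj_id, json.dumps(obj)))


def get(table, obj_id):
    row = db.execute(f"SELECT data FROM {table} WHERE id = ?",
                     (obj_id,)).fetchone()
    return json.loads(row[0]) if row else None


def pins_on_board(board_id):
    # Two simple lookups instead of one join: ids first, then objects.
    rows = db.execute(
        "SELECT pin_id FROM board_has_pins WHERE board_id = ? ORDER BY pin_id",
        (board_id,))
    return [get("pins", pin_id) for (pin_id,) in rows.fetchall()]


put("boards", 1, {"name": "Recipes"})
put("pins", 10, {"title": "Pasta"})
put("pins", 11, {"title": "Soup"})
db.executemany("INSERT INTO board_has_pins VALUES (?, ?)", [(1, 10), (1, 11)])
```

Because every object lookup is by primary key, each `get` is easy to put behind a cache, which fits the heavy cache use mentioned above.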
23. Architectural patterns for HA
Adrian Cockcroft – director of architecture at Netflix
architecture
- everything in Amazon cloud in 3 availability zones
- Chaos Gorilla, Latency Monkey
- service-based architecture, stateless micro-services
- strong focus on service resilience
- handle dependent service unavailability or increased latency
started open-sourcing to improve quality of the code
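Handling an unavailable or slow dependency usually comes down to a deadline plus a fallback value. The sketch below shows that pattern with the standard library only; `call_with_fallback` is a hypothetical helper and is not taken from Netflix's open-source code.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

_pool = ThreadPoolExecutor(max_workers=4)


def call_with_fallback(fn, timeout_s, fallback):
    """Call `fn` with a deadline; return `fallback` if it is slow or fails."""
    future = _pool.submit(fn)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # Dependency is too slow: degrade instead of blocking the caller.
        return fallback
    except Exception:
        # Dependency is unavailable: degrade instead of propagating the error.
        return fallback
```

The caller always gets an answer within roughly `timeout_s`, for example a generic list in place of personalized results. Note that the slow call keeps running in its worker thread; this sketch does not cancel it.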
24. Architectural patterns for HA
Cassandra usage
- 2 dedicated Cassandra teams
- over 50 Cassandra clusters, over 500 nodes, over 30 TB of
data, biggest cluster has 72 nodes
- mostly write operations; reads go through a Memcached layer
- moved to SSD in Amazon instead of spinning disks and cache
- for ETL: read Cassandra backup files using Hadoop
- can scale zero-to-500 instances in 8 minutes
26. Timelines at scale
Raffi Krikorian – director of Twitter's platform services
core architecture
- pull (timeline & search) and push (mobile, streams) use-cases
- 300K QPS for timeline
- on write use fan-out process to copy data for each use-case
- timeline cache in Redis
- when you tweet and you have 200 followers there will be 200
inserts, one into each follower's timeline
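Fan-out on write can be sketched with in-memory structures standing in for the Redis timeline cache. The function names and the `TIMELINE_LEN` cap are assumptions for illustration.

```python
from collections import defaultdict, deque

TIMELINE_LEN = 800  # hypothetical cap on cached entries per user

followers = defaultdict(set)                                   # author -> followers
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_LEN))    # user -> tweet ids


def follow(follower, followee):
    followers[followee].add(follower)


def tweet(author, tweet_id):
    """Write path: copy the tweet id into every follower's timeline."""
    for follower in followers[author]:
        timelines[follower].appendleft(tweet_id)
    return len(followers[author])  # number of inserts performed


def read_timeline(user, count=20):
    """Read path: a cheap slice of a precomputed list, newest first."""
    return list(timelines[user])[:count]
```

The write does work proportional to the follower count so that the much more frequent timeline reads stay a single lookup.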
27. Timelines at scale
core architecture
- Hadoop for batch compute and recommendation
- code heavily instrumented (load times, latencies, etc.)
- uses Cassandra, but is moving away from it due to read latency
28. More info
Slides - http://qconsf.com/sf2012
Videos - http://www.infoq.com/