Flashback: QCon San Francisco 2012

Why San Francisco?

Learn how others operate at scale

Learn what problems others have

Learn whether their solutions apply to us

Learn whether their problems apply to us

Why San Francisco?

Silicon Valley based companies:

- Google

- Facebook

- Twitter

- Netflix

- Pinterest

- Quora

- tons of others...

NoSQL: Past, Present, Future
Eric Brewer – author of CAP theorem

CP vs. AP, but the choice arises only on a time-out (failure)
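The point above can be illustrated with a small sketch: a replica has to pick between consistency and availability only when a peer fails to answer in time. The `Replica` class, its `mode` setting and the `Unavailable` exception are hypothetical names chosen for illustration; they are not from the talk.

```python
from dataclasses import dataclass
from typing import Optional


class Unavailable(Exception):
    """Raised by a CP-style read that refuses to answer during a time-out."""


@dataclass
class Replica:
    local_value: str   # local copy, possibly stale
    mode: str          # "CP" or "AP" (hypothetical setting)

    def read(self, fresh_value: Optional[str]) -> str:
        """Return a value; fresh_value is None when the peer timed out."""
        if fresh_value is not None:
            # No time-out: the read is both consistent and available.
            self.local_value = fresh_value
            return fresh_value
        if self.mode == "CP":
            # Prefer consistency: refuse rather than risk a stale answer.
            raise Unavailable("peer timed out")
        # Prefer availability: answer with the possibly stale local copy.
        return self.local_value
```

As long as the peer answers, both modes behave identically; they diverge only in the time-out branch.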

Real-time web
Node.js – de facto standard for the real-time web

open a connection for each user and keep it open

WebSockets are great, but use fallbacks

- mobile devices don't support WebSockets

- long polling, infinite frame, etc.
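A minimal sketch of the fallback idea, assuming the server keeps a preference-ordered list of transports and picks the first one the client reports it supports. The transport names, their order and the `choose_transport` function are illustrative assumptions, not an API of Node.js or of any library mentioned in the talk.

```python
from typing import Iterable

# Assumed preference order: WebSockets first, then the fallbacks from the slide.
PREFERRED_TRANSPORTS = ["websocket", "long-polling", "infinite-frame"]


def choose_transport(client_supports: Iterable[str]) -> str:
    """Pick the most preferred transport that the client also supports."""
    supported = set(client_supports)
    for transport in PREFERRED_TRANSPORTS:
        if transport in supported:
            return transport
    raise ValueError("client supports none of the known transports")
```

A client without WebSocket support, for example `choose_transport(["long-polling", "infinite-frame"])`, would be served over long polling.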

more companies moving to SPDY protocol

Quora on mobile
first iPhone app

- mobile app is like old app shipped on CD

- hybrid application

- native code for controls and navigation

- HTML for viewing Q&A from the site

- separate mobile optimized HTML layout of the web page

Quora on mobile
second Android app

- created clone of iPhone app - failed!

- UI natural on iPhone is alien on Android

- bought Android devices and learned their philosophy

- used new Google Android UI design guidelines

- created new app with a native Android look & feel

- users in India pay per MB, so had to optimize traffic

- the same optimizations were then applied to the iPhone app and web page

Quora on mobile
mobile first experience

- mobile has unique requirements

- if you're good on mobile, you're good anywhere

- don't reuse the phone app on tablets; create a separate app or use the web

Continuous delivery
Jesse Robbins, author of Chef

infrastructure as code

- full stack automation

- datacenter API (for provisioning VMs, etc.)

- infrastructure is a product and app is a customer

Continuous delivery
applications as services

- service orientation

- software resiliency

- deep instrumentation

dev / ops as teams

- service owners

- shared metrics / monitoring

- continuous integration / deployment

Release engineering at Facebook
Chuck Rossi – release engineering manager

deployment process

- teams do not deploy to production by themselves

- IRC is used for communication during deployment

- if a team member is not connected to IRC, the release is skipped

- BitTorrent for deployments

- powerful app monitoring and profiling (instrumentation)

Release engineering at Facebook
deployment process

- ability to release on subset of servers

- very powerful feature flag mechanism by IP, gender, age, …

- karma points for developers with down-vote button
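Feature flags targeted by IP, gender or age could look roughly like the sketch below. It is not Facebook's mechanism; `User`, `FlagRule` and `is_enabled` are hypothetical names, and a flag is assumed to be on when any one of its rules matches the user.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class User:
    ip: str
    gender: str
    age: int


@dataclass
class FlagRule:
    """One targeting rule; a criterion left as None is not checked."""
    ip_prefix: Optional[str] = None
    gender: Optional[str] = None
    min_age: Optional[int] = None
    max_age: Optional[int] = None

    def matches(self, user: User) -> bool:
        if self.ip_prefix is not None and not user.ip.startswith(self.ip_prefix):
            return False
        if self.gender is not None and user.gender != self.gender:
            return False
        if self.min_age is not None and user.age < self.min_age:
            return False
        if self.max_age is not None and user.age > self.max_age:
            return False
        return True


def is_enabled(rules: Iterable[FlagRule], user: User) -> bool:
    """The flag is on for a user if at least one rule matches."""
    return any(rule.matches(user) for rule in rules)
```

For example, the rules `[FlagRule(ip_prefix="10."), FlagRule(gender="f", min_age=18, max_age=25)]` would enable a feature for everyone on an internal network plus one demographic slice of external users.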

facebook.com

- continuously deployed internally

- employees always access latest facebook.com

- easy to report a bug from the internal facebook.com

Scaling Pinterest
everything in Amazon cloud

before

- had every possible ‘hot’ technology, including MySQL, Cassandra, MongoDB, Redis, Memcached, Membase and Elasticsearch – FAIL

- keep it simple, major re-architecting in late 2011

Scaling Pinterest
January 2012

- Amazon EC2 + S3 + Akamai, ELB

- 90 Web Engines + 50 API Engines

- 66 sharded MySQL DBs + 66 slave replicas

- 59 Redis

- 51 Memcache

- 1 Redis task queue + 25 task processors

- sharded Solr

- 6 engineers

Scaling Pinterest
now

- Amazon EC2 + S3 + Akamai, Level3, EdgeCast, ELB

- 180 Web Engines + 240 API Engines

- 80 sharded MySQL DBs + 80 slave replicas

- 110 Redis

- 200 Memcache

- 4 Redis task queues + 80 task processors

- sharded Solr

- 40 engineers

Scaling Pinterest
schemaless DB design

- no foreign keys

- no joins

- denormalized data (id + JSON data)

- users, user_has_boards, boards, board_has_pins, pins

- read slaves

- heavy use of cache for speed & better consistency
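The "id + JSON data" layout without joins can be sketched with an in-memory SQLite database standing in for one MySQL shard. The table names follow the slide; the `put`, `get` and `pins_on_board` helpers and the column names are assumptions made for illustration.

```python
import json
import sqlite3

# One in-memory database stands in for a single shard.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE boards (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE pins (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE board_has_pins (board_id INTEGER, pin_id INTEGER);
""")

OBJECT_TABLES = {"boards", "pins"}


def put(table: str, obj_id: int, obj: dict) -> None:
    """Store an object as an id plus a JSON blob; no schema per field."""
    if table not in OBJECT_TABLES:
        raise ValueError(f"unknown table: {table}")
    conn.execute(f"INSERT INTO {table} (id, data) VALUES (?, ?)",
                 (obj_id, json.dumps(obj)))


def get(table: str, obj_id: int):
    """Load one object by id, or None if it does not exist."""
    if table not in OBJECT_TABLES:
        raise ValueError(f"unknown table: {table}")
    row = conn.execute(f"SELECT data FROM {table} WHERE id = ?",
                       (obj_id,)).fetchone()
    return json.loads(row[0]) if row else None


def pins_on_board(board_id: int) -> list:
    """Two simple lookups instead of a join: mapping table, then objects."""
    rows = conn.execute(
        "SELECT pin_id FROM board_has_pins WHERE board_id = ? ORDER BY pin_id",
        (board_id,)).fetchall()
    return [get("pins", pin_id) for (pin_id,) in rows]
```

Because every read is a lookup by id, each object fetch in this sketch is also a natural candidate for a cache in front of the database.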

thinking of moving to their own DC

Architectural patterns
for high availability at

Architectural patterns for HA
Adrian Cockcroft – director of architecture at Netflix

architecture

- everything in Amazon cloud in 3 availability zones

- Chaos Gorilla, Latency Gorilla

- service-based architecture, stateless micro-services

- strong focus on service resilience

- handle dependent service unavailability or increased latency
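Handling a dependency that is down or slow usually means bounding the wait and returning a degraded answer. The sketch below is a generic pattern, not Netflix's code; `call_with_fallback` and its parameters are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

# Shared worker pool for calls to dependent services.
_pool = ThreadPoolExecutor(max_workers=4)


def call_with_fallback(fn, timeout_s: float, fallback):
    """Run fn() with a deadline; return fallback if it is slow or fails."""
    future = _pool.submit(fn)
    try:
        # Raises a timeout error if fn() has not finished within timeout_s.
        return future.result(timeout=timeout_s)
    except Exception:
        # Covers both increased latency (time-out) and unavailability
        # (any exception raised by fn itself).
        return fallback
```

Note that a timed-out call keeps running in its worker thread; this sketch only stops the caller from waiting for it.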

started open-sourcing to improve quality of the code

Architectural patterns for HA
Cassandra usage

- 2 dedicated Cassandra teams

- over 50 Cassandra clusters, over 500 nodes, over 30 TB of data, biggest cluster has 72 nodes

- mostly write operations; a Memcache layer is used for reads

- moved to SSD in Amazon instead of spinning disks and cache

- for ETL: read Cassandra backup files using Hadoop

- can scale zero-to-500 instances in 8 minutes

Timelines at scale
Raffi Krikorian – director of Twitter's platform services

core architecture

- pull (timeline & search) and push (mobile, streams) use-cases

- 300K QPS for timeline

- on write use fan-out process to copy data for each use-case

- timeline cache in Redis

- when you tweet and you have 200 followers, there will be 200 inserts, one into each follower's timeline
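The fan-out on write described above can be sketched with in-memory structures standing in for the Redis timeline cache. The function names and the `TIMELINE_LIMIT` cap are assumptions for illustration, not Twitter's implementation.

```python
from collections import defaultdict, deque

TIMELINE_LIMIT = 800  # hypothetical cap on cached entries per timeline

followers = defaultdict(set)  # author -> set of follower names
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_LIMIT))  # user -> tweet ids


def follow(follower: str, author: str) -> None:
    followers[author].add(follower)


def tweet(author: str, tweet_id: int) -> int:
    """Fan out on write: insert the tweet id into every follower's timeline."""
    for follower in followers[author]:
        timelines[follower].appendleft(tweet_id)
    # The number of inserts equals the number of followers.
    return len(followers[author])


def read_timeline(user: str, count: int = 20) -> list:
    """Reads are cheap: the timeline is already materialized, newest first."""
    return list(timelines[user])[:count]
```

The write cost grows with the follower count, which is the price paid for keeping timeline reads to a single cache lookup.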

Timelines at scale
core architecture

- Hadoop for batch compute and recommendation

- code heavily instrumented (load times, latencies, etc.)

- uses Cassandra, but is moving away from it due to read times

More info

Slides - http://qconsf.com/sf2012

Videos - http://www.infoq.com/
