SlideShare ist ein Scribd-Unternehmen logo
1 von 70
Downloaden Sie, um offline zu lesen
Artur Bergman
          sky@crucially.net
• Wikia Inc
  – We are hiring
  – Community/Bizdev in Germany
  – Engineers in Poland
  – http://www.wikia.com/wiki/hiring
• O’Reilly Radar
  – http://radar.oreilly.com/artur/
The value of operations
•   Google
•   Orkut
•   Friendster
•   Myspace
Benefits
•   Users trust your brand
•   They rely on you
•   They spend more time on your site
•   Bad operations wastes R&D money

• Fixed amount of time + faster site =
  more page views
Stepchild of Engineering
• Product development
• Engineering
• Operations
  – Sysadmins?
• Why?
Operations Engineering
• It is engineering
• Google terminology -
  – Site Reliability Engineer
• Sure there are sysadmins too, people
  mananing NOCs and datacenters
• Provide career growth
Good Engineers
•   Detail Oriented
•   Aspire to be operational engineers
•   Stubborn
•   Can steer their inner ADD
    – Interrupt driven
• Not the same as good developers
Danger signs
• Thinks operation is a path to
  development engineering
  – Fire them
• Want people dedicated to the task
• A good operations engineer should
  spend some time in development
• A good development engineer MUST
  spend some time in operations
Debugging
• 9 Rules of debugging
• http://www.debuggingrules.com/Poster_
  download.html
  – Yes the font is horrible
Rule 1:
       Understand the system
•   Complexity Kills
•   No excuse
•   If you write it, you must know it
•   If you run it, you must know it
•   If you buy it, you must know it
Rule 3:
      Quit thinking and look
• quot;It is a capital mistake to theorize before
  one has data. Insensibly one begins to
  twist facts to suit theories, instead of
  theories to suit facts.”
Rule 3:
        Quit thinking and look
•   What do you look at?
•   The importance of monitoring
•   Monitoring
•   Monitoring
•   Monitoring
My my, confusing term
• Monitoring
• Alerting
• Trending
Monitoring
•   Collects data
•   Puts into databases
•   Makes it available for you
•   Active collection
•   Passive interaction
Alerting
• Acts on monitoring data
• Severe alerts
  – Active
  – Needs action
• Passive alerts
  – Things that need to be done but not right now
• DO NOT OVER ALERT
• DO NOT CRY WOLF
Wikia alerting strategy
•   When the site is slow
•   Or down
•   We send emails and do phone calls
•   Europe and US West coast
•   Looking to hire in East Asia
•   No night time
Trending
• Long term
• Capacity planning
Monitor Tools
•   Nagios
•   Cacti
•   MRTG
•   Hyperic
•   Cricket
•   Ganglia
External Monitoring
• Use one, tells you what your clients see
  every x minutes
• Keynote
• Gomez
• Websitepulse (cheap - easy - I like
  them; no annoying salesforce)
Nagios
•   Alerting
•   Hassle
•   C CGI??
•   Doesn’t
    scale
Hyperic
• Most exciting open source tool
• Agent base - self configured
• Baseline alerting
Cricket MRTG Cacti
• Impossible to configure
• You need to write tools to do it
• Especially Cacti
  – Somewhat more pleasant than clawing out
    your eyes
Ganglia
• We love ganglia
• Automatically graphs everything you
  want - just works
• Large scale clusters
• Multicast
• Zero config
• RRD
http://ganglia.wikimedia.org/
•   270 hosts
•   880 CPU
•   2 clusters
•   1.2 TB of Memory
http://ganglia.wikimedia.org
Custom Ganglia Gmetrics
• Write your own

gmetric --name='Oldest query' --type=int32
--units='sec' --dmax=65 --value=`echo '
show processlist' | mysql -uroot -ppass |
grep -v Sleep | grep -v 'system user' | head -2 |
tail -1 | cut -f 6`
Custom Ganglia Gmetrics
• Or Learn Unix

gmetric --name='Oldest query' --type=int32
--units='sec' --dmax=65 --value=`echo '
show processlist' | mysql -uroot -ppass |
grep -v Sleep | grep -v 'system user' | head -2 |
tail -1 | cut -f 6`
Custom Ganglia Gmetrics
• Write your own

gmetric --name='Oldest query' --type=int32
--units='sec' --dmax=65 --value=`echo '
show processlist' | mysql -uroot -ppass |
grep -v Sleep | grep -v 'system user' | head -2 |
tail -1 | cut -f 6`
Custom Ganglia Gmetrics
• Write your own

gmetric --name='Oldest query' --type=int32
--units='sec' --dmax=65 --value=`echo '
show processlist' | mysql -uroot -ppass |
grep -v Sleep | grep -v 'system user' | head -2 |
tail -1 | cut -f 6`
Custom Ganglia Gmetrics
• Write your own

gmetric --name='Oldest query' --type=int32
--units='sec' --dmax=65 --value=`echo '
show processlist' | mysql -uroot -ppass |
grep -v Sleep | grep -v 'system user' | head -2 |
tail -1 | cut -f 6`
Something is wrong

• Don’t worry, data warehouse




                      QuickTime™ and a
            TIFF (Uncompressed) decompressor
               are needed to see this picture.
tcpdump / waveshark
•   If you suspect the network
•   Don’t just suspect
•   LOOK AT IT
•   Tcpdump / waveshark will tell you
    – If your packets are lost, delayed or
      corrupted
    – Your windowing is wrong
Rule 4: Divde and Conquer
• Look at the problems in turn
• Split between people
• Go in the order you suspect is the most
  likely
Rule 5:
 Change one thing at a time
• I cannot stress this enough
• IF YOU DO NOT THEN YOU HAVE
  FAILED TO IDENTIFY THE PROBLEM
Rule 6:
        Keep an audit trail
• You might be making things worse
• Good for the root cause analysis
• Have your shell log all commands
  – Good practice anyway
• Version control
Rule 9:
    If you didn’t fix it, it ain’t fixed
•   You must do something to fix a problem
•   Or it will bite you again
•   And again
•   And again
•   They don’t just appear and disappear
•   Except BGP route convergence :)
Process
• You need a little
• Don’t worry
Don’t forget
Complexity kills
•   Design against it
•   Reuse components
•   Define standards
•   Have a few images that all machines
    look like - reimage machines every now
    and then for the heck of it.
    – EC2 forces you to do this
MTBF
Meduim Time Between Failure
• Actually mostly irrelevant
• Dealing with failure is more important
• Target the right uptime
  – Complexity scales exponatially with
    required uptime
• Don’t kid yourself, you don’t need 5
  nines
MTTR
  Medium Time To Recovery
• Important
• Noone cares if you fail once a minute
  – If you recover in 50 ms
• If you are down 1 minute a week, you
  are still going to hit 4 nines (99.99%)
• Failures happen, plan how to deal with
  them
Problem found
• If it is critical, start a phone conversation
• Use IRC to communicate technical data
• One person liasons with non technical
  staff
• One person specifically in command
• Sleep scheduling ( audit log important )
Post crisis
• Root cause analysis
  – Just find out what went wrong
  – And how to avoid it
  – Or fix it faster next time if you can’t
• Keep track of your uptime
Automation
•   All machines are created equal
•   Seriously
•   If you manually make changes
•   You are wrong
    – Unless you know what you are doing
Best practices
•   Version control
•   Gold images
•   Centralised authentication
•   Time Sync ( NTP )
•   Central logging
•   ( All of this applies for virtual machines
    too!)
cfengine
•   Standard automation tool
•   Written in C
•   Not much support
•   Very good
•   Very annoying
contro :
      l
  s te
   i      = ( mys te )
                 i        domain = (
  mysite .count y )
               r
  sysadm = (mark )          netmask = (
  255.255.255.0 )          ac i
                             t onsequence =
  (         mounta ll       mount nfo
                                  i
      addmounts          mounta l
                                l        lnks
                                          i
  )        mountpat rn = / ie) (
                      te     $(s t /$ host))
 homepat r = ( u? )
          te n
Puppet
•   New hip kid on the block
•   Written in ruby
•   Better support?
•   Much nicer syntax
•   Easier to extend
def ne yumrepo (enab
   i                 led = true)
{c i i
    onf gfle
{ /e c
 quot; t /yum.repos /
               .d $name.repo”: mode
  => 644,
source => quot; yum/repos
             /        /$name. repoquot;,
ensure => $enab led ? {
true => fl ,
         ie
defau t=> absent
      l                  }
}}
cobb er
                        l
• Automatic PXE Installer
    – Uses kickstart files
•   Redhat Enterprise
•   Centos
•   Fedora
•   Some support for debian
cobbler
cobbler system add
  --name=xen8
  --mac=00:19:B9:EE:6D:0A
  --ip=10.10.30.208
  --profile=Centos-5-x86_64
  --kopts='ksdevice=00:19:B9:EE:6D:0A
      console=ttyS1,57600 console=tty0'
cobbler
cobbler system add
  --name=xen8
  --mac=00:19:B9:EE:6D:0A
  --ip=10.10.30.208
  --profile=Centos-5-x86_64
  --kopts='ksdevice=00:19:B9:EE:6D:0A
      console=ttyS1,57600 console=tty0’
koan
• Client install tool
  – Xen
  – Or OS re-image


koan --server=10.10.30.205 --virt --
  profile=virt_fc6 --virt-name=otrs
Your datacenter
• Keep it tidy
   – Label things, keep cables as short as possible
   – Have a switch in each rack
• If you are small without dedicated DC staff
  you need
   – Remote control power switches
   – Remote console!
Virtualization
•   Please use it
•   Managing becomes much easier
•   Power consumption
•   Need a new test box
    – The requestor can have it in minutes
Power consumption
• Maybe not as important in Europe
• 8 core machines are more efficient than
  1 core
• But memcache uses 1 core and all RAM
• Get more RAM and virtualise
Our network admin boxes
•   1 Xen CPU for Vyatta
•   1 Xen CPU for LVS
•   1 Xen CPU for Squid - Carp
•   1 Xen CPU for Squid
•   1 Xen CPU for Monitoring
•   1 Xen CPU for network tasks

• We can have more of these and a loss of one
  affects us less
Vyatta
• Opensource router
  – Really like it
  – No need to use Cisco
LVS
•   Linux Virtual Server
•   Low level load balancer
•   HA
•   Fast
•   Doesn’t inspire people to put things in
    the only place that is hard to scale
Squid Carp
• Squids configured to hash the urls and
  send them to specific backend
• Very little configuration done
• Logging of UDP - no disk IO
Squid
• As a reverse web accelerator
• 90 % of our hits served from RAM in less than
  1 ms
• Same as wikipedia
• We only use RAM cache ( unlike wikipedia)
• Cached per user
• If not cacheable - cache for a second to
  redue backend effect
App servers
• 1 xen cpu for memcache ( 5 GB Ram)
• 1 xen cpu for squid ( 5GB Ram )
• 6 xen cpus for apache (6 GB Ram )

• More power efficient, less affected by
  loss
• Applications can’t affect each other
Databases
• Keep developers on short leash
• Report bad queries
• Fear object relational mappers
Outsourcing
• As much as possible
• The younger you are as a company the
  less risk
  – When you have no users, you have no
    value
• VCs don’t like having their money go
  into Capex
What I want from Vendors
• They do what they tell me
• They do what I tell them

• No annoying up sells, no premium
  services
  – I know more about what you are selling
    than you
Services we use
• Amazon EC2 and S3
• Panther-Express
Panther Express
• Fantastic Content Distribution Network
• Cheap, simple price list
  – Take note akamai
• Cut delivery time to Europe by 70%
• We let our images be cached 1 second
  to redue load
EC2 and S3
•   We save all our binlogs to S3
•   We save database dumps to S3
•   We have monitors running from EC2
•   We plan to build a datawarehouse
    cluster on EC2
EC2 Requires Automation
• Machine is blank when you bring it up
• Download database dump from S3 and
  replicate up - automatically
• Use puppet
• Amazon saves you hardware
  headaches
  – But complexity is still a problem
Thank you

Weitere ähnliche Inhalte

Andere mochten auch

Web engineering - An overview about HTML
Web engineering -  An overview about HTMLWeb engineering -  An overview about HTML
Web engineering - An overview about HTMLNosheen Qamar
 
Web Engineering - Web Application Testing
Web Engineering - Web Application TestingWeb Engineering - Web Application Testing
Web Engineering - Web Application TestingNosheen Qamar
 
Web application testing with Selenium
Web application testing with SeleniumWeb application testing with Selenium
Web application testing with SeleniumKerry Buckley
 
Web App Testing - A Practical Approach
Web App Testing - A Practical ApproachWeb App Testing - A Practical Approach
Web App Testing - A Practical ApproachWalter Mamed
 
Testing Web Applications
Testing Web ApplicationsTesting Web Applications
Testing Web ApplicationsSeth McLaughlin
 
Web Application Testing
Web Application TestingWeb Application Testing
Web Application TestingRicha Goel
 
Selenium Testing Project report
Selenium Testing Project reportSelenium Testing Project report
Selenium Testing Project reportKapil Rajpurohit
 
Software testing basic concepts
Software testing basic conceptsSoftware testing basic concepts
Software testing basic conceptsHĆ°ng HoĂ ng
 
Testing concepts ppt
Testing concepts pptTesting concepts ppt
Testing concepts pptRathna Priya
 
Software Testing Fundamentals
Software Testing FundamentalsSoftware Testing Fundamentals
Software Testing FundamentalsChankey Pathak
 

Andere mochten auch (12)

Web engineering - An overview about HTML
Web engineering -  An overview about HTMLWeb engineering -  An overview about HTML
Web engineering - An overview about HTML
 
Web Engineering - Web Application Testing
Web Engineering - Web Application TestingWeb Engineering - Web Application Testing
Web Engineering - Web Application Testing
 
Web application testing with Selenium
Web application testing with SeleniumWeb application testing with Selenium
Web application testing with Selenium
 
Web App Testing - A Practical Approach
Web App Testing - A Practical ApproachWeb App Testing - A Practical Approach
Web App Testing - A Practical Approach
 
Testing Web Applications
Testing Web ApplicationsTesting Web Applications
Testing Web Applications
 
Web Application Testing
Web Application TestingWeb Application Testing
Web Application Testing
 
Testing web application
Testing web applicationTesting web application
Testing web application
 
Selenium Testing Project report
Selenium Testing Project reportSelenium Testing Project report
Selenium Testing Project report
 
Software testing basic concepts
Software testing basic conceptsSoftware testing basic concepts
Software testing basic concepts
 
Testing concepts ppt
Testing concepts pptTesting concepts ppt
Testing concepts ppt
 
Software Testing Fundamentals
Software Testing FundamentalsSoftware Testing Fundamentals
Software Testing Fundamentals
 
Software testing ppt
Software testing pptSoftware testing ppt
Software testing ppt
 

Ă„hnlich wie Web 2.0 Performance and Reliability: How to Run Large Web Apps

Make Your Life Easier With Maatkit
Make Your Life Easier With MaatkitMake Your Life Easier With Maatkit
Make Your Life Easier With MaatkitMySQLConference
 
Drizzle Talk
Drizzle TalkDrizzle Talk
Drizzle TalkBrian Aker
 
How the JDeveloper team test JDeveloper at UKOUG'08
How the JDeveloper team test JDeveloper at UKOUG'08How the JDeveloper team test JDeveloper at UKOUG'08
How the JDeveloper team test JDeveloper at UKOUG'08kingsfleet
 
All The Little Pieces
All The Little PiecesAll The Little Pieces
All The Little PiecesAndrei Zmievski
 
Tips on High Performance Server Programming
Tips on High Performance Server ProgrammingTips on High Performance Server Programming
Tips on High Performance Server ProgrammingJoshua Zhu
 
The Automation Factory
The Automation FactoryThe Automation Factory
The Automation FactoryNathan Milford
 
Understanding and hiding your operations
Understanding and hiding your operationsUnderstanding and hiding your operations
Understanding and hiding your operationsDaniel López Jiménez
 
The Current State of Asynchronous Processing With Ruby
The Current State of Asynchronous Processing With RubyThe Current State of Asynchronous Processing With Ruby
The Current State of Asynchronous Processing With Rubymattmatt
 
Nevmug Lighthouse Automation7.1
Nevmug   Lighthouse   Automation7.1Nevmug   Lighthouse   Automation7.1
Nevmug Lighthouse Automation7.1csharney
 
Just In Time Scalability Agile Methods To Support Massive Growth Presentation
Just In Time Scalability  Agile Methods To Support Massive Growth PresentationJust In Time Scalability  Agile Methods To Support Massive Growth Presentation
Just In Time Scalability Agile Methods To Support Massive Growth PresentationLong Nguyen
 
Practical project automation
Practical project automationPractical project automation
Practical project automationReinout van Rees
 
Securing Rails
Securing RailsSecuring Rails
Securing RailsAlex Payne
 
Os Wilhelm
Os WilhelmOs Wilhelm
Os Wilhelmoscon2007
 
Secure Programming With Static Analysis
Secure Programming With Static AnalysisSecure Programming With Static Analysis
Secure Programming With Static AnalysisConSanFrancisco123
 
When Crypto Attacks! (Yahoo 2009)
When Crypto Attacks! (Yahoo 2009)When Crypto Attacks! (Yahoo 2009)
When Crypto Attacks! (Yahoo 2009)Nate Lawson
 
Nsd, il tuo compagno di viaggio quando Domino va in crash
Nsd, il tuo compagno di viaggio quando Domino va in crashNsd, il tuo compagno di viaggio quando Domino va in crash
Nsd, il tuo compagno di viaggio quando Domino va in crashFabio Pignatti
 
Scaling Rails with memcached
Scaling Rails with memcachedScaling Rails with memcached
Scaling Rails with memcachedelliando dias
 
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernelKernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernelAnne Nicolas
 
Tools and Tips to Diagnose Performance Issues
Tools and Tips to Diagnose Performance IssuesTools and Tips to Diagnose Performance Issues
Tools and Tips to Diagnose Performance IssuesClaudio Miranda
 

Ă„hnlich wie Web 2.0 Performance and Reliability: How to Run Large Web Apps (20)

Make Your Life Easier With Maatkit
Make Your Life Easier With MaatkitMake Your Life Easier With Maatkit
Make Your Life Easier With Maatkit
 
Drizzle Talk
Drizzle TalkDrizzle Talk
Drizzle Talk
 
How the JDeveloper team test JDeveloper at UKOUG'08
How the JDeveloper team test JDeveloper at UKOUG'08How the JDeveloper team test JDeveloper at UKOUG'08
How the JDeveloper team test JDeveloper at UKOUG'08
 
All The Little Pieces
All The Little PiecesAll The Little Pieces
All The Little Pieces
 
Tips on High Performance Server Programming
Tips on High Performance Server ProgrammingTips on High Performance Server Programming
Tips on High Performance Server Programming
 
Becoming a Power User
Becoming a Power UserBecoming a Power User
Becoming a Power User
 
The Automation Factory
The Automation FactoryThe Automation Factory
The Automation Factory
 
Understanding and hiding your operations
Understanding and hiding your operationsUnderstanding and hiding your operations
Understanding and hiding your operations
 
The Current State of Asynchronous Processing With Ruby
The Current State of Asynchronous Processing With RubyThe Current State of Asynchronous Processing With Ruby
The Current State of Asynchronous Processing With Ruby
 
Nevmug Lighthouse Automation7.1
Nevmug   Lighthouse   Automation7.1Nevmug   Lighthouse   Automation7.1
Nevmug Lighthouse Automation7.1
 
Just In Time Scalability Agile Methods To Support Massive Growth Presentation
Just In Time Scalability  Agile Methods To Support Massive Growth PresentationJust In Time Scalability  Agile Methods To Support Massive Growth Presentation
Just In Time Scalability Agile Methods To Support Massive Growth Presentation
 
Practical project automation
Practical project automationPractical project automation
Practical project automation
 
Securing Rails
Securing RailsSecuring Rails
Securing Rails
 
Os Wilhelm
Os WilhelmOs Wilhelm
Os Wilhelm
 
Secure Programming With Static Analysis
Secure Programming With Static AnalysisSecure Programming With Static Analysis
Secure Programming With Static Analysis
 
When Crypto Attacks! (Yahoo 2009)
When Crypto Attacks! (Yahoo 2009)When Crypto Attacks! (Yahoo 2009)
When Crypto Attacks! (Yahoo 2009)
 
Nsd, il tuo compagno di viaggio quando Domino va in crash
Nsd, il tuo compagno di viaggio quando Domino va in crashNsd, il tuo compagno di viaggio quando Domino va in crash
Nsd, il tuo compagno di viaggio quando Domino va in crash
 
Scaling Rails with memcached
Scaling Rails with memcachedScaling Rails with memcached
Scaling Rails with memcached
 
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernelKernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
 
Tools and Tips to Diagnose Performance Issues
Tools and Tips to Diagnose Performance IssuesTools and Tips to Diagnose Performance Issues
Tools and Tips to Diagnose Performance Issues
 

Mehr von adunne

Seedcamp Overview
Seedcamp OverviewSeedcamp Overview
Seedcamp Overviewadunne
 
Netvibes Preview
Netvibes PreviewNetvibes Preview
Netvibes Previewadunne
 
Community Practices: From Forums to Social Networks
Community Practices: From Forums to Social NetworksCommunity Practices: From Forums to Social Networks
Community Practices: From Forums to Social Networksadunne
 
Designing Tag Navigation
Designing Tag NavigationDesigning Tag Navigation
Designing Tag Navigationadunne
 
Social Commerce and Community
Social Commerce and CommunitySocial Commerce and Community
Social Commerce and Communityadunne
 
The Starfish and the Spider
The Starfish and the SpiderThe Starfish and the Spider
The Starfish and the Spideradunne
 
Ginger Preview
Ginger PreviewGinger Preview
Ginger Previewadunne
 
Add Powerful Full Text Search to Your Web App with Solr
Add Powerful Full Text Search to Your Web App with SolrAdd Powerful Full Text Search to Your Web App with Solr
Add Powerful Full Text Search to Your Web App with Solradunne
 
The Impact of Mobile Web 2.0 on the Telecoms Industry
The Impact of Mobile Web 2.0 on the Telecoms IndustryThe Impact of Mobile Web 2.0 on the Telecoms Industry
The Impact of Mobile Web 2.0 on the Telecoms Industryadunne
 
Building Web 2.0: Next-Generation Data Centers
Building Web 2.0: Next-Generation Data CentersBuilding Web 2.0: Next-Generation Data Centers
Building Web 2.0: Next-Generation Data Centersadunne
 
Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...
Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...
Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...adunne
 
Designing for a Web of Data
Designing for a Web of DataDesigning for a Web of Data
Designing for a Web of Dataadunne
 
Web 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web AppsWeb 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web Appsadunne
 
Disrupting the Platform: Harnessing social analytics and other musings on the...
Disrupting the Platform: Harnessing social analytics and other musings on the...Disrupting the Platform: Harnessing social analytics and other musings on the...
Disrupting the Platform: Harnessing social analytics and other musings on the...adunne
 
Your User's Privacy
Your User's PrivacyYour User's Privacy
Your User's Privacyadunne
 
Under the Hood: How Geonames Aggregates Over 35 Sources into One Data Set
Under the Hood: How Geonames Aggregates Over 35 Sources into One Data SetUnder the Hood: How Geonames Aggregates Over 35 Sources into One Data Set
Under the Hood: How Geonames Aggregates Over 35 Sources into One Data Setadunne
 
Scalable Web Architectures: Common Patterns and Approaches
Scalable Web Architectures: Common Patterns and ApproachesScalable Web Architectures: Common Patterns and Approaches
Scalable Web Architectures: Common Patterns and Approachesadunne
 
Trends in Search Engine Optimization and Search Engine Marketing
Trends in Search Engine Optimization and Search Engine MarketingTrends in Search Engine Optimization and Search Engine Marketing
Trends in Search Engine Optimization and Search Engine Marketingadunne
 
Wuala, P2P Online Storage
Wuala, P2P Online StorageWuala, P2P Online Storage
Wuala, P2P Online Storageadunne
 
Breaking Down The Barriers: Design for Accessibility
Breaking Down The Barriers: Design for AccessibilityBreaking Down The Barriers: Design for Accessibility
Breaking Down The Barriers: Design for Accessibilityadunne
 

Mehr von adunne (20)

Seedcamp Overview
Seedcamp OverviewSeedcamp Overview
Seedcamp Overview
 
Netvibes Preview
Netvibes PreviewNetvibes Preview
Netvibes Preview
 
Community Practices: From Forums to Social Networks
Community Practices: From Forums to Social NetworksCommunity Practices: From Forums to Social Networks
Community Practices: From Forums to Social Networks
 
Designing Tag Navigation
Designing Tag NavigationDesigning Tag Navigation
Designing Tag Navigation
 
Social Commerce and Community
Social Commerce and CommunitySocial Commerce and Community
Social Commerce and Community
 
The Starfish and the Spider
The Starfish and the SpiderThe Starfish and the Spider
The Starfish and the Spider
 
Ginger Preview
Ginger PreviewGinger Preview
Ginger Preview
 
Add Powerful Full Text Search to Your Web App with Solr
Add Powerful Full Text Search to Your Web App with SolrAdd Powerful Full Text Search to Your Web App with Solr
Add Powerful Full Text Search to Your Web App with Solr
 
The Impact of Mobile Web 2.0 on the Telecoms Industry
The Impact of Mobile Web 2.0 on the Telecoms IndustryThe Impact of Mobile Web 2.0 on the Telecoms Industry
The Impact of Mobile Web 2.0 on the Telecoms Industry
 
Building Web 2.0: Next-Generation Data Centers
Building Web 2.0: Next-Generation Data CentersBuilding Web 2.0: Next-Generation Data Centers
Building Web 2.0: Next-Generation Data Centers
 
Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...
Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...
Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...
 
Designing for a Web of Data
Designing for a Web of DataDesigning for a Web of Data
Designing for a Web of Data
 
Web 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web AppsWeb 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web Apps
 
Disrupting the Platform: Harnessing social analytics and other musings on the...
Disrupting the Platform: Harnessing social analytics and other musings on the...Disrupting the Platform: Harnessing social analytics and other musings on the...
Disrupting the Platform: Harnessing social analytics and other musings on the...
 
Your User's Privacy
Your User's PrivacyYour User's Privacy
Your User's Privacy
 
Under the Hood: How Geonames Aggregates Over 35 Sources into One Data Set
Under the Hood: How Geonames Aggregates Over 35 Sources into One Data SetUnder the Hood: How Geonames Aggregates Over 35 Sources into One Data Set
Under the Hood: How Geonames Aggregates Over 35 Sources into One Data Set
 
Scalable Web Architectures: Common Patterns and Approaches
Scalable Web Architectures: Common Patterns and ApproachesScalable Web Architectures: Common Patterns and Approaches
Scalable Web Architectures: Common Patterns and Approaches
 
Trends in Search Engine Optimization and Search Engine Marketing
Trends in Search Engine Optimization and Search Engine MarketingTrends in Search Engine Optimization and Search Engine Marketing
Trends in Search Engine Optimization and Search Engine Marketing
 
Wuala, P2P Online Storage
Wuala, P2P Online StorageWuala, P2P Online Storage
Wuala, P2P Online Storage
 
Breaking Down The Barriers: Design for Accessibility
Breaking Down The Barriers: Design for AccessibilityBreaking Down The Barriers: Design for Accessibility
Breaking Down The Barriers: Design for Accessibility
 

KĂĽrzlich hochgeladen

8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCRashishs7044
 
APRIL2024_UKRAINE_xml_0000000000000 .pdf
APRIL2024_UKRAINE_xml_0000000000000 .pdfAPRIL2024_UKRAINE_xml_0000000000000 .pdf
APRIL2024_UKRAINE_xml_0000000000000 .pdfRbc Rbcua
 
Financial-Statement-Analysis-of-Coca-cola-Company.pptx
Financial-Statement-Analysis-of-Coca-cola-Company.pptxFinancial-Statement-Analysis-of-Coca-cola-Company.pptx
Financial-Statement-Analysis-of-Coca-cola-Company.pptxsaniyaimamuddin
 
MAHA Global and IPR: Do Actions Speak Louder Than Words?
MAHA Global and IPR: Do Actions Speak Louder Than Words?MAHA Global and IPR: Do Actions Speak Louder Than Words?
MAHA Global and IPR: Do Actions Speak Louder Than Words?Olivia Kresic
 
Fordham -How effective decision-making is within the IT department - Analysis...
Fordham -How effective decision-making is within the IT department - Analysis...Fordham -How effective decision-making is within the IT department - Analysis...
Fordham -How effective decision-making is within the IT department - Analysis...Peter Ward
 
Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03DallasHaselhorst
 
Chapter 9 PPT 4th edition.pdf internal audit
Chapter 9 PPT 4th edition.pdf internal auditChapter 9 PPT 4th edition.pdf internal audit
Chapter 9 PPT 4th edition.pdf internal auditNhtLNguyn9
 
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607(Best) ENJOY Call Girls in Faridabad Ex | 8377087607
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607dollysharma2066
 
Ten Organizational Design Models to align structure and operations to busines...
Ten Organizational Design Models to align structure and operations to busines...Ten Organizational Design Models to align structure and operations to busines...
Ten Organizational Design Models to align structure and operations to busines...Seta Wicaksana
 
Kenya’s Coconut Value Chain by Gatsby Africa
Kenya’s Coconut Value Chain by Gatsby AfricaKenya’s Coconut Value Chain by Gatsby Africa
Kenya’s Coconut Value Chain by Gatsby Africaictsugar
 
Unlocking the Future: Explore Web 3.0 Workshop to Start Earning Today!
Unlocking the Future: Explore Web 3.0 Workshop to Start Earning Today!Unlocking the Future: Explore Web 3.0 Workshop to Start Earning Today!
Unlocking the Future: Explore Web 3.0 Workshop to Start Earning Today!Doge Mining Website
 
NewBase 19 April 2024 Energy News issue - 1717 by Khaled Al Awadi.pdf
NewBase  19 April  2024  Energy News issue - 1717 by Khaled Al Awadi.pdfNewBase  19 April  2024  Energy News issue - 1717 by Khaled Al Awadi.pdf
NewBase 19 April 2024 Energy News issue - 1717 by Khaled Al Awadi.pdfKhaled Al Awadi
 
Call Us 📲8800102216📞 Call Girls In DLF City Gurgaon
Call Us 📲8800102216📞 Call Girls In DLF City GurgaonCall Us 📲8800102216📞 Call Girls In DLF City Gurgaon
Call Us 📲8800102216📞 Call Girls In DLF City Gurgaoncallgirls2057
 
Kenya Coconut Production Presentation by Dr. Lalith Perera
Kenya Coconut Production Presentation by Dr. Lalith PereraKenya Coconut Production Presentation by Dr. Lalith Perera
Kenya Coconut Production Presentation by Dr. Lalith Pereraictsugar
 
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCRashishs7044
 
Annual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesAnnual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesKeppelCorporation
 
8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCR8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCRashishs7044
 
8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCRashishs7044
 

KĂĽrzlich hochgeladen (20)

8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
8447779800, Low rate Call girls in Shivaji Enclave Delhi NCR
 
APRIL2024_UKRAINE_xml_0000000000000 .pdf
APRIL2024_UKRAINE_xml_0000000000000 .pdfAPRIL2024_UKRAINE_xml_0000000000000 .pdf
APRIL2024_UKRAINE_xml_0000000000000 .pdf
 
Financial-Statement-Analysis-of-Coca-cola-Company.pptx
Financial-Statement-Analysis-of-Coca-cola-Company.pptxFinancial-Statement-Analysis-of-Coca-cola-Company.pptx
Financial-Statement-Analysis-of-Coca-cola-Company.pptx
 
MAHA Global and IPR: Do Actions Speak Louder Than Words?
MAHA Global and IPR: Do Actions Speak Louder Than Words?MAHA Global and IPR: Do Actions Speak Louder Than Words?
MAHA Global and IPR: Do Actions Speak Louder Than Words?
 
Fordham -How effective decision-making is within the IT department - Analysis...
Fordham -How effective decision-making is within the IT department - Analysis...Fordham -How effective decision-making is within the IT department - Analysis...
Fordham -How effective decision-making is within the IT department - Analysis...
 
Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03
 
Chapter 9 PPT 4th edition.pdf internal audit
Chapter 9 PPT 4th edition.pdf internal auditChapter 9 PPT 4th edition.pdf internal audit
Chapter 9 PPT 4th edition.pdf internal audit
 
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607(Best) ENJOY Call Girls in Faridabad Ex | 8377087607
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607
 
Ten Organizational Design Models to align structure and operations to busines...
Ten Organizational Design Models to align structure and operations to busines...Ten Organizational Design Models to align structure and operations to busines...
Ten Organizational Design Models to align structure and operations to busines...
 
Kenya’s Coconut Value Chain by Gatsby Africa
Kenya’s Coconut Value Chain by Gatsby AfricaKenya’s Coconut Value Chain by Gatsby Africa
Kenya’s Coconut Value Chain by Gatsby Africa
 
Unlocking the Future: Explore Web 3.0 Workshop to Start Earning Today!
Unlocking the Future: Explore Web 3.0 Workshop to Start Earning Today!Unlocking the Future: Explore Web 3.0 Workshop to Start Earning Today!
Unlocking the Future: Explore Web 3.0 Workshop to Start Earning Today!
 
No-1 Call Girls In Goa 93193 VIP 73153 Escort service In North Goa Panaji, Ca...
No-1 Call Girls In Goa 93193 VIP 73153 Escort service In North Goa Panaji, Ca...No-1 Call Girls In Goa 93193 VIP 73153 Escort service In North Goa Panaji, Ca...
No-1 Call Girls In Goa 93193 VIP 73153 Escort service In North Goa Panaji, Ca...
 
NewBase 19 April 2024 Energy News issue - 1717 by Khaled Al Awadi.pdf
NewBase  19 April  2024  Energy News issue - 1717 by Khaled Al Awadi.pdfNewBase  19 April  2024  Energy News issue - 1717 by Khaled Al Awadi.pdf
NewBase 19 April 2024 Energy News issue - 1717 by Khaled Al Awadi.pdf
 
Call Us 📲8800102216📞 Call Girls In DLF City Gurgaon
Call Us 📲8800102216📞 Call Girls In DLF City GurgaonCall Us 📲8800102216📞 Call Girls In DLF City Gurgaon
Call Us 📲8800102216📞 Call Girls In DLF City Gurgaon
 
Kenya Coconut Production Presentation by Dr. Lalith Perera
Kenya Coconut Production Presentation by Dr. Lalith PereraKenya Coconut Production Presentation by Dr. Lalith Perera
Kenya Coconut Production Presentation by Dr. Lalith Perera
 
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
8447779800, Low rate Call girls in Kotla Mubarakpur Delhi NCR
 
Annual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesAnnual General Meeting Presentation Slides
Annual General Meeting Presentation Slides
 
Corporate Profile 47Billion Information Technology
Corporate Profile 47Billion Information TechnologyCorporate Profile 47Billion Information Technology
Corporate Profile 47Billion Information Technology
 
8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCR8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCR
 
8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR
 

Web 2.0 Performance and Reliability: How to Run Large Web Apps

  • 1. Artur Bergman sky@crucially.net • Wikia Inc – We are hiring – Community/Bizdev in Germany – Engineers in Poland – http://www.wikia.com/wiki/hiring • O’Reilly Radar – http://radar.oreilly.com/artur/
  • 2. The value of operations • Google • Orkut • Friendster • Myspace
  • 3. Benefits • Users trust your brand • They rely on you • They spend more time on your site • Bad operations wastes R&D money • Fixed amount of time + faster site = more page views
  • 4. Stepchild of Engineering • Product development • Engineering • Operations – Sysadmins? • Why?
  • 5. Operations Engineering • It is engineering • Google terminology - – Site Reliability Engineer • Sure there are sysadmins too, people mananing NOCs and datacenters • Provide career growth
  • 6. Good Engineers • Detail Oriented • Aspire to be operational engineers • Stubborn • Can steer their inner ADD – Interrupt driven • Not the same as good developers
  • 7. Danger signs • Thinks operation is a path to development engineering – Fire them • Want people dedicated to the task • A good operations engineer should spend some time in development • A good development engineer MUST spend some time in operations
  • 8.
  • 9. Debugging • 9 Rules of debugging • http://www.debuggingrules.com/Poster_ download.html – Yes the font is horrible
  • 10. Rule 1: Understand the system • Complexity Kills • No excuse • If you write it, you must know it • If you run it, you must know it • If you buy it, you must know it
  • 11. Rule 3: Quit thinking and look • quot;It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.”
  • 12. Rule 3: Quit thinking and look • What do you look at? • The importance of monitoring • Monitoring • Monitoring • Monitoring
  • 13. My my, confusing term • Monitoring • Alerting • Trending
  • 14. Monitoring • Collects data • Puts into databases • Makes it available for you • Active collection • Passive interaction
  • 15. Alerting • Acts on monitoring data • Severe alerts – Active – Needs action • Passive alerts – Things that need to be done but not right now • DO NOT OVER ALERT • DO NOT CRY WOLF
  • 16. Wikia alerting strategy • When the site is slow • Or down • We send emails and do phone calls • Europe and US West coast • Looking to hire in East Asia • No night time
  • 18. Monitor Tools • Nagios • Cacti • MRTG • Hyperic • Cricket • Ganglia
  • 19. External Monitoring • Use one, tells you what your clients see every x minutes • Keynote • Gomez • Websitepulse (cheap - easy - I like them; no annoying salesforce)
  • 20. Nagios • Alerting • Hassle • C CGI?? • Doesn’t scale
  • 21. Hyperic • Most exciting open source tool • Agent base - self configured • Baseline alerting
  • 22. Cricket MRTG Cacti • Impossible to configure • You need to write tools to do it • Especially Cacti – Somewhat more pleasant than clawing out your eyes
  • 23. Ganglia • We love ganglia • Automatically graphs everything you want - just works • Large scale clusters • Multicast • Zero config • RRD
  • 24. http://ganglia.wikimedia.org/ • 270 hosts • 880 CPU • 2 clusters • 1.2 TB of Memory
  • 26. Custom Ganglia Gmetrics • Write your own gmetric --name='Oldest query' --type=int32 --units='sec' --dmax=65 --value=`echo ' show processlist' | mysql -uroot -ppass | grep -v Sleep | grep -v 'system user' | head -2 | tail -1 | cut -f 6`
  • 27. Custom Ganglia Gmetrics • Or Learn Unix gmetric --name='Oldest query' --type=int32 --units='sec' --dmax=65 --value=`echo ' show processlist' | mysql -uroot -ppass | grep -v Sleep | grep -v 'system user' | head -2 | tail -1 | cut -f 6`
  • 28. Custom Ganglia Gmetrics • Write your own gmetric --name='Oldest query' --type=int32 --units='sec' --dmax=65 --value=`echo ' show processlist' | mysql -uroot -ppass | grep -v Sleep | grep -v 'system user' | head -2 | tail -1 | cut -f 6`
  • 29. Custom Ganglia Gmetrics • Write your own gmetric --name='Oldest query' --type=int32 --units='sec' --dmax=65 --value=`echo ' show processlist' | mysql -uroot -ppass | grep -v Sleep | grep -v 'system user' | head -2 | tail -1 | cut -f 6`
  • 30. Custom Ganglia Gmetrics • Write your own gmetric --name='Oldest query' --type=int32 --units='sec' --dmax=65 --value=`echo ' show processlist' | mysql -uroot -ppass | grep -v Sleep | grep -v 'system user' | head -2 | tail -1 | cut -f 6`
  • 31. Something is wrong • Don’t worry, data warehouse QuickTime™ and a TIFF (Uncompressed) decompressor are needed to see this picture.
  • 32. tcpdump / waveshark • If you suspect the network • Don’t just suspect • LOOK AT IT • Tcpdump / waveshark will tell you – If your packets are lost, delayed or corrupted – Your windowing is wrong
  • 33. Rule 4: Divde and Conquer • Look at the problems in turn • Split between people • Go in the order you suspect is the most likely
  • 34. Rule 5: Change one thing at a time • I cannot stress this enough • IF YOU DO NOT THEN YOU HAVE FAILED TO IDENTIFY THE PROBLEM
  • 35. Rule 6: Keep an audit trail • You might be making things worse • Good for the root cause analysis • Have your shell log all commands – Good practice anyway • Version control
  • 36. Rule 9: If you didn’t fix it, it ain’t fixed • You must do something to fix a problem • Or it will bite you again • And again • And again • They don’t just appear and disappear • Except BGP route convergence :)
  • 37. Process • You need a little • Don’t worry
  • 39. Complexity kills • Design against it • Reuse components • Define standards • Have a few images that all machines look like - reimage machines every now and then for the heck of it. – EC2 forces you to do this
  • 40. MTBF Meduim Time Between Failure • Actually mostly irrelevant • Dealing with failure is more important • Target the right uptime – Complexity scales exponatially with required uptime • Don’t kid yourself, you don’t need 5 nines
  • 41. MTTR Medium Time To Recovery • Important • Noone cares if you fail once a minute – If you recover in 50 ms • If you are down 1 minute a week, you are still going to hit 4 nines (99.99%) • Failures happen, plan how to deal with them
  • 42. Problem found • If it is critical, start a phone conversation • Use IRC to communicate technical data • One person liasons with non technical staff • One person specifically in command • Sleep scheduling ( audit log important )
  • 43. Post crisis • Root cause analysis – Just find out what went wrong – And how to avoid it – Or fix it faster next time if you can’t • Keep track of your uptime
  • 44. Automation • All machines are created equal • Seriously • If you manually make changes • You are wrong – Unless you know what you are doing
  • 45. Best practices • Version control • Gold images • Centralised authentication • Time Sync ( NTP ) • Central logging • ( All of this applies for virtual machines too!)
  • 46. cfengine • Standard automation tool • Written in C • Not much support • Very good • Very annoying
  • 47. contro : l s te i = ( mys te ) i domain = ( mysite .count y ) r sysadm = (mark ) netmask = ( 255.255.255.0 ) ac i t onsequence = ( mounta ll mount nfo i addmounts mounta l l lnks i ) mountpat rn = / ie) ( te $(s t /$ host)) homepat r = ( u? ) te n
  • 48. Puppet • New hip kid on the block • Written in ruby • Better support? • Much nicer syntax • Easier to extend
  • 49. def ne yumrepo (enab i led = true) {c i i onf gfle { /e c quot; t /yum.repos / .d $name.repo”: mode => 644, source => quot; yum/repos / /$name. repoquot;, ensure => $enab led ? { true => fl , ie defau t=> absent l } }}
  • 50. cobb er l • Automatic PXE Installer – Uses kickstart files • Redhat Enterprise • Centos • Fedora • Some support for debian
  • 51. cobbler cobbler system add --name=xen8 --mac=00:19:B9:EE:6D:0A --ip=10.10.30.208 --profile=Centos-5-x86_64 --kopts='ksdevice=00:19:B9:EE:6D:0A console=ttyS1,57600 console=tty0'
  • 52. cobbler cobbler system add --name=xen8 --mac=00:19:B9:EE:6D:0A --ip=10.10.30.208 --profile=Centos-5-x86_64 --kopts='ksdevice=00:19:B9:EE:6D:0A console=ttyS1,57600 console=tty0’
  • 53. koan • Client install tool – Xen – Or OS re-image koan --server=10.10.30.205 --virt -- profile=virt_fc6 --virt-name=otrs
  • 54. Your datacenter • Keep it tidy – Label things, keep cables as short as possible – Have a switch in each rack • If you are small without dedicated DC staff you need – Remote control power switches – Remote console!
  • 55. Virtualization • Please use it • Managing becomes much easier • Power consumption • Need a new test box – The requestor can have it in minutes
  • 56. Power consumption • Maybe not as important in Europe • 8 core machines are more efficient than 1 core • But memcache uses 1 core and all RAM • Get more RAM and virtualise
  • 57. Our network admin boxes • 1 Xen CPU for Vyatta • 1 Xen CPU for LVS • 1 Xen CPU for Squid - Carp • 1 Xen CPU for Squid • 1 Xen CPU for Monitoring • 1 Xen CPU for network tasks • We can have more of these and a loss of one affects us less
  • 58. Vyatta • Opensource router – Really like it – No need to use Cisco
  • 59. LVS • Linux Virtual Server • Low level load balancer • HA • Fast • Doesn’t inspire people to put things in the only place that is hard to scale
  • 60. Squid Carp • Squids configured to hash the urls and send them to specific backend • Very little configuration done • Logging of UDP - no disk IO
  • 61. Squid • As a reverse web accelerator • 90 % of our hits served from RAM in less than 1 ms • Same as wikipedia • We only use RAM cache ( unlike wikipedia) • Cached per user • If not cacheable - cache for a second to redue backend effect
  • 62. App servers • 1 xen cpu for memcache ( 5 GB Ram) • 1 xen cpu for squid ( 5GB Ram ) • 6 xen cpus for apache (6 GB Ram ) • More power efficient, less affected by loss • Applications can’t affect each other
  • 63. Databases • Keep developers on short leash • Report bad queries • Fear object relational mappers
  • 64. Outsourcing • As much as possible • The younger you are as a company the less risk – When you have no users, you have no value • VCs don’t like having their money go into Capex
  • 65. What I want from Vendors • They do what they tell me • They do what I tell them • No annoying up sells, no premium services – I know more about what you are selling than you
  • 66. Services we use • Amazon EC2 and S3 • Panther-Express
  • 67. Panther Express • Fantastic Content Distribution Network • Cheap, simple price list – Take note akamai • Cut delivery time to Europe by 70% • We let our images be cached 1 second to redue load
  • 68. EC2 and S3 • We save all our binlogs to S3 • We save database dumps to S3 • We have monitors running from EC2 • We plan to build a datawarehouse cluster on EC2
  • 69. EC2 Requires Automation • Machine is blank when you bring it up • Download database dump from S3 and replicate up - automatically • Use puppet • Amazon saves you hardware headaches – But complexity is still a problem