2. Agenda
⢠What makes Yahoo! Yahoo!?
⢠Hardware Infrastructure
⢠Software Infrastructure
⢠Operational Infrastructure
⢠Process
⢠Examples
2
3. What makes Yahoo! Yahoo!?
⢠What do these sites have in common?
â Del.icio.us
â Flickr
â Yahoo! Groups
â Yahoo! Mail
â Bix
3
4. What makes Yahoo! Yahoo!?
⢠Accountability at the property level
â Architecture
â Application Operations
â Infrastructure Decisions
⢠Incubator Environment
â Properties function independently on a
common hardware platform
â Highly cost-conscious
â Open-source attitude
4
5. What makes Yahoo! Yahoo!?
⢠Standards at the infrastructure level
â Hardware/Software platform
â Configuration Management
â Operational tools and best practices
⢠Executive Involvement
â Cost
â Robustness
â Redundancy
5
7. Hardware Infrastructure
⢠Shared Components
â Network, Data Center, NAS
â Centrally managed by infrastructure
team
⢠Load Balancing
â DSR is preferred model
â Proxy load balancing only where
necessary
7
8. Hardware Infrastructure
⢠Hardware (x86, RAID/SCSI)
â Jointly managed by properties and ops
⢠Hardware Selection
â Price/Performance is a constant
consideration
â Supply chain and provisioning cost
â Reliability vs. Price
⢠Single-Homed hosts (even databases)
⢠Pooling across multiple switches
⢠Fast Failover to mitigate risk of switch failure
8
9. Hardware Infrastructure Example
⢠Layered Infrastructure
The Internet
⢠Hosts distributed
across multiple racks Router
for power/network Load
redundancy at the
Balancer
pool level Aggregation
Switch
⢠Really Big Load Rack
Switches
Balancers doing DSR
Hosts
9
11. Software Infrastructure
⢠OS (FreeBSD, moving to RHEL)
⢠Databases (MySQL, Oracle)
⢠Development Platforms
â PHP (most properties)
â C/C++ (primary infrastructure platform)
â Java
â Python
11
12. Software Infrastructure
⢠Installable components
â Managed through yinst package
manager
â Stored on common distribution server
â Examples: yapache, yts, yfor, ymon,
yiv, vespa
12
13. Software Infrastructure
⢠More about yinst
â Robust Package Manager
⢠Installation, Versioning, Scripting
â Implementation
⢠Software installed on distribution cluster (package
repository)
⢠Hosts then pull software (via proxies)
⢠Software stored under a common root
⢠Used for everything from perl modules to
common components to applications
13
15. Software Infrastructure - Bix
The Internet
Akamai Akamai
(CDN) (CDN)
YTS YTS
(Reverse Proxy) (Reverse Proxy)
Primary Colo
Backup Colo
Yfor Yfor
(failover resolver) (failover resolver)
Web Servers
Web Servers Web Servers
Web Servers Web Servers
Web Servers
Media Servers Media Servers
Media Servers Media Servers
Media Servers Media Servers
⢠Global Server Load Balancing between sites
⢠YTS provides Reverse Proxy and Connection Management
⢠Yfor provides fast failover from colo to colo
⢠Media is served via a content delivery network for performance
and to reduce load on servers
15
16. Software Infrastructure - Bix
⢠Yfor Failover Resolver used for fast failover of
database connections
⢠Dual Master MySQL setup for write hosts
⢠Media storage on NetApp NAS device, with snap-
mirroring to backup data center
16
17. Software Infrastructure - Bix
⢠Yapache reverse proxy in Web Server
front of Tomcat instance
Yapache (Yahoo-improved Apache)
⢠PHP used to access Mod_proxy
PHP
Yahoo shared services (Reverse Proxy)
Yahoo
Shared
Services
Static
⢠Static files served from Files
disk
⢠Fairly standard Java Tomcat
environment (Spring,
Bix Application
Spring Hibernate
Lucene ehcache
Hibernate, ehcache, c3po, Hessian c3po
log4j MySQL Connector
log4j, etc.)
17
18. Software Infrastructure - Groups
⢠Inbound Groups mail hits a
qmail cluster
⢠Mail filtered against real-time
blacklist
⢠Mail forwarded to second
qmail cluster
⢠Proprietary anti-spam
algorithms applied
⢠Mail forwarded to group
members
⢠Mail stored on archive servers
⢠Oracle RAC clusters store
metadata
⢠Periodic âElectric Potatoâ
measures QoS
18
19. Software Infrastructure â Groups
⢠Dynamic content served via web pool running python/c++ application
⢠CSS and images served via a squid-fronted pool
⢠Group photos on Y! photos infrastructure backed by Yahoo! Media DB
(YMDB)
⢠Database feature implemented as sleepycat DB hosted on message
store
⢠Calendar feature implemented via API calls to calendar.yahoo.com
19
21. Operational Infrastructure
⢠Common Monitoring Infrastructure
â Nagios
⢠Main monitor for clusters
⢠Numerous standard plugins
⢠Standards/Best Practices around custom plugins
â Ywatch
⢠Basic monitoring of machines over SNMP
⢠Heartbeats plus fundamental metrics (IO, CPU, Disk, etc.)
â Ymon
⢠NRPE/NSCA on steroids
⢠Automated forwarding of active and passive checks
⢠Scripted setup
â Drraw
⢠Data Visualization
⢠Deep integration with Nagios and ymon
21
22. Operational Infrastructure
â Rollup Monitoring
⢠Clusters rolled up to centralized monitoring console
⢠Prioritization and correlation of events
â Internal Site QOS Monitoring
⢠QOS monitoring for sites
⢠Response time and availability
â âThe OCâ
⢠24x7, worldwide operations center
⢠Provides tier 1 and 2 support
â Centralized CMDB
⢠Configuration Management DB â manages every device
⢠Contact info, escalations, and runbooks included
22
23. Operational Infrastructure Example
⢠Application Servers
Media
perform checks which Media Web
Web Servers DB Servers
Media Web
Web Servers
Servers Web Servers
Servers Servers
Servers Servers
are registered by
Nagios as passive
checks ymon server
SNMP-Based Checks
⢠Metrics are ymond
aggregated by metrics
metrics module aggregator
yapache nagios
⢠On-demand graphing grapher (rrdtool/drraw)
is provided by drraw forwarding agent
⢠Nagios alerts are
forwarded to central
ywatch console
ywatch server
listener
yapache Ops Console
23
25. Process and Standards
⢠Hardware Review Committee
â Strong emphasis on economics
â Personal attention from David Filo
⢠Software Review Committee
â Thinking through major licensing decisions
⢠Business Continuity Planning
â Required of all properties
â Must have and test backup data center
⢠Paranoids
â Ongoing site scans
â Enforcement of standards
25