eBay’s Challenges and Lessons
from Growing an eCommerce Platform to Planet Scale

Randy Shoup
eBay Distinguished Architect

HPTS 2009
October 27, 2009

Challenges at Internet Scale


• eBay manages …
   – Over 89 million active users worldwide
   – 190 million items for sale in 50,000 categories
   – Over 8 billion URL requests per day


• … in a dynamic environment
   – Hundreds of new features per quarter
   – Roughly 10% of items are listed or ended every day


• … worldwide
   – In 39 countries and 10 languages
   – 24x7x365




• >70 billion read / write operations / day

                                                          © 2009 eBay Inc.
Architectural Lessons (round 1)


• 1. Partition Everything
   – Functional partitioning for processing (pools) and
     data (hosts)
   – Horizontal partitioning (“shards”) for data
   [Diagram: functional data partitions User, Item, Transaction, Product, Account, Feedback]
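
A minimal sketch of the two partitioning levels, assuming a stable hash of the record key picks the shard. The slides do not say how eBay assigns shards, so the hash scheme, the shard count, and the host naming below are all illustrative assumptions.

```python
import hashlib

NUM_SHARDS = 8  # hypothetical shard count, chosen only for illustration


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a record key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route(entity: str, key: str) -> str:
    """Pick the functional partition by entity type, then a shard within it."""
    # The "<entity>-db-<n>" host name format is a made-up convention.
    return f"{entity}-db-{shard_for(key)}"
```

Functional partitioning chooses the database family (user, item, and so on); horizontal partitioning then spreads that family's rows across shards.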


• 2. Asynchrony Everywhere
   – Event-driven queues and pipelines
     (at-least-once delivery, order-agnostic)
   – Multicast messaging
     (SRM-inspired techniques for reliability)
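
At-least-once, order-agnostic delivery puts the burden on the consumer: it must tolerate duplicates and must not depend on arrival order. The sketch below shows one common way to do that, assuming each event carries a unique id and that the update is commutative. The event fields (`id`, `account`, `amount`) are hypothetical.

```python
class IdempotentConsumer:
    """Applies each event at most once, regardless of duplicates or ordering."""

    def __init__(self):
        self.seen = set()   # ids of events already applied
        self.totals = {}    # account -> running sum (a commutative update)

    def handle(self, event) -> bool:
        """Apply the event; return False if it was a duplicate delivery."""
        event_id = event["id"]
        if event_id in self.seen:
            return False
        self.seen.add(event_id)
        account = event["account"]
        self.totals[account] = self.totals.get(account, 0) + event["amount"]
        return True
```

Because addition is commutative and duplicates are dropped by id, any delivery order with any number of redeliveries produces the same totals.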




• 3. Automate Everything
   – Adaptive configuration of components
   – Feedback loops and machine learning
Architectural Lessons (round 1)


• 4. Remember Everything Fails
   – Extensive telemetry for failure detection
   – Graceful degradation of functionality
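
Graceful degradation can be sketched as separating the core of a page from its optional parts. The function and parameter names here are hypothetical; the point is only that a failure in an optional dependency is absorbed while a failure in the core one is not.

```python
def render_item_page(item_id, fetch_item, fetch_recommendations):
    """Build a page model; recommendations are optional, the item is not."""
    item = fetch_item(item_id)  # core data: a failure here propagates
    degraded = False
    try:
        recommendations = fetch_recommendations(item_id)
    except Exception:
        # Optional feature failed: serve the page without it.
        recommendations = []
        degraded = True
    return {"item": item, "recommendations": recommendations, "degraded": degraded}
```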


• 5. Embrace Inconsistency
   – Consistency is a spectrum
   – Each use case trades off CAP properties
   – No distributed transactions
   – Minimize inconsistency through state machines
     and careful ordering of operations
   – Eventual consistency through asynchronous
     recovery or reconciliation
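
A small sketch of the last two points, under stated assumptions: the state names and the transition table are invented for illustration, and reconciliation is modeled as a batch pass that repairs a derived copy from the authoritative one.

```python
# Hypothetical item lifecycle; the slides do not list eBay's actual states.
ALLOWED = {
    "LISTED": {"SOLD", "ENDED"},
    "SOLD": {"PAID"},
    "PAID": {"SHIPPED"},
}


def advance(current: str, target: str) -> str:
    """Move to target only if the state machine permits it."""
    if target == current:
        return current  # replaying the same step is a harmless no-op
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {target}")
    return target


def reconcile(authoritative: dict, derived: dict) -> list:
    """Repair the derived copy in place; return the keys that were fixed."""
    repaired = []
    for key, state in authoritative.items():
        if derived.get(key) != state:
            derived[key] = state
            repaired.append(key)
    return sorted(repaired)
```

The state machine limits which inconsistent states can arise at all; the reconciliation pass cleans up whatever still slips through.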




Lesson 6: Expect (R)evolution


• Change is the Only Constant
   –   New entities and data elements
   –   Constant infrastructure evolution
   –   Regular data repartitioning and service migration
   –   Periodic large-scale architectural revolution

• Design for Extensibility
   – Flexible schemas
        •   Extensible interfaces (attributes, k-v pairs)
        •   Heterogeneous object storage
   – Pluggable processing
        •   Disparate systems communicate via events
        •   Within system, processing pipeline controlled by configuration
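
One way to read "processing pipeline controlled by configuration" is a registry of named stages plus a configured list of stage names. The sketch below assumes exactly that; the stage names and the string-processing stages are placeholders, not anything from the talk.

```python
# Registry of available stages; the names are hypothetical.
STAGES = {
    "strip": str.strip,
    "lower": str.lower,
    "title": str.title,
}


def build_pipeline(stage_names):
    """Return a function that applies the configured stages in order."""
    stages = [STAGES[name] for name in stage_names]  # KeyError on unknown stage

    def run(value):
        for stage in stages:
            value = stage(value)
        return value

    return run
```

Changing the behavior then means changing the configured list, for example `build_pipeline(["strip", "lower"])`, rather than changing code.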

• Incremental System Change
   – Decompose every system change into incremental steps
   – Multiple versions and systems coexist
        •   Every change is a rolling upgrade; transitional states are the norm
        •   Version A -> A|B -> B|A -> Version B
   – Strict forward / backward compatibility for data and interfaces
   – Dual data processing and storage (“dual writes”)
   [Diagram: versions A and B coexisting during a rolling upgrade; dual writes to A and B]
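
The A -> A|B -> B|A -> B sequence can be sketched as a store wrapper with four phases. This is an assumption-laden illustration: it reads "A|B" as "write both, read from A" and "B|A" as "write both, read from B", and it omits backfilling existing data into B and handling a failed second write.

```python
from enum import Enum


class Phase(Enum):
    A_ONLY = "A"
    A_PRIMARY = "A|B"   # dual writes, reads still served from A
    B_PRIMARY = "B|A"   # dual writes, reads now served from B
    B_ONLY = "B"


class MigratingStore:
    """Routes reads and writes between an old store (A) and a new one (B)."""

    def __init__(self, store_a, store_b, phase=Phase.A_ONLY):
        self.store_a = store_a
        self.store_b = store_b
        self.phase = phase

    def put(self, key, value):
        if self.phase is not Phase.B_ONLY:
            self.store_a[key] = value
        if self.phase is not Phase.A_ONLY:
            self.store_b[key] = value

    def get(self, key):
        reads_from_a = self.phase in (Phase.A_ONLY, Phase.A_PRIMARY)
        return (self.store_a if reads_from_a else self.store_b)[key]
```

Each phase change is small and reversible, which is what makes every intermediate state safe to run in production.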
Lesson 7: Dependencies Matter


• Minimize and Control Dependencies
   – Service topology constrained by dependencies
       •   Data center moves change latency characteristics (!)
   – Depend only on abstract interface and virtualized endpoint
   – Make QoS parameters (latency, throughput) explicit in SLA

• Consumer Responsibility
   – It is fundamentally the consumer’s responsibility to manage
     unavailability and SLA violations
   – (Un)availability is an inherently Leaky Abstraction
       •   1st Fallacy of Distributed Computing: “The network is reliable”
   – Recovery is typically use-case-specific
       •   Driven by criticality of the operation and the strength of
           dependency
   – Can abstract with standard patterns
       •   Sync or async failover, degraded function, sync or async error
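
A sketch of one of the standard patterns named above, synchronous failover followed by degraded function. The helper name and its arguments are hypothetical; real callers would also bound each attempt with a timeout.

```python
def call_with_fallback(primary, secondary, default):
    """Try the primary endpoint, then a secondary, then degrade to a default."""
    for endpoint in (primary, secondary):
        try:
            return endpoint()
        except (TimeoutError, ConnectionError):
            continue  # treat as unavailable and move to the next option
    return default  # degraded function: the consumer decides what is acceptable
```

The default is chosen by the consumer, which reflects the point that recovery depends on how critical the operation is.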

• Monitor Dependencies Ruthlessly
   – Registries provide WISB (What It Should Be) but only monitoring
     provides WIRI (What It Really Is)
   – Invaluable for problem diagnosis and capacity provisioning
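
Comparing the registry view with the monitored view can be sketched as a set difference per service. The data shapes below (service name mapped to a set of dependency names) are an assumption made for the example.

```python
def dependency_drift(registered: dict, observed: dict) -> dict:
    """Report where observed dependencies differ from the registered ones."""
    drift = {}
    for service in sorted(set(registered) | set(observed)):
        declared = registered.get(service, set())
        actual = observed.get(service, set())
        undeclared = sorted(actual - declared)  # called but never registered
        unused = sorted(declared - actual)      # registered but never called
        if undeclared or unused:
            drift[service] = {"undeclared": undeclared, "unused": unused}
    return drift
```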


Lesson 8: Be Authoritative


• Authoritative Source (“System of Record”)
   – At any given time, every piece of (mission-critical) data has a
     System of Record
   – Authority can be explicitly transferred (failure, migration)
   – Typically transactional database

[Diagram labels: Primary, Search Grid]
• Non-authoritative Sources
   – Every other copy is derived / cached / replicated from
     System of Record
       •   Remote disaster replicas
       •   Search engine
       •   Analytics
       •   Secondary keys
   – Relaxed consistency guarantees with respect to System of
     Record
   – Optimized for alternate access paths or QoS properties
   – Perfectly acceptable for most use-cases
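
As an illustration of a derived, non-authoritative copy, the sketch below rebuilds a secondary-key index from the System of Record. The record layout is hypothetical, and the index is only as fresh as its last rebuild, which is the relaxed consistency the slide describes.

```python
def build_secondary_index(system_of_record: dict, field: str) -> dict:
    """Derive a lookup from a non-primary field to the primary keys."""
    index = {}
    for primary_key, record in system_of_record.items():
        index.setdefault(record[field], []).append(primary_key)
    return index
```

If the index and the System of Record disagree, the System of Record wins and the index is rebuilt or repaired from it.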




Lesson 9: Never Enough Data


• Collect Everything
   –   eBay processes 50TB of new, incremental data per day
   –   eBay analyzes 50PB of data per day
   –   Every historical item and purchase is online or nearline
   –   Requires large-scale distributed storage

• Example: System Monitoring
   – Failures at scale are difficult to diagnose and near-impossible
     to replicate
        •   Requires granular instrumentation of every operation
   – Stream processing for pattern detection and failure prediction
   – Historical data to identify optimization opportunities and
     inform capacity provisioning

• Example: Recommendations and Ranking
   – Collect user behavior in the clickstream
        •   Collect -> filter -> enrich -> aggregate -> store
   – Drive purchase recommendations
   – Drive models that predict value of page view, module
     impression, pixel allocation
   – Predictions in the long tail require massive data
   [Diagram labels: Clickstream, Site Data, Historical Data, Analysis]
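
The collect -> filter -> enrich -> aggregate -> store chain can be sketched with generators. The event fields, the "view" filter, and the category lookup are assumptions for the example, and an in-memory dict stands in for real storage.

```python
from collections import Counter


def run_pipeline(raw_events, item_category):
    """Count item views per category from a stream of click events."""
    collected = iter(raw_events)                                   # collect
    filtered = (e for e in collected
                if e.get("type") == "view" and "item" in e)        # filter
    enriched = ({**e, "category": item_category.get(e["item"], "unknown")}
                for e in filtered)                                 # enrich
    counts = Counter(e["category"] for e in enriched)              # aggregate
    return dict(counts)                                            # store
```

Each stage is lazy, so events flow through one at a time instead of being materialized between steps.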
Lesson 10: Custom Infrastructure


• Right Tool for the Right Job
   – Need to maximize utilization of every resource
       •   Data (memory), processing (CPU), clock time (latency), power (!)
   – One size rarely fits all, particularly at scale
   – Compose from orthogonal, commodity components

• Example: Session and Personalization Cache
   – In-memory volatile KVSS on partitioned MySQL Memory
     Engine
   – Async replication to partitioned backing store (Oracle)
   – State redistributed on node failure
   – Versioning, optimistic concurrency, and resolver pattern for
     conflicts
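
Versioning, optimistic concurrency, and a resolver can be sketched together in a few lines. This is an in-process illustration only: the class, the exception, and the resolver signature are invented for the example and do not describe eBay's cache.

```python
class ConflictError(Exception):
    """Raised when a write is based on a stale version and no resolver is given."""


class VersionedCache:
    def __init__(self):
        self._data = {}  # key -> (version, value)

    def get(self, key):
        """Return (version, value); a missing key reads as version 0."""
        return self._data.get(key, (0, None))

    def put(self, key, value, expected_version, resolver=None):
        """Write if the caller saw the latest version; otherwise resolve or fail."""
        version, current = self.get(key)
        if version != expected_version:
            if resolver is None:
                raise ConflictError(f"{key}: expected v{expected_version}, found v{version}")
            value = resolver(current, value)  # merge the two competing values
        self._data[key] = (version + 1, value)
        return version + 1
```

A writer that loses the race either retries after re-reading or supplies a resolver that merges both values, for example by taking the union of two sets of preferences.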

• Example: Metric Server
   – In-memory hierarchical lookup structure for static data
   – Shared infrastructure for multiple types of static data,
     partitioned horizontally
   – Index built offline from multiple data sources, updated
     periodically


Questions?


• Randy Shoup, eBay Distinguished Architect (rshoup@ebay.com)





More Related Content

What's hot

Incorporating Chargeback In Private Cloud
Incorporating Chargeback In Private CloudIncorporating Chargeback In Private Cloud
Incorporating Chargeback In Private CloudLai Yoong Seng
 
Initial deck on WebSphere eXtreme Scale with WebSphere Commerce Server
Initial deck on WebSphere eXtreme Scale with WebSphere Commerce ServerInitial deck on WebSphere eXtreme Scale with WebSphere Commerce Server
Initial deck on WebSphere eXtreme Scale with WebSphere Commerce ServerBilly Newport
 
Virtualisation at Ringo
Virtualisation at RingoVirtualisation at Ringo
Virtualisation at RingoJeremy Brown
 
Understanding IBM i HA Options
Understanding IBM i HA OptionsUnderstanding IBM i HA Options
Understanding IBM i HA OptionsPrecisely
 
IBM flash systems
IBM flash systems IBM flash systems
IBM flash systems Solv AS
 
XIV Storage deck final
XIV Storage deck finalXIV Storage deck final
XIV Storage deck finalJoe Krotz
 
IBM InterConnect 2015 - IIB Effective Application Development
IBM InterConnect 2015 - IIB Effective Application DevelopmentIBM InterConnect 2015 - IIB Effective Application Development
IBM InterConnect 2015 - IIB Effective Application DevelopmentAndrew Coleman
 
Data Kinetics Products
Data Kinetics ProductsData Kinetics Products
Data Kinetics Productssheena82
 
Improving The Economics of Mainframe SOA Enablement: Exploiting zIIP/zAAP Spe...
Improving The Economics of Mainframe SOA Enablement: Exploiting zIIP/zAAP Spe...Improving The Economics of Mainframe SOA Enablement: Exploiting zIIP/zAAP Spe...
Improving The Economics of Mainframe SOA Enablement: Exploiting zIIP/zAAP Spe...Mike Nelson
 
Storage virtualization on storage devices
Storage virtualization on storage devicesStorage virtualization on storage devices
Storage virtualization on storage devicesShubham_Indrawat
 
informix Embeddability and Autonomics
informix Embeddability and Autonomicsinformix Embeddability and Autonomics
informix Embeddability and AutonomicsJohn Miller
 
VMworld 2013: Next Generation Branch Office Designs
VMworld 2013: Next Generation Branch Office Designs VMworld 2013: Next Generation Branch Office Designs
VMworld 2013: Next Generation Branch Office Designs VMworld
 
Oracle hard and soft parsing
Oracle hard and soft parsingOracle hard and soft parsing
Oracle hard and soft parsingIshaan Guliani
 
FlashSystems 2016 update
FlashSystems 2016 updateFlashSystems 2016 update
FlashSystems 2016 updateJoe Krotz
 
Q2 Briefing Presentation
Q2 Briefing PresentationQ2 Briefing Presentation
Q2 Briefing PresentationKurt Carlsen
 
Dynamic and Elastic Scaling in IBM Streams V4.3
Dynamic and Elastic Scaling in IBM Streams V4.3Dynamic and Elastic Scaling in IBM Streams V4.3
Dynamic and Elastic Scaling in IBM Streams V4.3lisanl
 

What's hot (20)

Incorporating Chargeback In Private Cloud
Incorporating Chargeback In Private CloudIncorporating Chargeback In Private Cloud
Incorporating Chargeback In Private Cloud
 
Initial deck on WebSphere eXtreme Scale with WebSphere Commerce Server
Initial deck on WebSphere eXtreme Scale with WebSphere Commerce ServerInitial deck on WebSphere eXtreme Scale with WebSphere Commerce Server
Initial deck on WebSphere eXtreme Scale with WebSphere Commerce Server
 
Virtualisation at Ringo
Virtualisation at RingoVirtualisation at Ringo
Virtualisation at Ringo
 
Understanding IBM i HA Options
Understanding IBM i HA OptionsUnderstanding IBM i HA Options
Understanding IBM i HA Options
 
IBM flash systems
IBM flash systems IBM flash systems
IBM flash systems
 
KARPAGAM
KARPAGAMKARPAGAM
KARPAGAM
 
RESUME.DOC
RESUME.DOCRESUME.DOC
RESUME.DOC
 
XIV Storage deck final
XIV Storage deck finalXIV Storage deck final
XIV Storage deck final
 
IBM InterConnect 2015 - IIB Effective Application Development
IBM InterConnect 2015 - IIB Effective Application DevelopmentIBM InterConnect 2015 - IIB Effective Application Development
IBM InterConnect 2015 - IIB Effective Application Development
 
Data Kinetics Products
Data Kinetics ProductsData Kinetics Products
Data Kinetics Products
 
Improving The Economics of Mainframe SOA Enablement: Exploiting zIIP/zAAP Spe...
Improving The Economics of Mainframe SOA Enablement: Exploiting zIIP/zAAP Spe...Improving The Economics of Mainframe SOA Enablement: Exploiting zIIP/zAAP Spe...
Improving The Economics of Mainframe SOA Enablement: Exploiting zIIP/zAAP Spe...
 
Storage virtualization on storage devices
Storage virtualization on storage devicesStorage virtualization on storage devices
Storage virtualization on storage devices
 
informix Embeddability and Autonomics
informix Embeddability and Autonomicsinformix Embeddability and Autonomics
informix Embeddability and Autonomics
 
VMworld 2013: Next Generation Branch Office Designs
VMworld 2013: Next Generation Branch Office Designs VMworld 2013: Next Generation Branch Office Designs
VMworld 2013: Next Generation Branch Office Designs
 
Oracle hard and soft parsing
Oracle hard and soft parsingOracle hard and soft parsing
Oracle hard and soft parsing
 
FlashSystems 2016 update
FlashSystems 2016 updateFlashSystems 2016 update
FlashSystems 2016 update
 
Scalability Design Principles - Internal Session
Scalability Design Principles - Internal SessionScalability Design Principles - Internal Session
Scalability Design Principles - Internal Session
 
Q2 Briefing Presentation
Q2 Briefing PresentationQ2 Briefing Presentation
Q2 Briefing Presentation
 
Lecture 9 further permissions
Lecture 9   further permissionsLecture 9   further permissions
Lecture 9 further permissions
 
Dynamic and Elastic Scaling in IBM Streams V4.3
Dynamic and Elastic Scaling in IBM Streams V4.3Dynamic and Elastic Scaling in IBM Streams V4.3
Dynamic and Elastic Scaling in IBM Streams V4.3
 

Similar to eBay’s Challenges and Lessons

Ebay架构原则
Ebay架构原则Ebay架构原则
Ebay架构原则yiditushe
 
E Bay Best Practices For Scaling Websites
E Bay Best Practices For Scaling WebsitesE Bay Best Practices For Scaling Websites
E Bay Best Practices For Scaling WebsitesGeorge Ang
 
Qcon best practices for scaling websites
Qcon best practices for scaling websitesQcon best practices for scaling websites
Qcon best practices for scaling websitesyouzitang
 
10 Tips for Your Journey to the Public Cloud
10 Tips for Your Journey to the Public Cloud10 Tips for Your Journey to the Public Cloud
10 Tips for Your Journey to the Public CloudIntuit Inc.
 
Network Sage™ Into To C Level V1.4
Network Sage™ Into To C Level V1.4Network Sage™ Into To C Level V1.4
Network Sage™ Into To C Level V1.4ikirmer
 
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInJay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInLinkedIn
 
Azug - successfully breeding rabits
Azug - successfully breeding rabitsAzug - successfully breeding rabits
Azug - successfully breeding rabitsYves Goeleven
 
Database nel cloud: una alternativa ai fogli di calcolo per raccogliere, gest...
Database nel cloud: una alternativa ai fogli di calcolo per raccogliere, gest...Database nel cloud: una alternativa ai fogli di calcolo per raccogliere, gest...
Database nel cloud: una alternativa ai fogli di calcolo per raccogliere, gest...VMEngine
 
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...Qian Lin
 
Suning OpenStack Cloud and Heat
Suning OpenStack Cloud and HeatSuning OpenStack Cloud and Heat
Suning OpenStack Cloud and HeatQiming Teng
 
TECHunplugged Austin 2016
TECHunplugged Austin 2016TECHunplugged Austin 2016
TECHunplugged Austin 2016Chris Evans
 
How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?Deepak Shankar
 
How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?Deepak Shankar
 
How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?Deepak Shankar
 
Inter connect2016 yss1841-cloud-storage-options-v4
Inter connect2016 yss1841-cloud-storage-options-v4Inter connect2016 yss1841-cloud-storage-options-v4
Inter connect2016 yss1841-cloud-storage-options-v4Tony Pearson
 
Couchbase b jmeetup
Couchbase b jmeetupCouchbase b jmeetup
Couchbase b jmeetupmysqlops
 
Best Practices for Large-Scale Web Sites
Best Practices for Large-Scale Web SitesBest Practices for Large-Scale Web Sites
Best Practices for Large-Scale Web SitesCraig Dickson
 
Top 6 Reasons to Use a Distributed Data Grid
Top 6 Reasons to Use a Distributed Data GridTop 6 Reasons to Use a Distributed Data Grid
Top 6 Reasons to Use a Distributed Data GridScaleOut Software
 

Similar to eBay’s Challenges and Lessons (20)

Ebay架构原则
Ebay架构原则Ebay架构原则
Ebay架构原则
 
E Bay Best Practices For Scaling Websites
E Bay Best Practices For Scaling WebsitesE Bay Best Practices For Scaling Websites
E Bay Best Practices For Scaling Websites
 
Qcon best practices for scaling websites
Qcon best practices for scaling websitesQcon best practices for scaling websites
Qcon best practices for scaling websites
 
10 Tips for Your Journey to the Public Cloud
10 Tips for Your Journey to the Public Cloud10 Tips for Your Journey to the Public Cloud
10 Tips for Your Journey to the Public Cloud
 
Network Sage™ Into To C Level V1.4
Network Sage™ Into To C Level V1.4Network Sage™ Into To C Level V1.4
Network Sage™ Into To C Level V1.4
 
Operational-Analytics
Operational-AnalyticsOperational-Analytics
Operational-Analytics
 
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInJay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
 
Azug - successfully breeding rabits
Azug - successfully breeding rabitsAzug - successfully breeding rabits
Azug - successfully breeding rabits
 
Master.pptx
Master.pptxMaster.pptx
Master.pptx
 
Database nel cloud: una alternativa ai fogli di calcolo per raccogliere, gest...
Database nel cloud: una alternativa ai fogli di calcolo per raccogliere, gest...Database nel cloud: una alternativa ai fogli di calcolo per raccogliere, gest...
Database nel cloud: una alternativa ai fogli di calcolo per raccogliere, gest...
 
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
 
Suning OpenStack Cloud and Heat
Suning OpenStack Cloud and HeatSuning OpenStack Cloud and Heat
Suning OpenStack Cloud and Heat
 
TECHunplugged Austin 2016
TECHunplugged Austin 2016TECHunplugged Austin 2016
TECHunplugged Austin 2016
 
How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?
 
How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?
 
How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?
 
Inter connect2016 yss1841-cloud-storage-options-v4
Inter connect2016 yss1841-cloud-storage-options-v4Inter connect2016 yss1841-cloud-storage-options-v4
Inter connect2016 yss1841-cloud-storage-options-v4
 
Couchbase b jmeetup
Couchbase b jmeetupCouchbase b jmeetup
Couchbase b jmeetup
 
Best Practices for Large-Scale Web Sites
Best Practices for Large-Scale Web SitesBest Practices for Large-Scale Web Sites
Best Practices for Large-Scale Web Sites
 
Top 6 Reasons to Use a Distributed Data Grid
Top 6 Reasons to Use a Distributed Data GridTop 6 Reasons to Use a Distributed Data Grid
Top 6 Reasons to Use a Distributed Data Grid
 

Recently uploaded

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 

Recently uploaded (20)

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 

eBay’s Challenges and Lessons

  • 1. eBay’s Challenges and Lessons from Growing an eCommerce Platform to Planet Scale Randy Shoup eBay Distinguished Architect HPTS 2009 October 27, 2009
  • 2. Challenges at Internet Scale • eBay manages … – Over 89 million active users worldwide – 190 million items for sale in 50,000 categories – Over 8 billion URL requests per day • … in a dynamic environment – Hundreds of new features per quarter – Roughly 10% of items are listed or ended every day • … worldwide – In 39 countries and 10 languages – 24x7x365 • >70 billion read / write operations / day © 2009 eBay Inc.
  • 3. Architectural Lessons (round 1) • 1. Partition Everything – Functional partitioning for processing (pools) and data (hosts) User Item Transaction – Horizontal partitioning (“shards”) for data Product Account Feedback • 2. Asynchrony Everywhere – Event-driven queues and pipelines (at-least-once delivery, order-agnostic) – Multicast messaging (SRM-inspired techniques for reliability) • 3. Automate Everything – Adaptive configuration of components – Feedback loops and machine learning © 2009 eBay Inc.
  • 4. Architectural Lessons (round 1)
      • 4. Remember Everything Fails
         – Extensive telemetry for failure detection
         – Graceful degradation of functionality
      • 5. Embrace Inconsistency
         – Consistency is a spectrum
         – Each use case trades off CAP properties
         – No distributed transactions
         – Minimize inconsistency through state machines and careful ordering of operations
         – Eventual consistency through asynchronous recovery or reconciliation
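
A rough sketch of the "state machines and careful ordering of operations" bullet on slide 4: transitions are checked against an explicit table, and a repeated request for the current state is treated as a no-op so that redelivered events are harmless. The listing states and the `advance` function are hypothetical; the slides do not name any concrete states.

```python
from enum import Enum


class ListingState(Enum):
    # Hypothetical states, chosen only for illustration.
    DRAFT = 1
    ACTIVE = 2
    ENDED = 3


# Explicit table of legal transitions; anything else is rejected.
ALLOWED = {
    ListingState.DRAFT: {ListingState.ACTIVE},
    ListingState.ACTIVE: {ListingState.ENDED},
    ListingState.ENDED: set(),
}


def advance(current: ListingState, target: ListingState) -> ListingState:
    """Return the new state, ignoring duplicates and rejecting illegal jumps."""
    if target == current:
        return current  # duplicate delivery of the same event: no-op
    if target in ALLOWED[current]:
        return target
    raise ValueError(f"illegal transition {current.name} -> {target.name}")


state = advance(ListingState.DRAFT, ListingState.ACTIVE)
state = advance(state, ListingState.ACTIVE)  # redelivered event, no change
print(state.name)
```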
  • 5. Lesson 6: Expect (R)evolution
      • Change is the Only Constant
         – New entities and data elements
         – Constant infrastructure evolution
         – Regular data repartitioning and service migration
         – Periodic large-scale architectural revolution
      • Design for Extensibility
         – Flexible schemas
            • Extensible interfaces (attributes, k-v pairs)
            • Heterogeneous object storage
         – Pluggable processing
            • Disparate systems communicate via events
            • Within system, processing pipeline controlled by configuration
      • Incremental System Change
         – Decompose every system change into incremental steps
         – Multiple versions and systems coexist
            • Every change is a rolling upgrade; transitional states are the norm
            • Version A -> A|B -> B|A -> Version B
         – Strict forward / backward compatibility for data and interfaces
         – Dual data processing and storage (“dual writes”)
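
The "Version A -> A|B -> B|A -> Version B" progression with dual writes on slide 5 might look roughly like the sketch below. Reading "A|B" as "write both, read from A" and "B|A" as "write both, read from B" is an interpretation, not something the slide states; `Phase` and `MigratingStore` are hypothetical names.

```python
from enum import Enum


class Phase(Enum):
    A_ONLY = "A"        # write A, read A
    A_PRIMARY = "A|B"   # dual write, read A
    B_PRIMARY = "B|A"   # dual write, read B
    B_ONLY = "B"        # write B, read B


class MigratingStore:
    """Wraps two dict-like stores and routes calls by migration phase."""

    def __init__(self, store_a, store_b, phase=Phase.A_ONLY):
        self.a = store_a
        self.b = store_b
        self.phase = phase

    def put(self, key, value):
        if self.phase is not Phase.B_ONLY:
            self.a[key] = value
        if self.phase is not Phase.A_ONLY:
            self.b[key] = value

    def get(self, key):
        reads_from_a = self.phase in (Phase.A_ONLY, Phase.A_PRIMARY)
        return (self.a if reads_from_a else self.b).get(key)


old, new = {}, {}
store = MigratingStore(old, new)
store.put("item:1", "v1")          # lands in A only
store.phase = Phase.A_PRIMARY
store.put("item:2", "v2")          # lands in both A and B
print(sorted(old), sorted(new))
```

Data written before dual writes began (here `item:1`) exists only in A, so a backfill step would be needed before switching reads to B; the sketch leaves that out.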
  • 6. Lesson 7: Dependencies Matter
      • Minimize and Control Dependencies
         – Service topology constrained by dependencies
            • Data center moves change latency characteristics (!)
         – Depend only on abstract interface and virtualized endpoint
         – Make QoS parameters (latency, throughput) explicit in SLA
      • Consumer Responsibility
         – It is fundamentally the consumer’s responsibility to manage unavailability and SLA violations
         – (Un)availability is an inherently Leaky Abstraction
            • 1st Fallacy of Distributed Computing: “The network is reliable”
         – Recovery is typically use-case-specific
            • Driven by criticality of the operation and the strength of dependency
         – Can abstract with standard patterns
            • Sync or async failover, degraded function, sync or async error
      • Monitor Dependencies Ruthlessly
         – Registries provide WISB but only monitoring provides WIRI
         – Invaluable for problem diagnosis and capacity provisioning
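
Two of the standard recovery patterns named on slide 6, synchronous failover and degraded function, can be sketched from the consumer's side as follows. The `call_with_failover` helper and the stand-in dependency functions are hypothetical, and catching every `Exception` is a simplification a real consumer would narrow.

```python
def call_with_failover(primary, secondary, degraded_value):
    """Try the primary, then the secondary, then fall back to a degraded result.

    Returns (value, mode) so the caller can tell a real answer
    from a degraded one.
    """
    for dependency in (primary, secondary):
        try:
            return dependency(), "ok"
        except Exception:
            # Simplification: treat any error as "dependency unavailable".
            continue
    return degraded_value, "degraded"


def broken_service():
    raise ConnectionError("dependency unavailable")


def healthy_service():
    return ["item:1", "item:2"]


print(call_with_failover(broken_service, healthy_service, []))
print(call_with_failover(broken_service, broken_service, []))
```

Whether an empty result is an acceptable degraded answer depends on the criticality of the operation, which is the slide's point that recovery is use-case-specific.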
  • 7. Lesson 8: Be Authoritative
      • Authoritative Source (“System of Record”)
         – At any given time, every piece of (mission-critical) data has a System of Record
         – Authority can be explicitly transferred (failure, migration)
         – Typically transactional database
      • Non-authoritative Sources
         – Every other copy is derived / cached / replicated from System of Record
            • Remote disaster replicas
            • Search engine
            • Analytics
            • Secondary keys
         – Relaxed consistency guarantees with respect to System of Record
         – Optimized for alternate access paths or QoS properties
         – Perfectly acceptable for most use-cases
      [Diagram labels: Primary, Search Grid]
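
As a small illustration of slide 7, the sketch below treats a plain dict as the System of Record and derives a secondary-key index from it. The index is a non-authoritative copy: it stays stale until it is rebuilt. The record layout and `build_secondary_index` are assumptions for illustration, not eBay's design.

```python
from collections import defaultdict


def build_secondary_index(records: dict, field: str) -> dict:
    """Derive a field -> [primary keys] index from the system of record."""
    index = defaultdict(list)
    for primary_key, record in records.items():
        index[record[field]].append(primary_key)
    return dict(index)


# Authoritative copy: primary key -> record.
system_of_record = {
    1: {"seller": "alice"},
    2: {"seller": "bob"},
    3: {"seller": "alice"},
}

by_seller = build_secondary_index(system_of_record, "seller")

# A later write reaches the system of record first; the derived index
# does not see it until the next rebuild (relaxed consistency).
system_of_record[4] = {"seller": "carol"}
print("carol" in by_seller)
```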
  • 8. Lesson 9: Never Enough Data
      • Collect Everything
         – eBay processes 50TB of new, incremental data per day
         – eBay analyzes 50PB of data per day
         – Every historical item and purchase is online or nearline
         – Requires large-scale distributed storage
      • Example: System Monitoring
         – Failures at scale are difficult to diagnose and near-impossible to replicate
            • Requires granular instrumentation of every operation
         – Stream processing for pattern detection and failure prediction
         – Historical data to identify optimization opportunities and inform capacity provisioning
      • Example: Recommendations and Ranking
         – Collect user behavior in the clickstream
            • Collect -> filter -> enrich -> aggregate -> store
         – Drive purchase recommendations
         – Drive models that predict value of page view, module impression, pixel allocation
         – Predictions in the long tail require massive data
      [Diagram labels: Historical Data, Analysis, Clickstream, Site Data]
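
The "Collect -> filter -> enrich -> aggregate -> store" chain on slide 8 maps naturally onto a pipeline of generators. The event fields (`type`, `item`), the category lookup, and all function names below are invented for illustration; the slides do not describe the actual event format.

```python
from collections import Counter


def filter_events(events):
    """Keep only well-formed page-view events."""
    for event in events:
        if event.get("type") == "view" and "item" in event:
            yield event


def enrich(events, categories):
    """Attach a category to each event from a lookup table."""
    for event in events:
        yield {**event, "category": categories.get(event["item"], "unknown")}


def aggregate(events):
    """Count views per category."""
    return Counter(event["category"] for event in events)


def run_pipeline(raw_events, categories, store: Counter) -> Counter:
    """Run collect -> filter -> enrich -> aggregate and add counts to the store."""
    store.update(aggregate(enrich(filter_events(raw_events), categories)))
    return store


clickstream = [
    {"type": "view", "item": "i1"},
    {"type": "bot", "item": "i1"},   # dropped by the filter stage
    {"type": "view", "item": "i2"},
]
print(run_pipeline(clickstream, {"i1": "books"}, Counter()))
```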
  • 9. Lesson 10: Custom Infrastructure
      • Right Tool for the Right Job
         – Need to maximize utilization of every resource
            • Data (memory), processing (CPU), clock time (latency), power (!)
         – One size rarely fits all, particularly at scale
         – Compose from orthogonal, commodity components
      • Example: Session and Personalization Cache
         – In-memory volatile KVSS on partitioned MySQL Memory Engine
         – Async replication to partitioned backing store (Oracle)
         – State redistributed on node failure
         – Versioning, optimistic concurrency, and resolver pattern for conflicts
      • Example: Metric Server
         – In-memory hierarchical lookup structure for static data
         – Shared infrastructure for multiple types of static data, partitioned horizontally
         – Index built offline from multiple data sources, updated periodically
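
The "versioning, optimistic concurrency, and resolver pattern for conflicts" bullet on slide 9 can be sketched with a versioned in-memory map. This is a single-threaded illustration under assumed names (`VersionedCache`, `put_with_resolver`); it is not a description of eBay's actual cache.

```python
class VersionConflict(Exception):
    """Raised when a write is based on a stale version."""


class VersionedCache:
    def __init__(self):
        self._data = {}  # key -> (version, value)

    def get(self, key):
        """Return (version, value); a missing key has version 0."""
        return self._data.get(key, (0, None))

    def put(self, key, value, expected_version):
        """Write only if the caller saw the latest version."""
        current_version, _ = self.get(key)
        if current_version != expected_version:
            raise VersionConflict(key)
        self._data[key] = (current_version + 1, value)
        return current_version + 1


def put_with_resolver(cache, key, new_value, read_version, resolver):
    """On conflict, merge with the stored value via resolver and retry once."""
    try:
        return cache.put(key, new_value, read_version)
    except VersionConflict:
        current_version, current_value = cache.get(key)
        merged = resolver(current_value, new_value)
        return cache.put(key, merged, current_version)


cache = VersionedCache()
cache.put("session:1", {"viewed:a"}, expected_version=0)
# A second writer still holds version 0; the resolver unions both sets.
put_with_resolver(cache, "session:1", {"viewed:b"}, 0, lambda old, new: old | new)
print(cache.get("session:1"))
```

The resolver is use-case-specific: a set union suits "items viewed", while other session data might need last-writer-wins or a custom merge.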
  • 10. Questions?
      • Randy Shoup, eBay Distinguished Architect (rshoup@ebay.com)