Combining performance, configurability, density, and security at scale has historically been hard with PHP. Early approaches involved CGI, Suhosin, or multiple Apache instances. Then came PHP-FPM. At Pantheon, we've taken PHP-FPM and integrated it with cgroups, namespaces, and systemd socket activation. We use that combination to meet all of these goals at unheard-of densities: thousands of isolated pools per box.
1. PHP at Density and Scale
...with security and consistent performance
2. About Me
● Four Kitchens
● Drupal.org
● Pressflow
● Pantheon
● systemd
3. Broadly Defining Security
Your data...
1. Is accessible to the right people (access)
2. Isn’t accessible to anyone else (access)
3. Is usable (quality of service)
5. Challenge: PHP-FPM Overhead
● Using a full PHP-FPM instance per stack
○ Isolated opcode cache space
○ Defense-in-depth against PHP issues
○ Low-impact reconfiguration
● Idle PHP-FPMs take ~0.5% of a core each
○ At 10k dense, that’s over six cores
● Initial solution used error capture in nginx
○ Masked real failures to connect to PHP-FPM
○ Slower than necessary
○ Production use of HTTP 418 (arguably a bonus)
6. Traditional server sockets: overview
[Diagram: a client connects to nginx instances, each listening on its own TCP port (80, 81, ...).]
If you want a service available, the daemon has to be running.
8. Socket activation: details
● systemd squats on all listeners
○ Watches for incoming traffic with epoll
○ Starts the services/containers on-demand
○ Passes socket to daemon as fd=3+
● Not a proxy (same performance)
● No client awareness
● No CPU or memory overhead when idle
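The handoff described above can be sketched from the daemon's side. This is a minimal illustration in Python, assuming the standard `LISTEN_PID` / `LISTEN_FDS` environment convention that systemd uses for socket activation; the helper names `inherited_fds` and `serve_once` are hypothetical, and this is not Pantheon's actual nginx or PHP-FPM integration.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first inherited descriptor under the systemd convention


def inherited_fds(environ, pid):
    """Return the descriptor numbers handed over by the service manager.

    Returns an empty list when LISTEN_PID does not name this process,
    or when LISTEN_PID / LISTEN_FDS is missing or malformed.
    """
    try:
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + max(count, 0)))


def serve_once(fd):
    """Accept one connection on an inherited listening socket and reply."""
    # Wrap the already-bound, already-listening descriptor: no bind() or listen().
    listener = socket.socket(fileno=fd)
    conn, _ = listener.accept()
    with conn:
        conn.sendall(b"hello from an activated daemon\n")


if __name__ == "__main__":
    # When started outside socket activation there is nothing to serve.
    for fd in inherited_fds(os.environ, os.getpid()):
        serve_once(fd)
```

Because the daemon receives the same listening socket the client connected to, no proxy sits in the data path, which is consistent with the "same performance" point above.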
9. Socket activation: Pantheon’s use
● nginx and PHP-FPM
● MariaDB soon
○ Using an alternative now
● Allows 90%+ of containers to be idle
● Makes bootup sensible
● Reconfiguration pattern is stop, not restart
11. Automount/autofs
● Like socket activation for file system mounts
○ Kernel squats on mount path and looks for traffic
○ Brings up file mount lazily
● Used for FuseDAV (Valhalla client)
13. Challenge: Resource Availability
● Per-site load isn’t predictable
● Different sites compete for resources
○ Between customers
○ Among customers’ own sites
● Traditional prioritization isn’t adequate
○ VMs are too heavyweight
○ Tools like “nice” can cause starvation
○ Generally want burstability
14. cgroups
● Many options
○ Pantheon uses CPUShares and BlockIOWeight
● Keeps things fair under contention
○ Kind of like adding purple ropes when people are queueing
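To make "fair under contention" concrete, here is a simplified model in Python of how proportional CPU shares behave: each busy group gets CPU in proportion to its share value, and a group running alone can burst to the whole core. The function name `cpu_split`, the site names, and the share values are hypothetical; the real scheduler is more involved, so treat this as an approximation rather than a description of the kernel's algorithm.

```python
def cpu_split(shares, busy):
    """Divide one unit of CPU among busy groups in proportion to their shares.

    shares: mapping of group name -> share weight (e.g. a CPUShares value)
    busy:   iterable of group names that currently want CPU
    """
    busy = list(busy)
    total = sum(shares[name] for name in busy)
    if total == 0:
        return {}
    return {name: shares[name] / total for name in busy}


# Hypothetical weights: site_c is given twice the weight of the others.
shares = {"site_a": 1024, "site_b": 1024, "site_c": 2048}

# Under contention, CPU is split by weight.
contended = cpu_split(shares, ["site_a", "site_b", "site_c"])

# With no contention, a single busy site may use the whole core (burstability).
alone = cpu_split(shares, ["site_a"])
```

Unlike a hard cap, weights only matter when groups compete, which is why this approach allows bursting while still preventing one site from starving another.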
16. Customer Experience Monitor
● Runs a representative Drupal site on every container host
● Reports scores to the API and monitoring
● Influences migration and container placement
17. Migration
● At density, rebalancing is important
● Keep state lightweight
○ No OS
○ No runtime
● Mutiny: migration as replication + promotion
18. Challenge: Security Isolation
● Many users
● One kernel
● VMs too heavyweight
● Users run their own code
● Can’t betray expectations
○ Many users develop locally and push code
○ Some customers import existing, working sites
20. Defense in depth
● Application
○ Drupal
● Runtime
○ nginx, PHP-FPM, FuseDAV
● Container: “binding” certificate
○ Linux user, namespaces, etc.
● Container host: “endpoint” certificate
○ Only trusted for the containers assigned
● Platform: root certificate
21. Challenge: Security Responses
● Traditional approach too big a hammer
○ Rebooting hundreds of hosts with 10k+ containers each would be a fail-over storm
○ Basic customers don’t have fail-over
○ Not going to pack hosts less densely
● Customers can run own code
○ May load executables and libraries themselves