This document summarizes Presto, the distributed SQL query engine used at Facebook for ad-hoc and interactive warehouse queries, batch processing, and analytics over both the data warehouse and specialized data stores. It outlines Presto's architecture, deployment, usage statistics, and features, along with enhancements built for specific Facebook use cases: user-facing products, large data sets, and reliable data loading.
2. Presto @ Facebook
• Ad-hoc/interactive queries for warehouse
• Batch processing for warehouse
• Analytics for user-facing products
• Analytics over various specialized stores
6. Stats
• 1000s of internal daily active users
• Millions of queries each month
• Scan PBs of data every day
• Process trillions of rows every day
• 10s of concurrent queries
15. Requirements
• Large data sets
• Seconds to minutes latency
• Predictable performance
• 5-15 minute load latency
• Reliable data loads (no duplicates, no missing data)
• 10s of concurrent queries
18. Additional Features
• Full-featured, atomic DDL
• Table statistics
• Tiered storage
• Atomic data loads
• Physical organization
19. Table Statistics
• Table is divided into shards
• Each shard is stored in a separate replication unit (i.e., file)
• Typically 1 to 10 million rows per shard
• Node assignment and stats stored in MySQL
20. Table Schema in MySQL
Tables
  id | name
  1  | orders
  2  | line_items
  3  | parts

table1 shards
  uuid | nodes | c1_min | c1_max | c2_min | c2_max  | c3_min | c3_max
  43a5 | A     | 30     | 90     | cat    | dog     | 2014   | 2014
  6701 | C     | 34     | 45     | apple  | banana  | 2005   | 2015
  9c0f | A,D   | 25     | 26     | cheese | cracker | 1982   | 1994
  df31 | B     | 23     | 71     | tiger  | zebra   | 1999   | 2006
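These per-shard min/max ranges let the engine skip shards that cannot match a query predicate. The sketch below is hypothetical Java, not Presto's actual code; ShardRange, pruneEquality, and the single-column view of the stats are illustrative assumptions.

import java.util.List;
import java.util.UUID;
import static java.util.stream.Collectors.toList;

// Hypothetical view of one row of the shards table above, restricted to
// a single numeric column c1 and its stored min/max range.
record ShardRange(UUID uuid, String nodes, long c1Min, long c1Max) {}

class ShardPruner
{
    // For a predicate like "WHERE c1 = 50", keep only shards whose
    // [c1_min, c1_max] range can contain the value: 43a5 (30-90) and
    // df31 (23-71) survive; 6701 (34-45) and 9c0f (25-26) are skipped.
    static List<ShardRange> pruneEquality(List<ShardRange> shards, long value)
    {
        return shards.stream()
                .filter(shard -> shard.c1Min() <= value && value <= shard.c1Max())
                .collect(toList());
    }
}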
22. Tiered Storage
• One copy on local, expensive flash storage
• Backup copy in a cheap, durable backup tier
• Currently Gluster internally, but can be anything durable
• Only assumes GET and PUT methods with client-assigned IDs (sketched below)
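The backup-tier contract described above could be as small as the interface below. This is a minimal sketch under the stated GET/PUT assumption; BackupStore, put, and get are illustrative names, not Presto's actual API.

import java.io.File;
import java.util.UUID;

// Hypothetical minimal contract for the backup tier: the client assigns
// the ID, so any durable blob store (Gluster or otherwise) can implement it.
public interface BackupStore
{
    // PUT: store the file's bytes under the caller-assigned id
    void put(UUID id, File file);

    // GET: copy the stored bytes for id into the target file
    void get(UUID id, File target);
}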
23. Atomic Data Loads
• Import data periodically from streaming event system
• Internally a Scribe-based system, similar to Kafka or Kinesis
• Provides continuation tokens
• Loads performed using SQL
24. Atomic Data Loads
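-- Load exactly the window of events between the last committed
-- continuation token and the token captured at the start of this job;
-- ${last_token} and ${next_token} are substituted by the loader process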
INSERT INTO target
SELECT *
FROM source_stream
WHERE token BETWEEN ${last_token} AND ${next_token}
25. Loader Process
1. Record new job with “now” token in MySQL
2. Execute INSERT from last committed token to “now” token with external batch id
3. Wait for INSERT to commit (check external batch status)
4. Record job complete
5. Repeat
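A minimal Java sketch of one iteration of this loop. MetadataStore, EventStream, and PrestoClient are hypothetical stand-ins for the MySQL job metadata, the streaming event system, and the Presto client; none of these names come from Presto itself.

import java.util.UUID;

// Hypothetical stand-ins for MySQL job metadata, the streaming event
// system, and the Presto client.
interface MetadataStore
{
    long lastCommittedToken();
    void recordJob(String batchId, long nowToken);
    boolean batchCommitted(String batchId);  // external batch status
    void recordComplete(String batchId);
}

interface EventStream
{
    long currentToken();  // continuation token for "now"
}

interface PrestoClient
{
    void executeInsert(long fromToken, long toToken, String batchId);
}

class Loader
{
    void runOnce(MetadataStore metadata, EventStream stream, PrestoClient presto)
            throws InterruptedException
    {
        long now = stream.currentToken();
        String batchId = UUID.randomUUID().toString();
        metadata.recordJob(batchId, now);                                   // 1. record new job
        presto.executeInsert(metadata.lastCommittedToken(), now, batchId);  // 2. run the INSERT
        while (!metadata.batchCommitted(batchId)) {                         // 3. wait for commit
            Thread.sleep(1000);
        }
        metadata.recordComplete(batchId);                                   // 4. record complete
    }                                                                       // 5. caller repeats
}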
26. Failure Recovery
• Loader crash
  • Check status of jobs using the external batch id
• INSERT hang
  • Cancel the query and roll back the job (verify status after cancel to avoid a race; see the sketch below)
• Duplicate loader processes
  • The process guarantees only one job can complete
• Monitor for lack of progress (also catches the case where no loaders are running)
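The hung-INSERT case is the subtle one: the query may commit between the hang and the cancel. Below is a minimal sketch of the status check, with hypothetical JobMetadata and QueryRunner interfaces (illustrative names only, not Presto's actual code):

// Hypothetical recovery for a hung INSERT: cancel, then re-check the
// external batch status before choosing between complete and rollback.
interface JobMetadata
{
    boolean batchCommitted(String batchId);
    void recordComplete(String batchId);
    void rollbackJob(String batchId);
}

interface QueryRunner
{
    void cancel(String batchId);
}

class HungInsertRecovery
{
    void recover(JobMetadata metadata, QueryRunner presto, String batchId)
    {
        presto.cancel(batchId);
        if (metadata.batchCommitted(batchId)) {
            // The INSERT won the race and committed; keep its data
            metadata.recordComplete(batchId);
        }
        else {
            // Safe to roll back: the batch never committed
            metadata.rollbackJob(batchId);
        }
    }
}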
30. Background Organization
• Compaction (sketched below)
• Balance data across nodes
• Eager data recovery (from backup)
• Garbage collection
  • Junk created by compaction, delete, balance, and recovery
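As one example of these jobs, compaction rewrites undersized shards into a single shard near the 1 to 10 million row target from slide 19, leaving the old files as junk for garbage collection. The sketch below is hypothetical; Shard, rowCount, TARGET_ROWS, and the greedy selection are illustrative assumptions, not Presto's actual implementation.

import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Hypothetical shard descriptor; the real metadata in MySQL also carries
// node assignment and per-column min/max stats.
record Shard(UUID uuid, long rowCount) {}

class CompactionPicker
{
    static final long TARGET_ROWS = 10_000_000;  // upper end of 1-10M rows

    // Greedily collect undersized shards until they add up to roughly one
    // full-size shard; a background job would rewrite them as one new
    // shard and leave the old files for garbage collection.
    static List<Shard> pickCompactionSet(List<Shard> shards)
    {
        List<Shard> picked = new ArrayList<>();
        long rows = 0;
        for (Shard shard : shards) {
            if (shard.rowCount() < TARGET_ROWS / 2) {
                picked.add(shard);
                rows += shard.rowCount();
                if (rows >= TARGET_ROWS) {
                    break;
                }
            }
        }
        return picked;
    }
}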
31. Future Use Cases
• Hot data cache for Hadoop data
• 0-N local copies of “backup” tier
• Query results cache
• Raw (not rolled-up) data store for sharded MySQL customers
• Materialized view store