2. What is Spark?
• Master: Driver program
• Workers: Executors
• High Availability
• Standby Masters with ZooKeeper
• Single-Node Recovery with Local File System
3. Under the hood
• Resilient Distributed Dataset (RDD)
• Scala + Akka Framework
• Java, Scala, Python API
• Spark SQL, MLlib, Spark Streaming, GraphX