This document discusses using Alluxio and ZFS together to provide a hybrid collaborative tiered storage solution with Amazon S3. Alluxio acts as a distributed data storage layer that can mount S3 and HDFS, providing data locality. ZFS works at the kernel level to accelerate read/write speeds by caching data in RAM and automatically promoting and demoting blocks between RAM and local storage such as NVMe SSD, while S3 remains the cold tier behind Alluxio. Benchmark results show the combination of ZFS and NVMe SSDs provides up to 10x faster read speeds and 4x faster write speeds compared to using just Amazon EBS, and up to 15x faster performance than directly accessing data from S3. This hybrid approach provides improved performance for analytic queries in
2. Bazaarvoice
● Founded in 2005 in Austin, TX
● Digital marketing SaaS platforms for ratings and reviews
○ Display & syndicate reviews from brands to retailer websites
○ Reporting & analytics on consumers, reviews, products, etc.
● 2,600 client websites
● 5.4 billion product page views each month
● 900 million unique shoppers each month
3. Reporting & analytics on S3
When you have 100s of TB of data on S3
● Just listing the files is slow
● Download speed in EC2 is limited (50-150Mb/s per node)
● No concept of cache
● No concept of data locality
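To put the per-node download limit in perspective, here is a rough back-of-the-envelope calculation. It assumes the slide's "Mb/s" means megabits per second and uses a decimal terabyte; both are assumptions, and the function names are made up for illustration.

```python
ONE_TB = 10**12  # bytes in a decimal terabyte (assumption)


def transfer_seconds(size_bytes: float, megabits_per_second: float) -> float:
    """Seconds needed to move size_bytes at a line rate given in megabits/s."""
    bits = size_bytes * 8
    return bits / (megabits_per_second * 1_000_000)


def hours_for_one_tb(megabits_per_second: float) -> float:
    """Hours for a single node to download 1 TB at the given rate."""
    return transfer_seconds(ONE_TB, megabits_per_second) / 3600


slow = hours_for_one_tb(50)   # low end of the range quoted on the slide
fast = hours_for_one_tb(150)  # high end of the range quoted on the slide
print(f"1 TB per node: {fast:.1f} to {slow:.1f} hours")
```

If the slide actually meant megabytes per second, the times shrink by a factor of eight, but repeated scans straight from S3 still pay the full transfer cost every time.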
4. AWS S3 : The Need For Speed
● Add tiered storage to S3
○ Hot, warm, cold storage (fastest, fast, and not so fast)
○ Metadata cache
○ Data cache
● Keep data local
○ In the same machine, not via the Ethernet cable
● Compatible with existing services
○ Hadoop, Spark, Hive, Presto, etc.
● Adaptive & highly configurable
○ Symlink for S3
5. Overview
[Diagram labels: App1, App2, Spark, Alluxio, ZFS, Metastore, S3; Hot & Warm, Cold]
● Alluxio
○ Distributed data storage
○ Hadoop compatible
○ By AMPLab
● ZFS
○ OS-level file system
○ Volume manager
○ By Sun Microsystems
● Both are open-source
6. Alluxio : The tiered-storage layer
● Support for native filesystem and Hadoop filesystem
● Distributed and can be installed on every node
○ Provides data locality
● Mount S3, HDFS, etc. to Alluxio
○ Think symlink. No data movement.
● Use Hive metastore to partition data into hot/warm and cold regions
○ Acts as a remote tiered-storage layer
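One way to picture the hot/warm versus cold split is as a rule that decides which location each Hive partition points at. The sketch below is only an illustration of that idea, not the setup used in the talk: the Alluxio master host, the bucket name, the `dt` partition column, and the 30-day window are all hypothetical.

```python
from datetime import date, timedelta

# Hypothetical values for illustration; none of these come from the slides.
ALLUXIO_ROOT = "alluxio://alluxio-master:19998/reviews"
S3_ROOT = "s3a://example-bucket/reviews"
HOT_WINDOW = timedelta(days=30)


def partition_location(partition_day: date, today: date) -> str:
    """Return the storage URI a daily partition would point at."""
    age = today - partition_day
    # Recent partitions go through Alluxio; older ones stay directly on S3.
    root = ALLUXIO_ROOT if age <= HOT_WINDOW else S3_ROOT
    return f"{root}/dt={partition_day.isoformat()}"


def set_location_ddl(table: str, partition_day: date, today: date) -> str:
    """Build the Hive DDL statement that repoints one partition."""
    loc = partition_location(partition_day, today)
    return (
        f"ALTER TABLE {table} PARTITION (dt='{partition_day.isoformat()}') "
        f"SET LOCATION '{loc}'"
    )


print(set_location_ddl("reviews", date(2017, 1, 20), date(2017, 2, 1)))
```

Because the S3 bucket is mounted into Alluxio rather than copied, repointing a partition changes only metastore metadata; queries against the table need no changes.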
7. ZFS : The acceleration layer
● Both a filesystem & a volume manager
○ Mirror write to 2 SSDs -> 2x read speed
● Works in Linux kernel space
○ Works with RAM to accelerate read/write
○ Auto promote/demote blocks from RAM to other storage
○ Used with local NVMe SSD if data is not in RAM
○ Acts as a local tiered-storage layer
● Extremely reliable
○ Automatic block checksum & repair
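The promote/demote behavior can be illustrated with a toy two-tier cache. This is a deliberately simplified model with plain LRU eviction in both tiers, not ZFS's actual caching algorithm; the class and its parameters are hypothetical.

```python
from collections import OrderedDict


class TwoTierCache:
    """Toy model: a small RAM tier in front of a larger SSD tier."""

    def __init__(self, ram_blocks: int, ssd_blocks: int, backing: dict):
        self.ram = OrderedDict()  # least recently used entry first
        self.ssd = OrderedDict()
        self.ram_blocks = ram_blocks
        self.ssd_blocks = ssd_blocks
        self.backing = backing    # stands in for the slow store

    def read(self, key):
        """Return (value, name of the tier that served the read)."""
        if key in self.ram:
            self.ram.move_to_end(key)  # mark as most recently used
            return self.ram[key], "ram"
        if key in self.ssd:
            value, tier = self.ssd.pop(key), "ssd"  # promote out of SSD
        else:
            value, tier = self.backing[key], "backing"
        self._put_ram(key, value)
        return value, tier

    def _put_ram(self, key, value):
        self.ram[key] = value
        while len(self.ram) > self.ram_blocks:
            # Demote the least recently used RAM block to the SSD tier.
            old_key, old_value = self.ram.popitem(last=False)
            self.ssd[old_key] = old_value
            while len(self.ssd) > self.ssd_blocks:
                # Drop from SSD; the block still lives in the backing store.
                self.ssd.popitem(last=False)


backing = {f"block{i}": i for i in range(10)}
cache = TwoTierCache(ram_blocks=2, ssd_blocks=4, backing=backing)
for name in ["block0", "block1", "block2", "block0", "block0"]:
    print(name, cache.read(name)[1])
```

In this run the first three reads come from the backing store, the fourth finds `block0` on the SSD tier after it was demoted, and the fifth is served from RAM.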
14. Hive Monitoring & Performance
● Scanning 200G of data in tiered storage, 500M rows, select *
● Scanning 5G of data in tiered storage, 350M rows, fewer projections
15. Scanning 35G of data in S3, 1.6B rows, count distinct
● Metadata/split calculation ops 60s, majority of the time spent on scanning S3
16. Result
● 5-10X read improvement in Hive
○ Worker can short-circuit and read directly from ZFS instead of S3
○ Move compute to the data
● Easy to debug, collaborative, with a feedback loop
○ Data publishers + data analysts/scientists
● Good for iterating over the same data set multiple times
○ Machine learning
○ Exploratory analysis
● Gives us control over S3
○ More recent data should be faster to access