The Power of Data Orchestration:
Storage Acceleration and Servitization at Shopee
Tianbao Ding
Haoning Sun
Private & Confidential
1 Storage Status
2 Storage Acceleration
3 Storage Servitization
4 Future Plan
Storage Status: Architecture
Product: App (Search, Recommendation, etc.)
Platform: Data Management Platform (DMP)
Compute Engine: Spark, Flink, Presto
Resource Scheduler: YARN
Storage: HDFS, Ozone
Storage Status: HDFS
Metric | Value
Number of Nodes | Thousands
Storage Capacity | Hundreds of PB
Number of Files | Billions
Max QPS | Hundreds of thousands
Storage Status: Presto
Metric | Value
Number of Nodes | Thousands of instances
TP90 | About 2 min
Input | Dozens of PB per day
Number of Queries | Hundreds of thousands per day
Storage Status: Accelerating Presto Queries
HDFS: unstable performance
Presto: unstable queries
Q: Can queries run faster?
Presto -> HDFS
Presto -> Alluxio (add cache) -> HDFS
Storage Status: Alluxio+Presto Typical Architecture
• Mount HDFS into Alluxio
• Presto accesses HDFS via Alluxio
• Alluxio manages the cache
Storage Status: Shortcomings
• Specific caching policies are needed
• The first read from Alluxio is slow, because the data is not cached yet
Storage Acceleration: Solution
Storage Acceleration: Architecture
Components: Kafka, HDFS Audit, HMS, Computing Application, Operator, Hot Table, Cache Manager, Alluxio, HDFS
Diagram labels: Load/Unload/mount; Path delete/create event; Set/Clear partition property; Load data; Load Table; Update Policy; Input; Output; Get tag
Storage Acceleration: Hot Table
Presto Query Log -> Hot Table (Hive Table)
• Partitioned by date
• Counts the number of visits to each table every day
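The daily visit count can be illustrated with a minimal sketch. The record layout (a `date` string plus a list of `tables`) and the name `count_daily_visits` are assumptions made for this example; the slides do not show the actual Presto query log schema or how the Hive table is populated.

```python
from collections import Counter, defaultdict


def count_daily_visits(query_log):
    """Return {date: Counter({table: visits})} from hypothetical log records."""
    visits = defaultdict(Counter)
    for record in query_log:
        # Count each table once per query, even if the query references it twice.
        for table in set(record["tables"]):
            visits[record["date"]][table] += 1
    return dict(visits)


# Hypothetical records; the real log format is not given in the slides.
sample_log = [
    {"date": "2022-01-01", "tables": ["db.orders", "db.users"]},
    {"date": "2022-01-01", "tables": ["db.orders"]},
    {"date": "2022-01-02", "tables": ["db.users"]},
]
daily = count_daily_visits(sample_log)
print(daily["2022-01-01"]["db.orders"])  # 2
```

Keying the result by date mirrors the date partitioning of the hot table, so each day's counts could be written as one partition.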
Storage Acceleration: Update Policy
• Scheduled every day
Hot Table -> tables with the highest weighted visit counts in the last seven days -> recent m partitions of every table -> load from HDFS to Alluxio -> persist relationship -> set tag in HMS
Note: This is an alpha version; subsequent iterations will be optimized continuously.
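A minimal sketch of the selection step follows. The slides say only that tables are weighted over the last seven days; the linear recency weights, the `top_k` cutoff, and the assumption that partition names sort by date are all illustrative choices, and the function names are hypothetical.

```python
def weighted_hot_tables(daily_visits, top_k, weights=None):
    """Rank tables by weighted visits.

    daily_visits: list of {table: visits} dicts, oldest day first.
    weights: per-day weights; defaults to linear recency weights 1..n,
             which is an assumption, not the policy from the slides.
    """
    days = daily_visits[-7:]  # keep at most the last seven days
    if weights is None:
        weights = list(range(1, len(days) + 1))
    scores = {}
    for weight, day in zip(weights, days):
        for table, visits in day.items():
            scores[table] = scores.get(table, 0) + weight * visits
    # Sort by score descending, then by name for a deterministic order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [table for table, _ in ranked[:top_k]]


def recent_partitions(partitions, m):
    """Pick the most recent m partitions, assuming names sort by date."""
    return sorted(partitions)[-m:] if m > 0 else []


history = [
    {"db.orders": 10, "db.users": 1},  # older day, weight 1
    {"db.orders": 1, "db.users": 8},   # most recent day, weight 2
]
hot = weighted_hot_tables(history, top_k=1)
to_load = recent_partitions(
    ["dt=2022-01-01", "dt=2022-01-03", "dt=2022-01-02"], m=2
)
print(hot, to_load)
```

With these weights, `db.users` scores 17 against 12 for `db.orders`, so recent activity outranks a larger but older spike.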
Storage Acceleration: HMS Tag
Components: HDFS, Alluxio, HMS, Presto
Diagram labels: On Alluxio; No tag
• key: cache, value: ${DC}/Alluxio/ebj@${Alluxio_nameservice}
• If the partition exists, set the property in the partition properties
• Otherwise, set the property in the table properties
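The tagging rule can be sketched as below. This is one reading of the rule, not the HMS client API: a plain dictionary stands in for Hive Metastore metadata, and `build_cache_tag` and `set_cache_tag` are hypothetical names. Only the `cache` key and the value pattern come from the slide.

```python
def build_cache_tag(dc, alluxio_nameservice):
    """Build the tag value following the pattern shown on the slide."""
    return f"{dc}/Alluxio/ebj@{alluxio_nameservice}"


def set_cache_tag(table, partition_name, tag_value):
    """Set the `cache` property on the partition if it exists, else on the table.

    `table` is a plain dict standing in for HMS metadata (an assumption):
    {"properties": {...}, "partitions": {name: {"properties": {...}}}}
    """
    partition = table["partitions"].get(partition_name)
    target = partition if partition is not None else table
    target["properties"]["cache"] = tag_value
    return "partition" if partition is not None else "table"


table = {"properties": {}, "partitions": {"dt=2022-01-01": {"properties": {}}}}
tag = build_cache_tag("dc1", "alluxio-ns")
level_a = set_cache_tag(table, "dt=2022-01-01", tag)  # partition exists
level_b = set_cache_tag(table, "dt=2099-01-01", tag)  # falls back to the table
print(level_a, level_b)
```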
Storage Acceleration: HMS Tag (Example)
Storage Acceleration: Kafka
Pipeline: HDFS Audit Log, Kafka topic, Flink (filter)
Format:
• cmd=xx src=xx dst=xx
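The filtering step can be illustrated in plain Python. This is a sketch rather than a Flink job: it assumes whitespace-separated `key=value` fields, as the format above suggests. The set of watched commands and the path-prefix check are assumptions for illustration.

```python
def parse_audit_line(line):
    """Split an audit line into a dict of key=value fields.

    Fields are assumed to be separated by whitespace, so values
    containing spaces are not handled by this sketch.
    """
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")  # split on the first '=' only
        if sep:
            fields[key] = value
    return fields


# Commands treated as path-changing events; this set is an assumption.
WATCHED_COMMANDS = {"create", "delete", "rename"}


def is_cache_relevant(fields, watched_prefixes):
    """Keep only watched commands that touch a watched path prefix."""
    if fields.get("cmd") not in WATCHED_COMMANDS:
        return False
    src = fields.get("src", "")
    return any(src.startswith(prefix) for prefix in watched_prefixes)


line = "cmd=delete src=/warehouse/db/orders/dt=2022-01-01 dst=null"
event = parse_audit_line(line)
keep = is_cache_relevant(event, ["/warehouse/db/orders"])
print(event["cmd"], keep)
```

Splitting on the first `=` only keeps partition paths such as `dt=2022-01-01` intact inside the `src` value.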
Storage Acceleration: REST API
PATH
• mount
• unmount
• load
• query
HIVE TABLE
• mount
• unmount
• load: load all the recorded paths, or load the recent n partitions
• query
ADMIN
• monitor and operate
Storage Acceleration: Perf Effect
Storage Acceleration: Community Contribution
• 6 merged, 2 WIP, 1 fixed by Alluxio.
TYPE | PR | STATUS
Hadoop 2.10 | Fix HdfsVersion miss hadoop 2.10 config | merged
Hadoop 2.10 | Fix integration/yarn/pom.xml enforcer-plugin miss hadoop 2.10.x config | merged
Hadoop 2.10 | Fix common.go miss hadoop 2.10 configuration | merged
Command Line | Improve shell command support ebj nameservice | merged
Command Line | Fix for Alluxio.logs.dir | fixed by Alluxio
Web Page | Fix isMounted should not invoke ufs, otherwise the /metrics page is very slow | merged
Web Page | Fix FormatUtils.getSizeFromBytes method should support EB | merged
NameServices | Fix unescape the ufs url of Alluxio fsadmin report metrics result | WIP
Metrics | Fix cache ratio total not include cacheMisses | WIP
Storage Servitization: Status
▪ Most data is stored in HDFS
▪ Various development languages are used
▪ HDFS has insufficient support for non-Java clients
Storage Servitization: Solutions
Fuse for HDFS
▪ Deploy the Alluxio Fuse service on physical machines
▪ Deploy the Alluxio Fuse service on a Kubernetes cluster
S3 for HDFS
▪ Use the S3 API to access the Alluxio proxy service
Storage Servitization: Fuse
WHAT IS IT
▪ Filesystem in Userspace
High-Level Architecture
▪ Kernel
▪ User-level daemon
Storage Servitization: Alluxio Fuse
Requirements
▪ libfuse
Implementation
▪ JNR-Fuse
▪ JNI-Fuse
Deployment
▪ Standalone Fuse
▪ Fuse on Workers
Limitations
▪ Random writes are not supported
Storage Servitization: Alluxio CSI
WHAT IS IT
▪ Standard storage interface for containers
Fuse Deployment Mode
▪ On the nodeserver pod
▪ On a separate pod (new feature)
Storage Servitization: k8s sidecar for Alluxio
WHAT IS IT
▪ A Fuse sidecar container in a Pod mounts the Alluxio directory
Features
▪ Pods are configured independently, which gives high flexibility
▪ Each Pod runs its own Fuse container, so Pods do not affect each other
▪ Each Fuse process occupies a container, so this solution consumes more resources
Storage Servitization: Summary
Deployment | Fuse on | K8s-csi (Fuse on nodeserver pod) | K8s-csi (Fuse on separate pod) | K8s-sidecar
maintenance cost | high | low | higher | higher
resource usage | low | lower | high | high
independence | high | low | high | high
Storage Servitization: S3
Amazon S3 Concepts: Buckets, Objects, Keys, Regions
▪ Bucket: A bucket is a container for objects stored in Amazon S3.
▪ Object: Objects are the fundamental entities stored in Amazon S3.
▪ Key: An object key (or key name) is the unique identifier for an object within a bucket.
▪ Region: You can choose the region in which your buckets are stored.
Storage Servitization: S3 for HDFS
Access HDFS data via Alluxio using the S3 protocol
▪ Alluxio can mount HDFS data
▪ Alluxio provides a Proxy service
▪ The Proxy is compatible with the basic operations of the S3 API
▪ S3 SDKs support many development languages
Storage Servitization: Alluxio Proxy for S3 mapping
▪ The first-level directory maps to the bucket
▪ Subdirectories and file paths map to the key
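The mapping rule can be sketched as a pair of helper functions. The names `alluxio_path_to_s3` and `s3_to_alluxio_path` are hypothetical and are not part of the Alluxio proxy; the sketch only illustrates the first-level-directory-as-bucket convention described above.

```python
def alluxio_path_to_s3(path):
    """Map '/bucket/dir/file' to ('bucket', 'dir/file')."""
    parts = path.strip("/").split("/", 1)
    if not parts[0]:
        raise ValueError("path must contain at least a first-level directory")
    bucket = parts[0]
    # Everything below the first-level directory becomes the object key.
    key = parts[1] if len(parts) > 1 else ""
    return bucket, key


def s3_to_alluxio_path(bucket, key):
    """Inverse mapping: rebuild the file system path from bucket and key."""
    return "/" + bucket + ("/" + key if key else "")


bucket, key = alluxio_path_to_s3("/warehouse/db/orders/part-0.parquet")
print(bucket, key)  # warehouse db/orders/part-0.parquet
```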
Storage Servitization: Proxy Authentication
▪ Authentication parser
▪ Validator
▪ Secret Manager
▪ Signature Calculation
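The four components can be sketched in a few functions, assuming the proxy validates AWS Signature Version 4 requests. The slides do not name the signing scheme, so this is an assumption. The dictionary used as a secret manager and the placeholder string to sign are illustrative only, and building the canonical request is omitted.

```python
import hashlib
import hmac


def parse_authorization(header):
    """Parser: split a Signature Version 4 Authorization header into parts."""
    algorithm, _, rest = header.partition(" ")
    if algorithm != "AWS4-HMAC-SHA256":
        raise ValueError("unsupported algorithm: " + algorithm)
    fields = {}
    for item in rest.split(","):
        key, _, value = item.strip().partition("=")
        fields[key] = value
    access_key, date, region, service, _ = fields["Credential"].split("/")
    return {
        "access_key": access_key,
        "date": date,
        "region": region,
        "service": service,
        "signed_headers": fields["SignedHeaders"].split(";"),
        "signature": fields["Signature"],
    }


def _hmac_sha256(key, message):
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key, date, region, service):
    """Signature calculation: the SigV4 HMAC chain for the signing key."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def signature_matches(secret_key, parsed, string_to_sign):
    """Validator: recompute the signature and compare in constant time."""
    key = derive_signing_key(
        secret_key, parsed["date"], parsed["region"], parsed["service"]
    )
    expected = hmac.new(
        key, string_to_sign.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, parsed["signature"])


# Stand-in for a secret manager; keys and secrets here are made up.
secrets = {"AKIDEXAMPLE": "example-secret"}
string_to_sign = "placeholder-string-to-sign"  # canonical request omitted

# Build a header the way a client would, then validate it.
client_key = derive_signing_key(secrets["AKIDEXAMPLE"], "20220101", "us-east-1", "s3")
client_sig = hmac.new(client_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()
header = (
    "AWS4-HMAC-SHA256 Credential=AKIDEXAMPLE/20220101/us-east-1/s3/aws4_request, "
    "SignedHeaders=host;x-amz-date, Signature=" + client_sig
)
parsed = parse_authorization(header)
ok = signature_matches(secrets[parsed["access_key"]], parsed, string_to_sign)
print(ok)  # True
```

Looking up the secret by the parsed access key and recomputing the signature on the server side means the secret itself never travels with the request.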
Storage Servitization: Service Architecture
Storage Servitization: Community Contribution
TYPE | PR | STATUS
proxy | Fix wrong format of s3 bucket creationDate | merged
proxy | Support parse authorization headers for s3 proxy | WIP
fuse | Fix wrong method call to get username and wrong parameter assignment | merged
csi | Replace invalid env with args in nodeserver | merged
doc | Fix bug case of S3 REST API | merged
doc | Fix wrong file name in k8s doc | merged
doc | Fix ambiguous description for impersonation in CN doc | merged
ozone | Update ozone from 1.1.0 to 1.2.1 | closed
▪ 6 merged, 1 WIP, 1 closed.
Future Plan
Storage speed up
▪ Speed up Spark and Hive
▪ Implement adaptive cache policy on CacheManager
Storage Service
▪ Support more POSIX APIs
▪ Optimize CSI
Thank You
AI Infra Day | The AI Infra in the Generative AI EraAI Infra Day | The AI Infra in the Generative AI Era
AI Infra Day | The AI Infra in the Generative AI Era
 
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...
 
AI Infra Day | The Generative AI Market And Intel AI Strategy and Product Up...
AI Infra Day | The Generative AI Market  And Intel AI Strategy and Product Up...AI Infra Day | The Generative AI Market  And Intel AI Strategy and Product Up...
AI Infra Day | The Generative AI Market And Intel AI Strategy and Product Up...
 
AI Infra Day | Composable PyTorch Distributed with PT2 @ Meta
AI Infra Day | Composable PyTorch Distributed with PT2 @ MetaAI Infra Day | Composable PyTorch Distributed with PT2 @ Meta
AI Infra Day | Composable PyTorch Distributed with PT2 @ Meta
 
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber Scale
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber ScaleAI Infra Day | Model Lifecycle Management Quality Assurance at Uber Scale
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber Scale
 
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWS
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWSAlluxio Monthly Webinar | Efficient Data Loading for Model Training on AWS
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWS
 


The Power of Data Orchestration: Storage Acceleration and Servitization at Shopee

  • 1. The Power of Data Orchestration: Storage Acceleration and Servitization at Shopee. Tianbao Ding, Haoning Sun
  • 2. Storage Acceleration and Servitization at Shopee. Outline: 1. Storage Status; 2. Storage Acceleration; 3. Storage Servitization; 4. Future Plan
  • 3. Storage Status: Architecture. Product: App (Search, Recommendation, etc.). Platform: Data Management Platform (DMP). Compute Engine: Spark, Flink, Presto. Resource Scheduler: Yarn. Storage: HDFS, Ozone
  • 4. Storage Status: HDFS. Number of nodes: thousands. Storage capacity: hundreds of PB. Number of files: billions. Max QPS: hundreds of thousands
  • 5. Storage Status: Presto. Number of nodes: thousands of instances. TP90: about 2 min. Input: dozens of PB per day. Number of queries: hundreds of thousands per day
  • 6. Storage Status: Presto Accelerate Query. HDFS: unstable performance. Presto: unstable queries. Q: Can queries run faster?
  • 7. Storage Status: Presto Accelerate Query. Before: Presto reads from HDFS. After adding a cache: Presto reads through Alluxio, backed by HDFS
  • 8. Storage Status: Alluxio + Presto Typical Architecture • Mount HDFS into Alluxio • Presto accesses HDFS via Alluxio • Alluxio manages the cache
  • 9. Storage Status: Shortcomings • Specific caching policies are needed • The first read from Alluxio is slow
  • 10. Storage Acceleration and Servitization at Shopee. Outline: 1. Storage Status; 2. Storage Acceleration; 3. Storage Servitization; 4. Future Plan
  • 11. Storage Acceleration: Solution
  • 12. Storage Acceleration: Architecture. Components: Kafka, HDFS Audit, HMS, Computing Application, Operator, Hot Table, Cache Manager, Alluxio, HDFS. Diagram labels: Load/Unload/Mount; Path delete/create event; Set/Clear partition property; Load data; Load Table; Update Policy; Input; Output; Get tag
  • 13. Storage Acceleration: Hot Table. Source: Presto query log. Hot Table (Hive table) • Partitioned by date • Calculates the number of visits to each table every day
  • 14. Storage Acceleration: Update Policy • Scheduled every day • From the Hot Table, take the most frequently visited tables, weighted, over the last seven days • Take the most recent m partitions of every table • Load them from HDFS to Alluxio • Persist the relationship • Set the tag in HMS. Note: this is an alpha version; subsequent iterations will be optimized continuously
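The daily update policy on slide 14 can be sketched roughly as follows. This is a minimal illustration, not the Cache Manager's actual code: the slides do not give the weighting formula, so the linear recency weights, the function names, and the sample table names are all assumptions.

```python
from collections import defaultdict
from datetime import date, timedelta


def select_hot_tables(daily_visits, today, top_k=2, window_days=7):
    """Rank tables by a recency-weighted visit count over the last window_days days.

    daily_visits maps (table_name, day) -> number of visits on that day.
    The linear weighting is an assumption; the slides only say "weighted".
    """
    scores = defaultdict(float)
    for (table, day), visits in daily_visits.items():
        age = (today - day).days
        if 0 <= age < window_days:
            # Weight 1.0 for today, falling linearly to 1/window_days.
            weight = (window_days - age) / window_days
            scores[table] += weight * visits
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [table for table, _ in ranked[:top_k]]


def recent_partitions(partitions, m):
    """Return the m most recent date partitions (ISO dates sort lexicographically)."""
    return sorted(partitions, reverse=True)[:m]


# Hypothetical sample data.
today = date(2024, 1, 8)
visits = {
    ("db.orders", today): 10,
    ("db.orders", today - timedelta(days=6)): 70,
    ("db.users", today): 15,
    ("db.logs", today - timedelta(days=7)): 1000,  # outside the window, ignored
}
hot = select_hot_tables(visits, today, top_k=2)
to_load = recent_partitions(["dt=2024-01-06", "dt=2024-01-08", "dt=2024-01-07"], 2)
```

The selected tables and partitions would then be loaded from HDFS into Alluxio and tagged in HMS, as the slide describes.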
  • 15. Storage Acceleration: HMS Tag. Presto reads from Alluxio when the tag is present ("On Alluxio") and from HDFS when there is no tag • key: cache, value: ${DC}/Alluxio/ebj@${Alluxio_nameservice} • If the partition exists, set the property in the partition properties • Otherwise, set the property in the table properties
  • 16. Storage Acceleration: HMS Tag. Example (figure not included in this transcript)
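The tagging rule on slide 15 can be illustrated with a small sketch. It models HMS metadata as plain dictionaries rather than calling any real metastore API; the dictionary shape, the function names, and the sample tag value (one hypothetical instantiation of the template on the slide) are assumptions made for illustration.

```python
CACHE_KEY = "cache"  # tag key from slide 15


def set_cache_tag(table, partition_name, tag_value):
    """Set the tag on the partition if it exists, otherwise on the table.

    `table` is a dict standing in for HMS metadata (hypothetical shape):
    {"properties": {...}, "partitions": {name: {"properties": {...}}}}
    """
    partition = table["partitions"].get(partition_name)
    target = partition if partition is not None else table
    target["properties"][CACHE_KEY] = tag_value
    return "partition" if partition is not None else "table"


def read_source(table, partition_name):
    """Return "alluxio" when the relevant properties carry the tag, else "hdfs"."""
    partition = table["partitions"].get(partition_name)
    props = (partition if partition is not None else table)["properties"]
    return "alluxio" if CACHE_KEY in props else "hdfs"


# Hypothetical table with two date partitions.
orders = {
    "properties": {},
    "partitions": {
        "dt=2024-01-08": {"properties": {}},
        "dt=2024-01-07": {"properties": {}},
    },
}
where = set_cache_tag(orders, "dt=2024-01-08", "dc1/Alluxio/ebj@ns1")
```

Tagging per partition keeps the decision local: only the partitions that were actually loaded are read through Alluxio, and everything else continues to go to HDFS.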
  • 17. Storage Acceleration: Kafka. Diagram labels: HDFS Audit Log, Flink, filter, Kafka topic. Format: • cmd=xx src=xx dst=xx
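The filtering step on slide 17 can be sketched in plain Python. The deck uses Flink and Kafka for this; the code below only illustrates the per-record logic. It assumes tab-separated key=value audit lines containing the cmd, src and dst fields named on the slide, and it assumes that only create and delete commands under cached paths are of interest. The function names and sample paths are hypothetical.

```python
import re

# Extract the three fields named on the slide. Values are assumed to contain
# no whitespace, which does not hold for paths with spaces.
FIELD_RE = re.compile(r"\b(cmd|src|dst)=(\S+)")
WATCHED_CMDS = {"create", "delete"}  # assumption, based on "Path delete/create event"


def parse_audit_line(line):
    """Return {"cmd": ..., "src": ..., "dst": ...} or None if a field is missing."""
    fields = dict(FIELD_RE.findall(line))
    if not {"cmd", "src", "dst"} <= fields.keys():
        return None
    return fields


def is_under(path, prefix):
    """True if path equals prefix or lies below it in the directory tree."""
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def filter_events(lines, cached_prefixes):
    """Keep (cmd, src) pairs for watched commands that touch a cached path."""
    events = []
    for line in lines:
        fields = parse_audit_line(line)
        if fields is None or fields["cmd"] not in WATCHED_CMDS:
            continue
        if any(is_under(fields["src"], p) for p in cached_prefixes):
            events.append((fields["cmd"], fields["src"]))
    return events


# Hypothetical audit lines.
sample = [
    "allowed=true\tugi=etl\tip=/10.0.0.1\tcmd=delete\tsrc=/warehouse/db/orders/dt=2024-01-01\tdst=null",
    "allowed=true\tugi=etl\tip=/10.0.0.1\tcmd=open\tsrc=/warehouse/db/orders/dt=2024-01-02\tdst=null",
    "allowed=true\tugi=etl\tip=/10.0.0.1\tcmd=create\tsrc=/tmp/scratch/file\tdst=null",
]
events = filter_events(sample, ["/warehouse/db/orders"])
```

Each surviving event tells the cache layer that a cached path changed in HDFS, so the corresponding data in Alluxio can be refreshed or dropped.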
  • 18. Storage Acceleration: REST API. PATH: mount, unmount, load, query. HIVE TABLE: mount, unmount, load (load all the recorded paths; load the most recent n partitions), query. ADMIN: monitor and operator
  • 19. Storage Acceleration: Perf Effect
  • 20. Storage Acceleration: Community Contribution • 6 merged, 2 WIP, 1 fixed by Alluxio. TYPE: PR (STATUS). Hadoop 2.10: Fix HdfsVersion missing hadoop 2.10 config (merged); Fix integration/yarn/pom.xml enforcer-plugin missing hadoop 2.10.x config (merged); Fix common.go missing hadoop 2.10 configuration (merged). Command Line: Improve shell command to support ebj nameservice (merged); Fix for Alluxio.logs.dir (fixed by Alluxio). Web Page: Fix isMounted so that it does not invoke ufs, otherwise the /metrics page is very slow (merged); Fix FormatUtils.getSizeFromBytes method to support EB (merged). NameServices: Fix unescape the ufs url of Alluxio fsadmin report metrics result (WIP). Metrics: Fix cache ratio total not including cacheMisses (WIP)
  • 21. Storage Acceleration and Servitization at Shopee. Outline: 1. Storage Situation; 2. Storage Acceleration; 3. Storage Servitization; 4. Future Plan
  • 22. Storage Servitization: Status ▪ Most data is stored in HDFS ▪ Various development languages are used ▪ HDFS has insufficient support for non-Java clients
  • 23. Storage Servitization: Solutions. Fuse for HDFS ▪ Deploy the Alluxio Fuse service on physical machines ▪ Deploy the Alluxio Fuse service on a Kubernetes cluster. S3 for HDFS ▪ Use the S3 API to access the Alluxio proxy service
  • 24. Storage Servitization: Fuse. What is it ▪ Filesystem in Userspace. High-level architecture ▪ Kernel ▪ User-level daemon
  • 25. Storage Servitization: Alluxio Fuse. Requirements ▪ libfuse. Implementation ▪ JNR-Fuse ▪ JNI-Fuse. Deployment ▪ Standalone Fuse ▪ Fuse on Workers. Limitations ▪ Random writes are not supported
  • 26. Storage Servitization: Alluxio CSI. What is it ▪ Standard storage interface for containers. Fuse deployment mode ▪ On the nodeserver pod ▪ On a separate pod (new feature)
  • 27. Storage Servitization: k8s sidecar for Alluxio. What is it ▪ A Fuse sidecar container in a Pod mounts the Alluxio directory. Features ▪ Pods are configured independently, giving high flexibility ▪ Each Pod runs its own Fuse container, so Pods do not affect each other ▪ Each Fuse process occupies a container, so the solution consumes more resources
  • 28. Storage Servitization: Summary. Columns, in order: Fuse on; K8s-csi, Fuse on nodeserver pod; K8s-csi, Fuse on separate pod; K8s-sidecar. Maintenance cost: high, low, higher, higher. Resource usage: low, lower, high, high. Independence: high, low, high, high
  • 29. Storage Servitization: S3. Amazon S3 concepts: buckets, objects, keys, regions ▪ Bucket: a container for objects stored in Amazon S3 ▪ Object: the fundamental entity stored in Amazon S3 ▪ Key: an object key (or key name) is the unique identifier for an object within a bucket ▪ Region: you can choose the region in which created buckets are stored
  • 30. Storage Servitization: S3 for HDFS. Access HDFS data via Alluxio using the S3 protocol ▪ Alluxio can mount HDFS data ▪ Alluxio provides a proxy service ▪ The proxy is compatible with the basic operations of the S3 API ▪ S3 SDKs support many development languages
  • 31. Storage Servitization: Alluxio Proxy for S3 mapping ▪ The first-level directory maps to the bucket ▪ Subdirectories and file paths map to the key
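The path mapping on slide 31 can be written down in a few lines. This is an illustrative sketch of the rule as stated on the slide, not the proxy's code; the function names and the sample path are hypothetical.

```python
def path_to_bucket_key(alluxio_path):
    """Split a path into (bucket, key): first-level directory, then the rest."""
    parts = [p for p in alluxio_path.split("/") if p]
    if not parts:
        raise ValueError("the root directory does not map to a bucket")
    return parts[0], "/".join(parts[1:])


def bucket_key_to_path(bucket, key):
    """Inverse mapping: rebuild the absolute path from bucket and key."""
    return "/" + bucket + ("/" + key if key else "")


bucket, key = path_to_bucket_key("/warehouse/db/orders/part-0.parquet")
round_trip = bucket_key_to_path(bucket, key)
```

With this mapping, an S3 client asking for the object with that key in that bucket is served the file at the corresponding path under the Alluxio namespace, which in turn is backed by the mounted HDFS data.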
  • 32. Storage Servitization: Proxy Authentication ▪ Authentication parser ▪ Validator ▪ Secret Manager ▪ Signature Calculation
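The four components on slide 32 can be sketched as follows, assuming the proxy validates AWS Signature Version 4 style requests; the slides do not say which signature version is used. The dictionary standing in for the Secret Manager, the function names, and the sample credentials are hypothetical, and building the string to sign from the canonical request is left out.

```python
import hashlib
import hmac


def _hmac(key, msg):
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def parse_authorization(header):
    """Authentication parser: split a SigV4 Authorization header into parts."""
    scheme, _, rest = header.partition(" ")
    if scheme != "AWS4-HMAC-SHA256":
        raise ValueError("unsupported scheme")
    fields = dict(item.strip().split("=", 1) for item in rest.split(","))
    access_key, day, region, service, _ = fields["Credential"].split("/")
    return {
        "access_key": access_key,
        "date": day,
        "region": region,
        "service": service,
        "signed_headers": fields["SignedHeaders"].split(";"),
        "signature": fields["Signature"],
    }


def sign(secret, day, region, service, string_to_sign):
    """Signature calculation: derive the SigV4 signing key, then sign."""
    key = _hmac(("AWS4" + secret).encode("utf-8"), day)
    key = _hmac(key, region)
    key = _hmac(key, service)
    key = _hmac(key, "aws4_request")
    return hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


def validate(header, secrets, string_to_sign):
    """Validator: recompute the signature with the stored secret and compare."""
    auth = parse_authorization(header)
    secret = secrets.get(auth["access_key"])  # Secret Manager lookup (a dict here)
    if secret is None:
        return False
    expected = sign(secret, auth["date"], auth["region"], auth["service"], string_to_sign)
    return hmac.compare_digest(expected, auth["signature"])


# Hypothetical credentials and request.
SECRETS = {"AKIDEXAMPLE": "not-a-real-secret"}
signature = sign("not-a-real-secret", "20240108", "us-east-1", "s3", "example-string-to-sign")
header = (
    "AWS4-HMAC-SHA256 Credential=AKIDEXAMPLE/20240108/us-east-1/s3/aws4_request, "
    "SignedHeaders=host;x-amz-date, Signature=" + signature
)
accepted = validate(header, SECRETS, "example-string-to-sign")
```

The comparison uses hmac.compare_digest so that the check takes the same time whether or not the signatures match.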
  • 33. Storage Servitization: Service Architecture
  • 34. Storage Servitization: Community Contribution ▪ 6 merged, 1 WIP, 1 closed. TYPE: PR (STATUS). proxy: Fix wrong format of s3 bucket creationDate (merged); Support parsing authorization headers for s3 proxy (WIP). fuse: Fix wrong method call to get username and wrong parameter assignment (merged). csi: Replace invalid env with args in nodeserver (merged). doc: Fix bug case of S3 REST API (merged); Fix wrong file name in k8s doc (merged); Fix ambiguous description for impersonation in CN doc (merged). ozone: Update ozone from 1.1.0 to 1.2.1 (closed)
  • 35. Storage Acceleration and Servitization at Shopee. Outline: 1. Storage Situation; 2. Storage Acceleration; 3. Storage Servitization; 4. Future Plan
  • 36. Future Plan. Storage speed up ▪ Speed up Spark and Hive ▪ Implement an adaptive cache policy in CacheManager. Storage Service ▪ Support more POSIX APIs ▪ Optimize CSI
  • 37. Thank You. Storage Acceleration and Servitization at Shopee