The Power of Data Orchestration:
Storage Acceleration and Servitization at Shopee
Private &
Confidential 2
1 Storage Status
2 Storage Acceleration
3 Storage Servitization
Storage Status—Architecture
Layers, top to bottom:
• Product: App (Search, Recommendation, etc.)
• Platform: Data Management Platform (DMP)
• Compute Engine: Spark, Flink, Presto
• Resource Scheduler: Yarn
• Storage: HDFS, Ozone
Storage Status—HDFS & Presto
• HDFS: unstable performance
• Presto: unstable queries
Q: Can queries run faster?
Storage Status—Presto Accelerate Query
• Before: Presto → HDFS
• After adding a cache: Presto → Alluxio → HDFS
Storage Status—Alluxio+Presto Typical Architecture
• HDFS is mounted into Alluxio
• Presto accesses HDFS via Alluxio
• Alluxio manages the cache
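The read path in this typical architecture behaves like a read-through cache. The sketch below is a toy illustration only: `ReadThroughCache`, the dictionary standing in for HDFS, and the sample path are hypothetical and are not Alluxio APIs.

```python
class ReadThroughCache:
    """Toy stand-in for a cache layer between a query engine and HDFS."""

    def __init__(self, under_store):
        self.under_store = under_store  # dict of path -> bytes, stands in for HDFS
        self.cache = {}                 # data already pulled into the cache layer
        self.hits = 0
        self.misses = 0

    def read(self, path):
        # Serve from the cache when possible.
        if path in self.cache:
            self.hits += 1
            return self.cache[path]
        # Otherwise fetch from the under store and keep a copy for next time.
        self.misses += 1
        data = self.under_store[path]
        self.cache[path] = data
        return data


hdfs = {"/a/b/date=2022-09-21/part-0": b"rows"}  # hypothetical file
cache = ReadThroughCache(hdfs)
first = cache.read("/a/b/date=2022-09-21/part-0")   # miss: goes to the under store
second = cache.read("/a/b/date=2022-09-21/part-0")  # hit: served from the cache
```

Only the second read avoids the under store, which is why which data gets cached, and when, matters for query latency.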
Storage Status—Shortcomings
• Need specific caching policies
Storage Acceleration—Solution
Storage Acceleration—Architecture
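[Architecture diagram; recoverable labels]
• Components: Kafka (HDFS audit log), HMS, Computing Application, Operator, Cache Manager, Alluxio, HDFS
• Input to Cache Manager: Load/Unload/mount (Operator); Hot Partitions; Path events (rename, delete, create, append, truncate, concat, etc.)
• Output of Cache Manager: Load data; Set/Clear partition property
• Other labels: Load Partitions, Update Policy, Get tag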
• Presto calls the load/free API
• Cache Manager loads a new partition when its path matches the path pattern
• A scheduled task cleans up expired partitions
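The load, free, and scheduled cleanup steps can be sketched as follows. This is a minimal sketch, not Shopee's Cache Manager: the class, its method names, and the time-to-live expiry rule are all assumptions made for illustration.

```python
import time


class CacheManager:
    """Hypothetical bookkeeping for loaded partitions."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds  # assumed expiry rule: age since load
        self.loaded = {}                # partition path -> load timestamp

    def load(self, path, now=None):
        # Record that a partition was loaded into the cache.
        self.loaded[path] = time.time() if now is None else now

    def free(self, path):
        # Forget a partition; freeing an unknown path is a no-op.
        self.loaded.pop(path, None)

    def cleanup_expired(self, now=None):
        # What a scheduled task would run: free partitions older than the TTL.
        now = time.time() if now is None else now
        expired = [p for p, t in self.loaded.items() if now - t > self.ttl_seconds]
        for path in expired:
            self.free(path)
        return expired


day = 86400
manager = CacheManager(ttl_seconds=3 * day)
manager.load("/a/b/date=2022-09-18", now=0)
manager.load("/a/b/date=2022-09-21", now=3 * day)
expired = manager.cleanup_expired(now=3 * day + 1)
```

Timestamps are passed in explicitly so the behaviour is easy to check; a real scheduled task would use the current time.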
Storage Acceleration—Update Policy
New partition events from the HDFS audit log are matched against the path pattern.
Path pattern: /a/b/date={}
• /a/b/date=2022-09-21 matches: load to Alluxio
• /a/c/date=2022-09-21 does not match: not loaded to Alluxio
Existing partitions: .../date=2022-09-18, .../date=2022-09-19, .../date=2022-09-20
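The path match can be sketched with a regular expression. The slide does not say how the pattern is evaluated, so treating `{}` as "exactly one path segment" is an assumption, and the function names are hypothetical.

```python
import re


def compile_path_pattern(pattern):
    # Escape the literal parts and let each "{}" match one path segment.
    parts = [re.escape(part) for part in pattern.split("{}")]
    return re.compile("([^/]+)".join(parts))


def should_load(pattern, path):
    # A partition is loaded only when the whole path matches the pattern.
    return compile_path_pattern(pattern).fullmatch(path) is not None


pattern = "/a/b/date={}"
matched = should_load(pattern, "/a/b/date=2022-09-21")
not_matched = should_load(pattern, "/a/c/date=2022-09-21")
```

Using `fullmatch` means a file inside a partition directory does not count as a new partition on its own.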
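Presto looks up the tag in HMS: a partition tagged as on Alluxio is read from Alluxio; a partition with no tag is read from HDFS.
• Tag key: cache; value: ${DC}/Alluxio/ebj@${Alluxio_nameservice}
• If the partition exists, the property is set in the partition properties
• Otherwise, the property is set in the table properties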
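The tagging rule and the read-side lookup can be sketched with plain dictionaries standing in for HMS metadata. The table layout, function names, and tag value below are hypothetical, and the fallback from partition properties to table properties on read is an assumption; no real Hive Metastore client is used.

```python
CACHE_KEY = "cache"


def set_cache_tag(table, partition, value):
    # Tag the partition if it exists, otherwise fall back to the table.
    partitions = table["partitions"]
    if partition in partitions:
        partitions[partition]["properties"][CACHE_KEY] = value
        return "partition"
    table["properties"][CACHE_KEY] = value
    return "table"


def get_cache_tag(table, partition):
    # Assumed lookup order: partition properties first, then table properties.
    part = table["partitions"].get(partition)
    if part is not None and CACHE_KEY in part["properties"]:
        return part["properties"][CACHE_KEY]
    return table["properties"].get(CACHE_KEY)


def choose_storage(table, partition):
    # Read from Alluxio only when a tag is present.
    return "alluxio" if get_cache_tag(table, partition) else "hdfs"


table = {
    "properties": {},
    "partitions": {
        "date=2022-09-20": {"properties": {}},
        "date=2022-09-21": {"properties": {}},
    },
}
where = set_cache_tag(table, "date=2022-09-21", "dc1/Alluxio/ebj@ns1")  # made-up value
hot = choose_storage(table, "date=2022-09-21")
cold = choose_storage(table, "date=2022-09-20")
```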
Storage Acceleration—HMS Tag
● Subscribe to HDFS audit log events
1. Covers rename, delete, create, append, truncate, concat, and similar events
2. Flink filtering reduces the number of messages
● A scheduled task checks and repairs consistency
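The filtering step can be sketched as a plain function; in the deck this runs in Flink, which is not reproduced here. The sketch assumes the usual tab-separated `key=value` HDFS audit log layout after `FSNamesystem.audit: `; the sample line, the watched prefix, and the function names are hypothetical.

```python
MUTATING_CMDS = {"rename", "delete", "create", "append", "truncate", "concat"}


def parse_audit_line(line):
    # Keep only the key=value payload that follows the logger name.
    _, _, payload = line.partition("FSNamesystem.audit: ")
    fields = {}
    for item in payload.strip().split("\t"):
        # Split on the first "=" only, since paths may contain "=".
        key, sep, value = item.partition("=")
        if sep:
            fields[key] = value
    return fields


def keep_event(fields, watched_prefixes):
    # Drop denied operations, reads, and paths outside the watched prefixes.
    if fields.get("allowed") != "true":
        return False
    if fields.get("cmd") not in MUTATING_CMDS:
        return False
    paths = [fields.get("src", ""), fields.get("dst", "")]
    return any(p.startswith(prefix) for p in paths for prefix in watched_prefixes)


line = (
    "2022-09-21 10:00:00,123 INFO FSNamesystem.audit: allowed=true\tugi=etl\t"
    "ip=/10.0.0.1\tcmd=rename\tsrc=/a/b/date=2022-09-21/_tmp\t"
    "dst=/a/b/date=2022-09-21/part-0\tperm=etl:hadoop:rw-r--r--"
)
fields = parse_audit_line(line)
kept = keep_event(fields, ["/a/b/"])
read_only = keep_event({"allowed": "true", "cmd": "open", "src": "/a/b/x"}, ["/a/b/"])
```

Filtering before the events reach the Cache Manager keeps read traffic, which dominates audit logs, out of the downstream topic.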
Storage Acceleration—Consistency
HDFS Audit Log → Flink → a new Kafka topic → Cache Manager
Storage Acceleration—Perf Effect
• 8 merged, 1 WIP, 1 fixed by Alluxio.
TYPE | PR | STATUS
core | Fix master down when master change to leader | merged
Hadoop 2.10 | Fix HdfsVersion miss hadoop 2.10 config | merged
Hadoop 2.10 | Fix integration/yarn/pom.xml enforcer-plugin miss hadoop 2.10.x config | merged
Hadoop 2.10 | Fix common.go miss hadoop 2.10 configuration | merged
Command Line | Improve shell command support ebj nameservice | merged
Command Line | Fix for Alluxio.logs.dir | fixed by Alluxio
Command Line | Modify the meaning of variables more clearly | merged
Web Page | Fix isMounted should not invoke ufs, if not /metrics page very slowly | merged
Web Page | Fix FormatUtils.getSizeFromBytes method should supports EB | merged
NameServices | Fix unescape the ufs url of Alluxio fsadmin report metrics result | WIP
Storage Acceleration—Community Contribution
Storage Servitization—Status
▪ Most data is stored in HDFS
▪ Various development languages are used
▪ HDFS has insufficient support for non-Java clients
▪ Many applications need to access data as a service rather than as a local disk
Fuse for HDFS
▪ Alluxio Fuse service on physical machines
▪ Alluxio Fuse service on a Kubernetes cluster
S3 for HDFS
▪ Access HDFS through the S3 API via the Alluxio proxy service
Storage Servitization—Solutions
▪ Bucket: A bucket is a container for objects stored in Amazon S3
▪ Object: Objects are the fundamental entities stored in Amazon S3
▪ Key: An object key (or key name) is the unique identifier for an
object within a bucket.
▪ Region: You can choose a region to store the created buckets
Storage Servitization—S3
▪ Alluxio can mount HDFS data
▪ Alluxio provides Proxy service
▪ Proxy is compatible with the basic operations of the S3 API
▪ S3 SDK supports many development languages
Storage Servitization—S3 for HDFS
Access HDFS data via Alluxio using S3 protocol
▪ The first-level directory maps to the bucket
▪ Subdirectories and the file path map to the key
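The mapping can be sketched as a pair of helper functions. The function names and the sample path are hypothetical; only the rule itself (first-level directory as bucket, the rest as key) comes from the slide.

```python
def alluxio_path_to_s3(path):
    # Split "/bucket/sub/dir/file" into ("bucket", "sub/dir/file").
    parts = [p for p in path.split("/") if p]
    if not parts:
        raise ValueError("the root directory has no bucket")
    return parts[0], "/".join(parts[1:])


def s3_to_alluxio_path(bucket, key):
    # Inverse mapping: the bucket becomes the first-level directory.
    return "/" + bucket + ("/" + key if key else "")


bucket, key = alluxio_path_to_s3("/a/b/date=2022-09-21/part-0")
path = s3_to_alluxio_path(bucket, key)
```

A first-level directory on its own maps to a bucket with an empty key, which is why the root path is rejected.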
Storage Servitization—Alluxio Proxy for S3 mapping
Storage Servitization—Proxy Authentication
▪ Authentication parser
▪ Validator
▪ Secret Manager
▪ Signature Calculation
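One way the four pieces could fit together is sketched below, assuming the proxy validates AWS Signature Version 4 style `Authorization` headers; the slide does not name the scheme, so that is an assumption. The function names, the dictionary used as a secret manager, and the demo credentials are hypothetical, and building the string to sign from the canonical request is omitted.

```python
import hashlib
import hmac


def parse_authorization(header):
    # Authentication parser: split the header into its named fields.
    algorithm, _, rest = header.partition(" ")
    fields = {}
    for item in rest.split(","):
        key, _, value = item.strip().partition("=")
        fields[key] = value
    access_key, date, region, service, _ = fields["Credential"].split("/")
    return {
        "algorithm": algorithm,
        "access_key": access_key,
        "date": date,
        "region": region,
        "service": service,
        "signed_headers": fields["SignedHeaders"].split(";"),
        "signature": fields["Signature"],
    }


def _hmac(key, msg):
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def calculate_signature(secret_key, date, region, service, string_to_sign):
    # Signature calculation: derive the signing key, then sign the string.
    k_date = _hmac(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    k_signing = _hmac(k_service, "aws4_request")
    return hmac.new(k_signing, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


def validate(header, secrets, string_to_sign):
    # Validator: recompute the signature with the stored secret and compare.
    auth = parse_authorization(header)
    secret = secrets.get(auth["access_key"])  # dict stands in for a secret manager
    if secret is None:
        return False
    expected = calculate_signature(
        secret, auth["date"], auth["region"], auth["service"], string_to_sign
    )
    return hmac.compare_digest(expected, auth["signature"])


secrets = {"demo-access-key": "demo-secret"}
string_to_sign = "demo string to sign"
signature = calculate_signature("demo-secret", "20220921", "us-east-1", "s3", string_to_sign)
header = (
    "AWS4-HMAC-SHA256 Credential=demo-access-key/20220921/us-east-1/s3/aws4_request, "
    "SignedHeaders=host;x-amz-date, Signature=" + signature
)
accepted = validate(header, secrets, string_to_sign)
rejected = validate(header, secrets, string_to_sign + " tampered")
```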
Storage Servitization—Service Architecture
Storage Servitization—Community contribution
▪ 12 merged.
TYPE | PR | STATUS
proxy | Fix wrong format of s3 bucket creationDate | merged
proxy | Support parse authorization headers for s3 proxy | merged
proxy | Add s3 rest service audit log | merged
proxy | Add header parameter 'Authorization' for postBucket method | merged
fuse | Fix wrong method call to get username and wrong parameter assignment | merged
fuse | Load jnr-runtime dependencies at initialization | merged
fuse | Support overwrite for rename | merged
csi | Replace invalid env with args in nodeserver | merged
core | Avoid checking file permissions in getFileInfo method | merged
doc | Fix bug case of S3 REST API | merged
doc | Fix wrong file name in k8s doc | merged
doc | Fix ambiguous description for impersonation in CN doc | merged
Thank You
Storage Acceleration and Servitization at Shopee
Storage Servitization—Fuse
WHAT IS IT
▪ Filesystem in Userspace
High-Level Architecture
▪ Kernel
▪ User-level daemon
Storage Servitization—Alluxio Fuse
Requirements
▪ libfuse
Implementation
▪ JNR-Fuse
▪ JNI-Fuse
Deployment
▪ Standalone Fuse
▪ Fuse on Workers
Limitations
▪ Random writes are not supported
Storage Servitization—Alluxio CSI
WHAT IS IT
▪ Standard storage interface for containers
Fuse Deployment mode
▪ On nodeserver pod
▪ On separate pod
Storage Servitization—k8s sidecar for Alluxio
WHAT IS IT
▪ A Fuse sidecar container in a Pod mounts the Alluxio directory
Features
▪ Pods are configured independently, giving high flexibility
▪ Each Pod runs its own Fuse container, so Pods do not affect each other
▪ Each Fuse process occupies a container, so the solution consumes more resources
Storage Servitization—Summary
Criterion | Fuse on physical machine | K8s-csi: Fuse on nodeserver pod | K8s-csi: Fuse on separate pod | K8s-sidecar
maintenance cost | high | low | higher | higher
resource usage | low | lower | high | high
independence | high | low | high | high
stability | high | low | high | high

+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
 

The Power of Data Orchestration: Storage Acceleration and Servitization at Shopee

  • 1. The Power of Data Orchestration: Storage Acceleration and Servitization at Shopee
  • 2. Private & Confidential. Storage Acceleration and Servitization at Shopee: 1 Storage Status, 2 Storage Acceleration, 3 Storage Servitization
  • 3. Storage Status: Architecture
      ▪ Product: App (Search, Recommendation, etc.)
      ▪ Platform: Data Management Platform (DMP)
      ▪ Compute Engine: Spark, Flink, Presto
      ▪ Resource Scheduler: Yarn
      ▪ Storage: HDFS, Ozone
  • 5. Storage Status: Presto Accelerate Query
      ▪ HDFS: unstable performance
      ▪ Presto: unstable queries
      ▪ Q: Can queries run faster?
  • 7. Storage Status: Alluxio + Presto Typical Architecture
      ▪ Mount HDFS into Alluxio
      ▪ Presto accesses HDFS via Alluxio
      ▪ Alluxio manages the cache
  • 8. Storage Status: Shortcomings
      ▪ Specific caching policies are needed
  • 9. Storage Acceleration and Servitization at Shopee: 1 Storage Status, 2 Storage Acceleration, 3 Storage Servitization
  • 11. Storage Acceleration: Architecture (diagram)
      ▪ Components: Kafka (HDFS audit), HMS, Computing Application, Operator, Hot Partitions, Cache Manager, Alluxio, HDFS
      ▪ Flow labels: Load/Unload/Mount path; rename, delete, create, append, truncate, concat, etc.; Set/Clear partition property; Load data; Load partitions; Update policy; Input; Output; Get tag
  • 12. Storage Acceleration: Update Policy
      ▪ Presto calls the load/free API
      ▪ Cache Manager loads a new partition according to the path pattern
      ▪ A scheduled task cleans up expired partitions
      ▪ Path pattern: /a/b/date={}
      ▪ New partition events from the HDFS audit log, e.g.:
          /a/b/date=2022-09-21 (matches, loaded to Alluxio)
          /a/c/date=2022-09-21 (does not match, not loaded to Alluxio)
      ▪ Existing partitions: .../date=2022-09-18, .../date=2022-09-19, .../date=2022-09-20
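The path matching on this slide can be sketched in a few lines of Python. This is only an illustration: the slide does not define the pattern syntax, so the sketch assumes that `{}` stands for exactly one partition value with no `/` in it. The names `compile_pattern` and `should_load` are hypothetical and are not part of the Cache Manager.

```python
import re


def compile_pattern(pattern: str) -> "re.Pattern[str]":
    """Compile a path pattern such as '/a/b/date={}' into a regex.

    Assumption: each '{}' matches one non-empty value without '/'.
    """
    literal_parts = [re.escape(part) for part in pattern.split("{}")]
    return re.compile("^" + "([^/]+)".join(literal_parts) + "/?$")


def should_load(path: str, pattern: str) -> bool:
    """Return True when a new partition path matches the cache pattern."""
    return compile_pattern(pattern).match(path) is not None


if __name__ == "__main__":
    pattern = "/a/b/date={}"
    for path in ("/a/b/date=2022-09-21", "/a/c/date=2022-09-21"):
        action = "load to Alluxio" if should_load(path, pattern) else "skip"
        print(path, "->", action)
```

With the pattern from the slide, the first path is selected for loading and the second is skipped, which mirrors the match / not match branches in the diagram.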
  • 13. Storage Acceleration: HMS Tag
      ▪ Diagram labels: HDFS, Alluxio, HMS, Presto, On Alluxio, No tag
      ▪ key: cache, value: ${DC}/Alluxio/ebj@${Alluxio_nameservice}
      ▪ If the partition exists, set the property in the partition properties
      ▪ Otherwise, set the property in the table properties
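The tagging rule on this slide (partition property if the partition exists, table property otherwise) can be sketched as follows. The sketch uses plain dictionaries as a stand-in for HMS metadata and does not call any metastore API; the function name `set_cache_tag`, the dictionary layout, and the example tag value are all hypothetical.

```python
CACHE_KEY = "cache"  # property key named on the slide


def set_cache_tag(table: dict, partition_name: str, tag_value: str) -> str:
    """Attach the cache tag and report where it was written.

    `table` is an in-memory stand-in for HMS metadata (an assumption):
    {"parameters": {...}, "partitions": {name: {"parameters": {...}}}}
    """
    partitions = table.setdefault("partitions", {})
    if partition_name in partitions:
        # Partition exists: tag the partition itself.
        partitions[partition_name].setdefault("parameters", {})[CACHE_KEY] = tag_value
        return "partition"
    # No such partition: fall back to the table-level property.
    table.setdefault("parameters", {})[CACHE_KEY] = tag_value
    return "table"


if __name__ == "__main__":
    table = {"parameters": {}, "partitions": {"date=2022-09-21": {"parameters": {}}}}
    # The tag value is a made-up instance of the template on the slide.
    print(set_cache_tag(table, "date=2022-09-21", "dc1/Alluxio/ebj@ns1"))
    print(set_cache_tag(table, "date=2022-09-22", "dc1/Alluxio/ebj@ns1"))
```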
  • 15. Storage Acceleration: Consistency
      ▪ Subscribe to HDFS audit log events
          1. Include rename, delete, create, append, truncate, concat, etc. events
          2. Use Flink filtering to reduce the number of messages
      ▪ A scheduled task checks and repairs consistency
      ▪ Diagram labels: HDFS Audit Log, Flink, a new Kafka topic, Cache Manager
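The filtering step can be illustrated outside Flink with a small Python sketch. It assumes the usual tab-separated `key=value` layout of HDFS audit log lines (`allowed=`, `cmd=`, `src=`, `dst=`); the function names and the list of cached prefixes are hypothetical, and the real job on the slide is a Flink filter rather than this code.

```python
# Commands named on the slide that can make cached data stale.
MUTATING_CMDS = {"rename", "delete", "create", "append", "truncate", "concat"}


def parse_audit_line(line: str) -> dict:
    """Split one audit log line into its key=value fields."""
    _, _, payload = line.partition("FSNamesystem.audit: ")
    fields = {}
    for item in payload.strip().split("\t"):
        key, sep, value = item.partition("=")  # values may contain '='
        if sep:
            fields[key] = value
    return fields


def is_relevant(fields: dict, cached_prefixes: list) -> bool:
    """Keep only allowed, mutating events that touch a cached path."""
    if fields.get("allowed") != "true" or fields.get("cmd") not in MUTATING_CMDS:
        return False
    paths = (fields.get("src", ""), fields.get("dst", ""))
    return any(p.startswith(prefix) for p in paths for prefix in cached_prefixes)


if __name__ == "__main__":
    line = ("2022-09-21 10:00:00,123 INFO FSNamesystem.audit: allowed=true\t"
            "ugi=hive\tip=/10.0.0.1\tcmd=rename\tsrc=/a/b/date=2022-09-21/_tmp\t"
            "dst=/a/b/date=2022-09-21/part-0\tperm=hive:hadoop:rw-r--r--")
    print(is_relevant(parse_audit_line(line), ["/a/b/"]))
```

Read-only commands and events outside the cached prefixes are dropped, so only the events that could affect cache consistency would reach the downstream topic.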
  • 17. Storage Acceleration: Community Contribution (8 merged, 1 WIP, 1 fixed by Alluxio)
      ▪ core
          - Fix master down when master change to leader (merged)
      ▪ Hadoop 2.10
          - Fix HdfsVersion miss hadoop 2.10 config (merged)
          - Fix integration/yarn/pom.xml enforcer-plugin miss hadoop 2.10.x config (merged)
          - Fix common.go miss hadoop 2.10 configuration (merged)
      ▪ Command Line
          - Improve shell command support ebj nameservice (merged)
          - Fix for Alluxio.logs.dir (fixed by Alluxio)
          - Modify the meaning of variables more clearly (merged)
      ▪ Web Page
          - Fix isMounted should not invoke ufs, if not /metrics page very slowly (merged)
          - Fix FormatUtils.getSizeFromBytes method should supports EB (merged)
      ▪ NameServices
          - Fix unescape the ufs url of Alluxio fsadmin report metrics result (WIP)
  • 18. Storage Acceleration and Servitization at Shopee: 1 Storage Situation, 2 Storage Acceleration, 3 Storage Servitization
  • 19. Storage Servitization: Status
      ▪ Most data is stored in HDFS
      ▪ Various development languages are used
      ▪ HDFS has insufficient support for non-Java clients
      ▪ Many applications need to access data as a service, not as a hard disk
  • 20. Storage Servitization: Solutions
      ▪ Fuse for HDFS
          - Alluxio Fuse service on a physical machine
          - Alluxio Fuse service on a Kubernetes cluster
      ▪ S3 for HDFS
          - Use the S3 API to access HDFS through the Alluxio proxy service
  • 21. Storage Servitization: Amazon S3 Concepts
      ▪ Bucket: a bucket is a container for objects stored in Amazon S3
      ▪ Object: objects are the fundamental entities stored in Amazon S3
      ▪ Key: an object key (or key name) is the unique identifier for an object within a bucket
      ▪ Region: you can choose a region in which to store the buckets you create
  • 22. Storage Servitization: S3 for HDFS (access HDFS data via Alluxio using the S3 protocol)
      ▪ Alluxio can mount HDFS data
      ▪ Alluxio provides a Proxy service
      ▪ The Proxy is compatible with the basic operations of the S3 API
      ▪ S3 SDKs support many development languages
  • 23. Storage Servitization: Alluxio Proxy for S3 Mapping
      ▪ The first-level directory is the bucket
      ▪ Subdirectories and the file path form the key
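The mapping rule above is simple enough to sketch directly. The following Python helpers are illustrative only: the function names are hypothetical, and the example path is invented. They split an Alluxio path into a bucket and a key and join them back.

```python
def path_to_bucket_key(path: str) -> tuple:
    """Map a path to (bucket, key): first-level directory -> bucket, rest -> key."""
    parts = path.strip("/").split("/", 1)
    if not parts[0]:
        raise ValueError("the root directory does not map to a bucket")
    bucket = parts[0]
    key = parts[1] if len(parts) > 1 else ""
    return bucket, key


def bucket_key_to_path(bucket: str, key: str) -> str:
    """Inverse mapping: rebuild the path from a bucket and an object key."""
    return "/" + bucket + ("/" + key if key else "")


if __name__ == "__main__":
    bucket, key = path_to_bucket_key("/warehouse/db/table/part-0.parquet")
    print(bucket, key)
    print(bucket_key_to_path(bucket, key))
```

An S3 client pointed at the proxy would then address the file as an object with that key in that bucket, without needing an HDFS client library.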
  • 24. Storage Servitization: Proxy Authentication
      ▪ Authentication parser
      ▪ Validator
      ▪ Secret Manager
      ▪ Signature Calculation
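The four parts listed on this slide can be sketched as follows. The slide does not say which signing scheme the proxy uses; the sketch assumes AWS Signature Version 4, the scheme most S3 SDKs send by default. All function names are hypothetical, the secret manager is reduced to a dictionary, and building the canonical request and the string to sign is left out.

```python
import hashlib
import hmac


def parse_authorization(header: str) -> dict:
    """Authentication parser: split a SigV4 Authorization header into fields."""
    algorithm, _, rest = header.partition(" ")
    if algorithm != "AWS4-HMAC-SHA256":
        raise ValueError("unsupported algorithm: " + algorithm)
    fields = {}
    for item in rest.split(","):
        key, _, value = item.strip().partition("=")
        fields[key] = value
    access_key, date, region, service, _ = fields["Credential"].split("/")
    return {
        "access_key": access_key,
        "date": date,
        "region": region,
        "service": service,
        "signed_headers": fields["SignedHeaders"].split(";"),
        "signature": fields["Signature"],
    }


def signing_key(secret: str, date: str, region: str, service: str) -> bytes:
    """Signature calculation: derive the SigV4 signing key via chained HMACs."""
    def _hmac(key: bytes, msg: str) -> bytes:
        return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

    key = _hmac(("AWS4" + secret).encode("utf-8"), date)
    key = _hmac(key, region)
    key = _hmac(key, service)
    return _hmac(key, "aws4_request")


def is_valid(string_to_sign: str, parsed: dict, secrets: dict) -> bool:
    """Validator: recompute the signature with the stored secret and compare."""
    secret = secrets.get(parsed["access_key"])  # stand-in for a secret manager
    if secret is None:
        return False
    key = signing_key(secret, parsed["date"], parsed["region"], parsed["service"])
    expected = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, parsed["signature"])
```

The comparison uses `hmac.compare_digest` so that the check takes the same time whether or not the signatures match.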
  • 25. Storage Servitization: Service Architecture
  • 26. Storage Servitization: Community Contribution (12 merged)
      ▪ proxy
          - Fix wrong format of s3 bucket creationDate (merged)
          - Support parse authorization headers for s3 proxy (merged)
          - Add s3 rest service audit log (merged)
          - Add header parameter 'Authorization' for postBucket method (merged)
      ▪ fuse
          - Fix wrong method call to get username and wrong parameter assignment (merged)
          - Load jnr-runtime dependencies at initialization (merged)
          - Support overwrite for rename (merged)
      ▪ csi
          - Replace invalid env with args in nodeserver (merged)
      ▪ core
          - Avoid checking file permissions in getFileInfo method (merged)
      ▪ doc
          - Fix bug case of S3 REST API (merged)
          - Fix wrong file name in k8s doc (merged)
          - Fix ambiguous description for impersonation in CN doc (merged)
  • 27. Thank You. Storage Acceleration and Servitization at Shopee
  • 28. Storage Servitization: Fuse
      ▪ What is it: Filesystem in Userspace
      ▪ High-level architecture: kernel, user-level daemon
  • 29. Storage Servitization: Alluxio Fuse
      ▪ Requirements: libfuse
      ▪ Implementation: JNR-Fuse, JNI-Fuse
      ▪ Deployment: Standalone Fuse, Fuse on Workers
      ▪ Limitations: random writes are not supported
  • 30. Storage Servitization: Alluxio CSI
      ▪ What is it: a standard storage interface for containers
      ▪ Fuse deployment mode: on the nodeserver pod, or on a separate pod
  • 31. Storage Servitization: k8s Sidecar for Alluxio
      ▪ What is it: a Fuse sidecar container in a Pod that mounts the Alluxio directory
      ▪ Features
          - Pods are configured independently, which gives high flexibility
          - Each Pod runs its own Fuse container, so Pods do not affect each other
          - Each Fuse process occupies a container, so this solution consumes more resources
  • 32. Storage Servitization: Summary
      ▪ Options compared: Fuse on physical machine; K8s-csi (Fuse on nodeserver pod); K8s-csi (Fuse on separate pod); K8s-sidecar
      ▪ Maintenance cost: high; low; higher; higher
      ▪ Resource usage: low; lower; high; high
      ▪ Independence: high; low; high; high
      ▪ Stability: high; low; high; high