USENIX NSDI 2016
Session: Resource Sharing
2016-05-29 @oraccha
Co-located Events
• ACM Symposium on SDN Research 2016 (SOSR), March 13-17
• 2016 Open Networking Summit (ONS), March 14-17
• The 12th ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS'16), March 17-19
• The 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI'16)
• The USENIX Workshop on Cool Topics in Sustainable Data Centers (CoolDC'16), March 19
Session: Resource Sharing
• "Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics," Shivaram Venkataraman, Zongheng Yang, Michael Franklin, Benjamin Recht, and Ion Stoica, University of California, Berkeley
• "Cliffhanger: Scaling Performance Cliffs in Web Memory Caches," Asaf Cidon and Assaf Eisenman, Stanford University; Mohammad Alizadeh, MIT CSAIL; Sachin Katti, Stanford University
• "FairRide: Near-Optimal, Fair Cache Sharing," Qifan Pu and Haoyuan Li, University of California, Berkeley; Matei Zaharia, Massachusetts Institute of Technology; Ali Ghodsi and Ion Stoica, University of California, Berkeley
• "HUG: Multi-Resource Fairness for Correlated and Elastic Demands," Mosharaf Chowdhury, University of Michigan; Zhenhua Liu, Stony Brook University; Ali Ghodsi and Ion Stoica, University of California, Berkeley, and Databricks Inc.
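Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics
• Who?: A graduate student at UCB AMPLab, the lab known for Spark, Mesos, and other systems. Specializes in systems and algorithms for large-scale data analysis, with papers at SoCC12, EuroSys13, OSDI14, SIGMOD16, and elsewhere.
• What?: A framework that efficiently predicts the performance of data analytics workloads, such as machine learning and genome analysis, in cloud environments.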
[Figure: "Do choices matter?" Running time (s) of Matrix Multiply (400K by 1K) and QR Factorization (1M by 1K) on 1 r3.8xlarge, 2 r3.4xlarge, 4 r3.2xlarge, 8 r3.xlarge, and 16 r3.large instances. Annotations: "Network Bound", "Mem Bandwidth Bound"; Cores = 16, Memory = 244 GB, Cost = $2.66/hr.]
[Figure: Keystone-ML TIMIT pipeline. Raw Data, Cosine Transform, Normalization, Linear Solver (~100 iterations). Properties: iterative (each iteration runs many jobs); long running → expensive; numerically intensive.]
[Figure: "Do choices matter?" Actual vs. ideal time (s) against number of cores (0 to 600), r3.4xlarge instances, QR Factorization: 1M by 1K.]
• Computation + Communication → Non-linear Scaling
• How?: Predicts performance from the results of running small training jobs, and uses experiment design to reduce the number of training jobs.
[Figure: "Optimal Design of Experiments". Candidate training configurations over input fraction (1%, 2%, 4%, 8%) and machines (1, 2, 4, 8). Use an off-the-shelf solver (CVX).]
[Figure: "Using Ernest". Components: job binary; machines and input size; experiment design; training jobs (use few iterations for training); linear model. Plot of time (0 to 1000) vs. machines (1, 30, 900).]
Basic model:
time = x1 + x2 * (input / machines) + x3 * log(machines) + x4 * machines
– x1: serial execution
– x2 * (input / machines): computation (linear)
– x3 * log(machines): tree DAG
– x4 * machines: all-to-one DAG
Collect training data, then fit a linear regression.
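The basic model above is linear in its coefficients x1..x4, so fitting it is a small least-squares problem. The sketch below is a minimal illustration under stated assumptions, not Ernest's actual code: it uses ordinary least squares via the normal equations (a non-negative solver could be swapped in), and the function names, the coefficients in `TRUE_COEFFS`, and the synthetic training grid are all hypothetical.

```python
import math

def features(input_fraction, machines):
    """Feature vector: serial term, per-machine compute, tree DAG, all-to-one DAG."""
    return [1.0, input_fraction / machines, math.log(machines), float(machines)]

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit(samples):
    """Least-squares fit; each sample is (input_fraction, machines, seconds)."""
    rows = [features(s, m) for s, m, _ in samples]
    times = [t for _, _, t in samples]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * t for r, t in zip(rows, times)) for i in range(k)]
    return solve(ata, atb)

def predict(coeffs, input_fraction, machines):
    """Predicted running time for a configuration."""
    return sum(c * f for c, f in zip(coeffs, features(input_fraction, machines)))

# Hypothetical coefficients used only to generate synthetic training data.
TRUE_COEFFS = [5.0, 200.0, 2.0, 0.1]
training = [
    (s, m, predict(TRUE_COEFFS, s, m))
    for s in (0.01, 0.02, 0.04, 0.08)   # small input fractions
    for m in (1, 2, 4, 8)               # small clusters
]
coeffs = fit(training)
full_run = predict(coeffs, 1.0, 64)     # extrapolate to full input on 64 machines
```

The training grid only touches small inputs and small clusters, yet the fitted model can be evaluated at the full input size and a larger cluster, which is the point of the approach.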
• Results:
[Figure: "Training time: Keystone-ML". TIMIT pipeline on r3.xlarge instances, 100 iterations; training time vs. running time (s, 0 to 6000) on 42 machines. Experiment design: 7 data points, up to 16 machines, up to 10% of the data.]
[Figure: "Is experiment design useful?" Prediction error (%) for Regression, Classification, KMeans, PCA, and TIMIT, comparing experiment design with a cost-based scheme.]
Cliffhanger: Scaling Performance Cliffs in Web Memory Caches
• Who?: A Stanford CS graduate, now CEO (co-founder) of the cloud security company Sookasa. Specializes in cloud storage, with papers at SIGCOMM12 and USENIX ATC13, 15.
• What?: An improvement to Memcached's dynamic cache allocation mechanism (the slab allocator) that addresses performance cliffs.
[Figure: hit-rate curve for Application 19, Slab 0. Hit rate (0 to 1) vs. number of items in the LRU queue (0 to 18000), shown with its concave hull. The performance cliff is annotated with Talus [HPCA15].]
• +1% cache hit rate → +35% speedup
• The cache hit rate of Facebook's Memcached pool is 98.2% [SIGMETRICS12]
• How?: shadow queues
– Hill climbing algorithm: moves memory from queues (slabs) where the gradient of the hit-rate curve is small to queues where it is large.
– Cliff scaling algorithm: finds the start and end of a performance cliff (the interval where the hit-rate curve is not concave).
[Figure: "Using Shadow Queues to Estimate Local Gradient". Queue 1 and Queue 2 each consist of a physical queue followed by a shadow queue. Credits: Queue 1 = 2, Queue 2 = -2. Resize queues.]
[Figure: "Cliffhanger Runs Both Algorithms in Parallel". The original queue is split into partitioned queues; each partition tracks left of pointer, right of pointer, and hill climbing.]
• Algorithm 1: incrementally optimize memory across queues
– Across slab classes
– Across applications
• Algorithm 2: scales performance cliffs
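The credit-based hill climbing can be sketched as follows. This is a minimal illustration and not the authors' implementation: the class `ShadowedLRU`, the function `hill_climb`, the credit threshold of 2, the one-item resize step, and the random choice of the queue that gives up memory are all assumptions made for the example. A hit in a queue's shadow queue means the request would have been a hit with slightly more memory, so that queue earns a credit at another queue's expense.

```python
import random
from collections import OrderedDict

class ShadowedLRU:
    """LRU queue plus a key-only shadow queue that remembers recent evictions."""

    def __init__(self, capacity, shadow_capacity):
        self.capacity = capacity
        self.shadow_capacity = shadow_capacity
        self.items = OrderedDict()    # physical queue, oldest first
        self.shadow = OrderedDict()   # evicted keys only, oldest first

    def access(self, key):
        """Return 'hit', 'shadow_hit', or 'miss', updating both queues."""
        if key in self.items:
            self.items.move_to_end(key)
            return "hit"
        result = "shadow_hit" if key in self.shadow else "miss"
        self.shadow.pop(key, None)
        self.items[key] = True
        self._shrink()
        return result

    def _shrink(self):
        """Evict from the physical queue into the shadow queue until within capacity."""
        while len(self.items) > self.capacity:
            evicted, _ = self.items.popitem(last=False)
            self.shadow[evicted] = True
            while len(self.shadow) > self.shadow_capacity:
                self.shadow.popitem(last=False)

    def resize(self, delta):
        self.capacity += delta
        self._shrink()

def hill_climb(queues, trace, threshold=2, rng=None):
    """Move one item of capacity toward queues whose shadow queues keep getting hit."""
    rng = rng or random.Random(0)
    credits = [0] * len(queues)
    for qid, key in trace:
        if queues[qid].access(key) != "shadow_hit":
            continue
        victim = rng.choice([i for i in range(len(queues)) if i != qid])
        credits[qid] += 1
        credits[victim] -= 1
        if credits[qid] >= threshold and queues[victim].capacity > 1:
            queues[qid].resize(+1)
            queues[victim].resize(-1)
            credits[qid] = credits[victim] = 0
    return credits

# Queue 0 cycles over three keys with room for two, so every request misses
# until it gains one more slot; queue 1 is idle and gives up memory.
queues = [ShadowedLRU(2, 2), ShadowedLRU(2, 2)]
trace = [(0, key) for key in "abcabcabc"]
credits = hill_climb(queues, trace)
```

Total capacity is conserved: each resize adds one slot to the winner and removes one from the loser.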
• The technique looks generally applicable. Unlike FairRide, the next talk, it does not take fairness into account.
Cliffhanger Reduces Misses and Can Save Memory
• Average misses reduced: 36.7%
• Average potential memory savings: 45%
Cliffhanger Outperforms Default and Optimized Schemes
• Average Cliffhanger hit rate increase: 1.2%
FairRide: Near-Optimal, Fair Cache Sharing
• Who?: A graduate student at UCB AMPLab, with papers at MobiCom13 and SIGCOMM15.
• What?: A file cache policy that satisfies the isolation guarantee and strategy-proofness while achieving near-optimal Pareto efficiency.
[Figure: statically allocated caches vs. a globally shared cache, each in front of a backend (storage/network).]
What we want
• Isolation
• Strategy-proof
• Higher utilization
• Share data
SIP theorem: with file sharing, the following three properties cannot all be satisfied at the same time.
Policy                Isolation Guarantee   Strategy Proofness   Pareto Efficiency
max-min fairness      ✓                     ✗                    ✓
priority allocation   ✗                     ✓                    ✓
max-min rate          ✓                     ✗                    ✗
static allocation     ✓                     ✓                    ✗
FairRide              ✓                     ✓                    Near-optimal
• How?
– Adds probabilistic blocking to the max-min policy to give users a disincentive to cheat.
– Implemented on top of Alluxio (Tachyon) [SoCC14].
[Figure 3 panels: files A, B, and C with access frequencies 5, 5, and 10. Legend: true access, free-ride, cheat, blocked.]
Figure 3: Example with 2 users, 3 files and total cache
size of 2. Numbers represent access frequencies. (a). Al-
to get 1 hit/sec access rate for a unit file. To
mize over the utility, which is defined as the to
rate, a user’s optimal strategy is not to cache th
that one has highest access frequencies, but the
with lowest cost/(hit/sec). Compare a file of 10
shared by 2 users and another file of 100MB, share
users. Even though a user access the former 10 tim
and the latter only 8 times/sec, it is overall eco
to cache the second file (comparing 5MB/(hit/se
2.5MB/(hit/sec)).
(a) Max-min fairness
(b) The second user cheats
(c) Blocking free-riding access
Probabilistic blocking
• FairRide blocks a user with p(nj) = 1/(nj+1) probability
– nj is number of other users caching file j
– e.g., p(1)=50%, p(4)=20%
• The best you can do in a general case
– Less blocking does not prevent cheating
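The blocking rule above is easy to state in code. The sketch below is illustrative only: the function names and the `rng` parameter are hypothetical, and it assumes that a user who caches the file is never blocked, while a user who does not cache it (a free rider) is blocked with probability 1/(nj+1), where nj counts the users caching file j.

```python
import random

def blocking_probability(n_other_cachers):
    """p(nj) = 1 / (nj + 1), where nj is the number of other users caching the file."""
    return 1.0 / (n_other_cachers + 1)

def served_from_cache(user, cachers, rng=random.random):
    """Decide whether `user` may read a file that `cachers` keep in memory.

    A user who caches the file always gets it. A free rider is blocked with
    probability p(nj) and must fall back to the backend in that case.
    """
    if user in cachers:
        return True
    if not cachers:
        return False  # nobody caches the file, so there is nothing to serve
    return rng() >= blocking_probability(len(cachers))

# The two examples from the slide.
p_one = blocking_probability(1)   # one other user caches the file
p_four = blocking_probability(4)  # four other users cache the file
```

Passing a fixed `rng` (for example `lambda: 0.3`) makes the decision deterministic, which is convenient when checking the rule by hand.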
[Figure: "Cheating under FairRide". Miss ratio (%) vs. time (s, 0 to 1050) for user 1 and user 2, with the points where user 2 cheats and user 1 cheats marked.]
• FairRide disincentivizes users from cheating.
[Figure: "Facebook experiments". Average response time (ms), and reduction in median job time (%) by bin (#tasks: 1-10, 11-50, 51-100, 101-500, 501-) for max-min and FairRide.]
• FairRide outperforms max-min fairness by 29%
HUG: Multi-Resource Fairness for Correlated and Elastic Demands
• Who?: An assistant professor at the University of Michigan and a UCB AMPLab alumnus. Specializes in networking (coflow-based networking, multi-resource allocation in datacenters, compute and storage for big data, network virtualization) and presents at SIGCOMM almost every year. This work builds on DRF [NSDI11] and FairCloud [SIGCOMM12].
• What?: An optimization problem for allocating network bandwidth.
[Figure: machines M1, M2, M3, ..., MN attached to a congestion-less core through links L1 to LN and LN+1 to L2N, hosting Tenant-A's VMs and Tenant-B's VMs.]
How to share the links between multiple tenants to
1. provide optimal performance guarantees and
2. maximize utilization?
• Highest Utilization with the Optimal Isolation Guarantee
[Figure: isolation guarantee (low to optimal) vs. utilization (low to work-conserving), placing PS-P, DRF, Per-Flow Fairness, and HUG.]
HUG in Cooperative Setting
1. Optimal Isolation Guarantee
2. Work Conservation
HUG in Non-Cooperative Setting
1. Optimal Isolation Guarantee
2. Highest Utilization
3. Strategyproof
Intuitively, we want to maximize the minimum progress over all tenants, i.e., maximize min_k M_k, where min_k M_k corresponds to the isolation guarantee of an allocation algorithm. We make three observations. First, when there is a single link in the system, this model trivially reduces to max-min fairness. Second, getting more aggregate bandwidth is not always better. For tenant-A in the example, ⟨50Mbps, 100Mbps⟩ is better than ⟨90Mbps, 90Mbps⟩ or ⟨25Mbps, 200Mbps⟩, even though the latter ones have more bandwidth in total. Third, simply applying max-min fairness to individual links is not enough. In our example, max-min fairness allocates equal resources to both tenants on both links, resulting in allocations ⟨1/2, 1/2⟩ on both links (Figure 1b). Corresponding progress (M_A = M_B = 1/2) result in a suboptimal isolation guarantee (min{M_A, M_B} = 1/2).
Dominant Resource Fairness (DRF) [33] extends max-min fairness to multiple resources and prevents such sub-
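The numbers in the excerpt can be checked with a few lines of arithmetic. The sketch below assumes that a tenant's progress is the largest multiple of its demand vector that its allocation can carry, i.e. the minimum over links of allocation divided by demand. The demand vectors `D_A = ⟨1/2, 1⟩` and `D_B = ⟨1, 1/6⟩` are assumed values chosen to be consistent with the excerpt, and the function names are hypothetical.

```python
def progress(allocation, demand):
    """M_k: the bottleneck ratio of allocation to demand across links."""
    return min(a / d for a, d in zip(allocation, demand) if d > 0)

def isolation_guarantee(allocations, demands):
    """min_k M_k over all tenants."""
    return min(progress(a, d) for a, d in zip(allocations, demands))

# Assumed demand (correlation) vectors over the two links.
D_A = (0.5, 1.0)      # tenant A needs twice as much of link 2 as of link 1
D_B = (1.0, 1.0 / 6)  # tenant B mostly needs link 1

# Second observation: more aggregate bandwidth is not always better for tenant A.
candidates_mbps = [(50, 100), (90, 90), (25, 200)]
progress_a = [progress(c, D_A) for c in candidates_mbps]
best = max(candidates_mbps, key=lambda c: progress(c, D_A))

# Third observation: per-link max-min gives each tenant half of each link.
per_link_max_min = [(0.5, 0.5), (0.5, 0.5)]
guarantee = isolation_guarantee(per_link_max_min, [D_A, D_B])
```

Under these assumptions ⟨50Mbps, 100Mbps⟩ gives tenant A the most progress even though it has the least total bandwidth of the three candidates, and the per-link max-min split yields an isolation guarantee of 1/2, matching the excerpt.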
Cloud Network Sharing
• Reservation (SecondNet, Oktopus, Pulsar, Silo): uses admission control
• Dynamic Sharing
– Flow-Level (Per-Flow Fairness): no isolation guarantee
– VM-Level (Seawall, GateKeeper): no isolation guarantee
– Tenant-/Network-Level
  – Non-Cooperative Environments (require strategy-proofness)
    – Highest Utilization for Optimal Isolation Guarantee (HUG)
    – Low Utilization (DRF): optimal isolation guarantee
  – Cooperative Environments (do not require strategy-proofness)
    – Work-Conserving Optimal Isolation Guarantee (HUG)
    – Suboptimal Isolation Guarantee (PS-P, EyeQ, NetShare): work-conserving
• Experiments were run on 100 EC2 instances.
• Three tenants
– Tenants A and C: pairwise one-to-one communication
– Tenant B: all-to-all communication
[Figure 10 panels: total allocation (Gbps, 0 to 100) vs. time (seconds, 0 to 540) for Tenant A, Tenant B, and Tenant C. (a) Per-flow Fairness (TCP); (b) HUG.]
Figure 10: [EC2] Bandwidth consumptions of three tenants arriving over time in a 100-machine EC2 cluster. Each tenant has 100 VMs, but each uses a different communication pattern (§5.1.1). We observe that (a) using TCP, tenant-B dominates the network by creating more flows; (b) HUG isolates tenants A and C from tenant B.
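Impressions
• This session covers resource management inside datacenters.
• The papers do not necessarily contain revolutionary ideas, but many are model examples of research: they formulate the problem carefully and build a practical system on top of that formulation. As expected of NSDI.
• Being able to hear every talk in a single-track session is nice, but 20 minutes per talk is short (some parts are hard to understand from the slides alone).
• UCB AMPLab is strong.
• I want the Facebook trace data.
All figures used in this document are taken from the proceedings and slides on the NSDI 2016 website.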

Weitere ähnliche Inhalte

Was ist angesagt?

Bruno Silva - eMedLab: Merging HPC and Cloud for Biomedical Research
Bruno Silva - eMedLab: Merging HPC and Cloud for Biomedical ResearchBruno Silva - eMedLab: Merging HPC and Cloud for Biomedical Research
Bruno Silva - eMedLab: Merging HPC and Cloud for Biomedical ResearchDanny Abukalam
 
HPC Cloud: Clouds on supercomputers for HPC
HPC Cloud: Clouds on supercomputers for HPCHPC Cloud: Clouds on supercomputers for HPC
HPC Cloud: Clouds on supercomputers for HPCRyousei Takano
 
Hands on MapR -- Viadea
Hands on MapR -- ViadeaHands on MapR -- Viadea
Hands on MapR -- Viadeaviadea
 
On heap cache vs off-heap cache
On heap cache vs off-heap cacheOn heap cache vs off-heap cache
On heap cache vs off-heap cachergrebski
 
Stig Telfer - OpenStack and the Software-Defined SuperComputer
Stig Telfer - OpenStack and the Software-Defined SuperComputerStig Telfer - OpenStack and the Software-Defined SuperComputer
Stig Telfer - OpenStack and the Software-Defined SuperComputerDanny Abukalam
 
MIT's experience on OpenPOWER/POWER 9 platform
MIT's experience on OpenPOWER/POWER 9 platformMIT's experience on OpenPOWER/POWER 9 platform
MIT's experience on OpenPOWER/POWER 9 platformGanesan Narayanasamy
 
Scale-out AI Training on Massive Core System from HPC to Fabric-based SOC
Scale-out AI Training on Massive Core System from HPC to Fabric-based SOCScale-out AI Training on Massive Core System from HPC to Fabric-based SOC
Scale-out AI Training on Massive Core System from HPC to Fabric-based SOCinside-BigData.com
 
Evolving Virtual Networking with IO Visor
Evolving Virtual Networking with IO VisorEvolving Virtual Networking with IO Visor
Evolving Virtual Networking with IO VisorLarry Lang
 
유연하고 확장성 있는 빅데이터 처리
유연하고 확장성 있는 빅데이터 처리유연하고 확장성 있는 빅데이터 처리
유연하고 확장성 있는 빅데이터 처리NAVER D2
 
dCUDA: Distributed GPU Computing with Hardware Overlap
 dCUDA: Distributed GPU Computing with Hardware Overlap dCUDA: Distributed GPU Computing with Hardware Overlap
dCUDA: Distributed GPU Computing with Hardware Overlapinside-BigData.com
 
GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)Kohei KaiGai
 
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)Kohei KaiGai
 
SQL+GPU+SSD=∞ (English)
SQL+GPU+SSD=∞ (English)SQL+GPU+SSD=∞ (English)
SQL+GPU+SSD=∞ (English)Kohei KaiGai
 
Networking, QoS, Liberty, Mitaka and Newton - Livnat Peer - OpenStack Day Isr...
Networking, QoS, Liberty, Mitaka and Newton - Livnat Peer - OpenStack Day Isr...Networking, QoS, Liberty, Mitaka and Newton - Livnat Peer - OpenStack Day Isr...
Networking, QoS, Liberty, Mitaka and Newton - Livnat Peer - OpenStack Day Isr...Cloud Native Day Tel Aviv
 
クラウド環境におけるキャッシュメモリQoS制御の評価
クラウド環境におけるキャッシュメモリQoS制御の評価クラウド環境におけるキャッシュメモリQoS制御の評価
クラウド環境におけるキャッシュメモリQoS制御の評価Ryousei Takano
 
Slides for In-Datacenter Performance Analysis of a Tensor Processing Unit
Slides for In-Datacenter Performance Analysis of a Tensor Processing UnitSlides for In-Datacenter Performance Analysis of a Tensor Processing Unit
Slides for In-Datacenter Performance Analysis of a Tensor Processing UnitCarlo C. del Mundo
 
PG-Strom - GPGPU meets PostgreSQL, PGcon2015
PG-Strom - GPGPU meets PostgreSQL, PGcon2015PG-Strom - GPGPU meets PostgreSQL, PGcon2015
PG-Strom - GPGPU meets PostgreSQL, PGcon2015Kohei KaiGai
 
Inside the Volta GPU Architecture and CUDA 9
Inside the Volta GPU Architecture and CUDA 9Inside the Volta GPU Architecture and CUDA 9
Inside the Volta GPU Architecture and CUDA 9inside-BigData.com
 

Was ist angesagt? (20)

Bruno Silva - eMedLab: Merging HPC and Cloud for Biomedical Research
Bruno Silva - eMedLab: Merging HPC and Cloud for Biomedical ResearchBruno Silva - eMedLab: Merging HPC and Cloud for Biomedical Research
Bruno Silva - eMedLab: Merging HPC and Cloud for Biomedical Research
 
HPC Cloud: Clouds on supercomputers for HPC
HPC Cloud: Clouds on supercomputers for HPCHPC Cloud: Clouds on supercomputers for HPC
HPC Cloud: Clouds on supercomputers for HPC
 
Hands on MapR -- Viadea
Hands on MapR -- ViadeaHands on MapR -- Viadea
Hands on MapR -- Viadea
 
On heap cache vs off-heap cache
On heap cache vs off-heap cacheOn heap cache vs off-heap cache
On heap cache vs off-heap cache
 
Stig Telfer - OpenStack and the Software-Defined SuperComputer
Stig Telfer - OpenStack and the Software-Defined SuperComputerStig Telfer - OpenStack and the Software-Defined SuperComputer
Stig Telfer - OpenStack and the Software-Defined SuperComputer
 
Exascale Capabl
Exascale CapablExascale Capabl
Exascale Capabl
 
MIT's experience on OpenPOWER/POWER 9 platform
MIT's experience on OpenPOWER/POWER 9 platformMIT's experience on OpenPOWER/POWER 9 platform
MIT's experience on OpenPOWER/POWER 9 platform
 
Cassandra at teads
Cassandra at teadsCassandra at teads
Cassandra at teads
 
Scale-out AI Training on Massive Core System from HPC to Fabric-based SOC
Scale-out AI Training on Massive Core System from HPC to Fabric-based SOCScale-out AI Training on Massive Core System from HPC to Fabric-based SOC
Scale-out AI Training on Massive Core System from HPC to Fabric-based SOC
 
Evolving Virtual Networking with IO Visor
Evolving Virtual Networking with IO VisorEvolving Virtual Networking with IO Visor
Evolving Virtual Networking with IO Visor
 
유연하고 확장성 있는 빅데이터 처리
유연하고 확장성 있는 빅데이터 처리유연하고 확장성 있는 빅데이터 처리
유연하고 확장성 있는 빅데이터 처리
 
dCUDA: Distributed GPU Computing with Hardware Overlap
 dCUDA: Distributed GPU Computing with Hardware Overlap dCUDA: Distributed GPU Computing with Hardware Overlap
dCUDA: Distributed GPU Computing with Hardware Overlap
 
GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)
 
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)
Technology Updates of PG-Strom at Aug-2014 (PGUnconf@Tokyo)
 
SQL+GPU+SSD=∞ (English)
SQL+GPU+SSD=∞ (English)SQL+GPU+SSD=∞ (English)
SQL+GPU+SSD=∞ (English)
 
Networking, QoS, Liberty, Mitaka and Newton - Livnat Peer - OpenStack Day Isr...
Networking, QoS, Liberty, Mitaka and Newton - Livnat Peer - OpenStack Day Isr...Networking, QoS, Liberty, Mitaka and Newton - Livnat Peer - OpenStack Day Isr...
Networking, QoS, Liberty, Mitaka and Newton - Livnat Peer - OpenStack Day Isr...
 
クラウド環境におけるキャッシュメモリQoS制御の評価
クラウド環境におけるキャッシュメモリQoS制御の評価クラウド環境におけるキャッシュメモリQoS制御の評価
クラウド環境におけるキャッシュメモリQoS制御の評価
 
Slides for In-Datacenter Performance Analysis of a Tensor Processing Unit
Slides for In-Datacenter Performance Analysis of a Tensor Processing UnitSlides for In-Datacenter Performance Analysis of a Tensor Processing Unit
Slides for In-Datacenter Performance Analysis of a Tensor Processing Unit
 
PG-Strom - GPGPU meets PostgreSQL, PGcon2015
PG-Strom - GPGPU meets PostgreSQL, PGcon2015PG-Strom - GPGPU meets PostgreSQL, PGcon2015
PG-Strom - GPGPU meets PostgreSQL, PGcon2015
 
Inside the Volta GPU Architecture and CUDA 9
Inside the Volta GPU Architecture and CUDA 9Inside the Volta GPU Architecture and CUDA 9
Inside the Volta GPU Architecture and CUDA 9
 

Ähnlich wie USENIX NSDI 2016 (Session: Resource Sharing)

Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...DataStax
 
Deep Dive on Amazon EC2 instances
Deep Dive on Amazon EC2 instancesDeep Dive on Amazon EC2 instances
Deep Dive on Amazon EC2 instancesAmazon Web Services
 
AI On the Edge: Model Compression
AI On the Edge: Model CompressionAI On the Edge: Model Compression
AI On the Edge: Model CompressionApache MXNet
 
The hidden engineering behind machine learning products at Helixa
The hidden engineering behind machine learning products at HelixaThe hidden engineering behind machine learning products at Helixa
The hidden engineering behind machine learning products at HelixaAlluxio, Inc.
 
SoC Solutions Enabling Server-Based Networking
SoC Solutions Enabling Server-Based NetworkingSoC Solutions Enabling Server-Based Networking
SoC Solutions Enabling Server-Based NetworkingNetronome
 
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...DataStax
 
AIST Super Green Cloud: lessons learned from the operation and the performanc...
AIST Super Green Cloud: lessons learned from the operation and the performanc...AIST Super Green Cloud: lessons learned from the operation and the performanc...
AIST Super Green Cloud: lessons learned from the operation and the performanc...Ryousei Takano
 
Aerospike for machine learning
Aerospike for machine learningAerospike for machine learning
Aerospike for machine learningAerospike
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...Amazon Web Services
 
IBM Special Announcement session Intel #IDF2013 September 10, 2013
IBM Special Announcement session Intel #IDF2013 September 10, 2013IBM Special Announcement session Intel #IDF2013 September 10, 2013
IBM Special Announcement session Intel #IDF2013 September 10, 2013Cliff Kinard
 
Security TechTalk | AWS Public Sector Summit 2016
Security TechTalk | AWS Public Sector Summit 2016Security TechTalk | AWS Public Sector Summit 2016
Security TechTalk | AWS Public Sector Summit 2016Amazon Web Services
 
StorPool Presents at Cloud Field Day 9
StorPool Presents at Cloud Field Day 9StorPool Presents at Cloud Field Day 9
StorPool Presents at Cloud Field Day 9StorPool Storage
 
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed_Hat_Storage
 
Applying Cloud Techniques to Address Complexity in HPC System Integrations
Applying Cloud Techniques to Address Complexity in HPC System IntegrationsApplying Cloud Techniques to Address Complexity in HPC System Integrations
Applying Cloud Techniques to Address Complexity in HPC System Integrationsinside-BigData.com
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...Amazon Web Services
 
Predictable Big Data Performance in Real-time
Predictable Big Data Performance in Real-timePredictable Big Data Performance in Real-time
Predictable Big Data Performance in Real-timeAerospike, Inc.
 
Dynamic Resource Allocation Algorithm using Containers
Dynamic Resource Allocation Algorithm using ContainersDynamic Resource Allocation Algorithm using Containers
Dynamic Resource Allocation Algorithm using ContainersIRJET Journal
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...Amazon Web Services
 
Large-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC WorkloadsLarge-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC Workloadsinside-BigData.com
 

Ähnlich wie USENIX NSDI 2016 (Session: Resource Sharing) (20)

Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
 
Deep Dive on Amazon EC2 instances
Deep Dive on Amazon EC2 instancesDeep Dive on Amazon EC2 instances
Deep Dive on Amazon EC2 instances
 
AI On the Edge: Model Compression
AI On the Edge: Model CompressionAI On the Edge: Model Compression
AI On the Edge: Model Compression
 
The hidden engineering behind machine learning products at Helixa
The hidden engineering behind machine learning products at HelixaThe hidden engineering behind machine learning products at Helixa
The hidden engineering behind machine learning products at Helixa
 
SoC Solutions Enabling Server-Based Networking
SoC Solutions Enabling Server-Based NetworkingSoC Solutions Enabling Server-Based Networking
SoC Solutions Enabling Server-Based Networking
 
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
 
AIST Super Green Cloud: lessons learned from the operation and the performanc...
AIST Super Green Cloud: lessons learned from the operation and the performanc...AIST Super Green Cloud: lessons learned from the operation and the performanc...
AIST Super Green Cloud: lessons learned from the operation and the performanc...
 
Aerospike for machine learning
Aerospike for machine learningAerospike for machine learning
Aerospike for machine learning
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
 
IBM Special Announcement session Intel #IDF2013 September 10, 2013
IBM Special Announcement session Intel #IDF2013 September 10, 2013IBM Special Announcement session Intel #IDF2013 September 10, 2013
IBM Special Announcement session Intel #IDF2013 September 10, 2013
 
Security TechTalk | AWS Public Sector Summit 2016
Security TechTalk | AWS Public Sector Summit 2016Security TechTalk | AWS Public Sector Summit 2016
Security TechTalk | AWS Public Sector Summit 2016
 
StorPool Presents at Cloud Field Day 9
StorPool Presents at Cloud Field Day 9StorPool Presents at Cloud Field Day 9
StorPool Presents at Cloud Field Day 9
 
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
 
Deep Dive on Amazon EC2
Deep Dive on Amazon EC2Deep Dive on Amazon EC2
Deep Dive on Amazon EC2
 
Applying Cloud Techniques to Address Complexity in HPC System Integrations
Applying Cloud Techniques to Address Complexity in HPC System IntegrationsApplying Cloud Techniques to Address Complexity in HPC System Integrations
Applying Cloud Techniques to Address Complexity in HPC System Integrations
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
 
Predictable Big Data Performance in Real-time
Predictable Big Data Performance in Real-timePredictable Big Data Performance in Real-time
Predictable Big Data Performance in Real-time
 
Dynamic Resource Allocation Algorithm using Containers
Dynamic Resource Allocation Algorithm using ContainersDynamic Resource Allocation Algorithm using Containers
Dynamic Resource Allocation Algorithm using Containers
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
 
Large-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC WorkloadsLarge-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC Workloads
 

Mehr von Ryousei Takano

Error Permissive Computing
Error Permissive ComputingError Permissive Computing
Error Permissive ComputingRyousei Takano
 
Opportunities of ML-based data analytics in ABCI
Opportunities of ML-based data analytics in ABCIOpportunities of ML-based data analytics in ABCI
Opportunities of ML-based data analytics in ABCIRyousei Takano
 
ABCI: An Open Innovation Platform for Advancing AI Research and Deployment
ABCI: An Open Innovation Platform for Advancing AI Research and DeploymentABCI: An Open Innovation Platform for Advancing AI Research and Deployment
ABCI: An Open Innovation Platform for Advancing AI Research and DeploymentRyousei Takano
 
A Look Inside Google’s Data Center Networks
A Look Inside Google’s Data Center NetworksA Look Inside Google’s Data Center Networks
A Look Inside Google’s Data Center NetworksRyousei Takano
 
不揮発メモリとOS研究にまつわる何か
不揮発メモリとOS研究にまつわる何か不揮発メモリとOS研究にまつわる何か
不揮発メモリとOS研究にまつわる何かRyousei Takano
 
High-resolution Timer-based Packet Pacing Mechanism on the Linux Operating Sy...



USENIX NSDI 2016 (Session: Resource Sharing)

  • 1. USENIX NSDI2016, Session: Resource Sharing. 2016-05-29 @oraccha
  • 2. Co-located Events
      • ACM Symposium on SDN Research 2016 (SOSR), March 13-17
      • 2016 Open Networking Summit (ONS), March 14-17
      • The 12th ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS'16), March 17-19
      • The 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI'16)
      • The USENIX Workshop on Cool Topics in Sustainable Data Centers (CoolDC'16), March 19
  • 3. Session: Resource Sharing
      • "Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics," Shivaram Venkataraman, Zongheng Yang, Michael Franklin, Benjamin Recht, and Ion Stoica, University of California, Berkeley
      • "Cliffhanger: Scaling Performance Cliffs in Web Memory Caches," Asaf Cidon and Assaf Eisenman, Stanford University; Mohammad Alizadeh, MIT CSAIL; Sachin Katti, Stanford University
      • "FairRide: Near-Optimal, Fair Cache Sharing," Qifan Pu and Haoyuan Li, University of California, Berkeley; Matei Zaharia, Massachusetts Institute of Technology; Ali Ghodsi and Ion Stoica, University of California, Berkeley
      • "HUG: Multi-Resource Fairness for Correlated and Elastic Demands," Mosharaf Chowdhury, University of Michigan; Zhenhua Liu, Stony Brook University; Ali Ghodsi and Ion Stoica, University of California, Berkeley, and Databricks Inc.
  • 4. Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics
      • Who?: A graduate student at UCB AMPLab, the lab known for Spark, Mesos, and other systems. Specializes in systems and algorithms for large-scale data analysis; has presented at SoCC12, EuroSys13, OSDI14, SIGMOD16, and elsewhere.
      • What?: A framework that efficiently predicts the performance of data analysis workloads, such as machine learning and genome analysis, in cloud environments.
      • [Figure: "Do choices matter?" Bar charts of time (s) for Matrix Multiply (400K by 1K) and QR Factorization (1M by 1K) on 1 r3.8xlarge, 2 r3.4xlarge, 4 r3.2xlarge, 8 r3.xlarge, and 16 r3.large. Annotations: Network Bound; Mem Bandwidth Bound; Cores = 16, Memory = 244 GB, Cost = $2.66/hr.]
      • [Figure: Keystone-ML TIMIT pipeline. Stages: Raw Data, Cosine Transform, Normalization, Linear Solver (~100 iterations). Properties: iterative (each iteration runs many jobs); long running, hence expensive; numerically intensive.]
      • [Figure: "Do choices matter?" Time (s) versus cores, actual versus ideal, on r3.4xlarge instances for QR Factorization (1M by 1K). Takeaway: computation + communication leads to non-linear scaling.]
  • 5. Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics
      • How?: Predicts performance from the results of running small-scale training jobs. Uses design of experiments to reduce the number of training jobs.
      • [Figure: Optimal design of experiments. Grid of input fraction (1%, 2%, 4%, 8%) against machines (1, 2, 4, 8). Uses an off-the-shelf solver (CVX).]
      • [Figure: Using Ernest. Job binary plus machines and input size feed the experiment design, which selects training jobs; the training jobs yield a linear model. Only a few iterations are used for training.]
      • Ernest basic model: time = x1 + x2 * (input / machines) + x3 * log(machines) + x4 * machines
          – x1: serial execution
          – x2 * (input / machines): computation (linear)
          – x3 * log(machines): tree DAG
          – x4 * machines: all-to-one DAG
      • Procedure: collect training data, then fit a linear regression.
  • 6. Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics
      • Results:
      • [Figure: Training time. Keystone-ML TIMIT pipeline on r3.xlarge instances, 100 iterations. Experiment design: 7 data points, up to 16 machines, up to 10% of the data. Bars compare training time with running time (s) on 42 machines.]
      • [Figure: "Is experiment design useful?" Prediction error (%) for Regression, Classification, KMeans, PCA, and TIMIT, comparing experiment design with a cost-based scheme.]
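
The fitting step on slide 5 (collect training data, then fit a linear regression on the four model terms) can be sketched as below. This is a minimal illustration, not the authors' code: it uses plain least squares via the normal equations, the coefficients in `TRUE_W` and the training configurations are made-up synthetic values, and input size is expressed as a percentage of the full dataset.

```python
import math

def features(input_pct, machines):
    """Terms of the slide-5 model: constant, input/machines, log(machines), machines."""
    return [1.0, input_pct / machines, math.log(machines), float(machines)]

def predict(w, input_pct, machines):
    return sum(wi * fi for wi, fi in zip(w, features(input_pct, machines)))

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit(samples):
    """Least squares over (input_pct, machines, time) samples via (X^T X) w = X^T y."""
    rows = [features(i, m) for i, m, _ in samples]
    ys = [t for _, _, t in samples]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * y for r, y in zip(rows, ys)) for a in range(k)]
    return solve(xtx, xty)

# Synthetic, noise-free training runs (hypothetical coefficients and configurations):
# small inputs on few machines, as in the "up to 16 machines, up to 10% data" setup.
TRUE_W = [5.0, 20.0, 2.0, 0.5]
configs = [(1, 1), (2, 1), (4, 2), (8, 4), (10, 8), (10, 16), (5, 16)]
samples = [(i, m, predict(TRUE_W, i, m)) for i, m in configs]

w = fit(samples)
estimate = predict(w, 100, 42)  # extrapolate to the full input on 42 machines
print(w, estimate)
```

Because every term is linear in the coefficients, a handful of cheap runs is enough to fit the model, which is what makes extrapolating to the full input practical.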
  • 7. Cliffhanger: Scaling Performance Cliffs in Web Memory Caches
      • Who?: A Stanford CS alumnus, currently CEO (co-founder) of the cloud security company Sookasa. Specializes in cloud storage; has presented at SIGCOMM12 and USENIX ATC13 and ATC15.
      • What?: An improvement to Memcached's dynamic cache allocation mechanism (the slab allocator) that addresses performance cliffs.
      • [Figure: Hit-rate curve. Hit rate versus number of items in the LRU queue for Application 19, Slab 0, with its concave hull. Annotation: Performance Cliff, Talus [HPCA15].]
      • +1% cache hit rate → +35% speedup
      • The cache hit rate of Facebook's Memcached pool is 98.2% [SIGMETRICS12].
  • 8. Cliffhanger: Scaling Performance Cliffs in Web Memory Caches
      • How?: shadow queues
          – Hill climbing algorithm: moves memory from queues (slabs) whose hit-rate curve has a small gradient to queues whose curve has a large gradient.
          – Cliff scaling algorithm: finds the start and end of a performance cliff (the dented, i.e. convex, region of the hit-rate curve).
      • [Figure: Using shadow queues to estimate the local gradient. Queue 1 and Queue 2 each consist of a physical queue and a shadow queue. Credits: Queue 1 +2, Queue 2 -2; the queues are then resized.]
      • [Figure: Cliffhanger runs both algorithms in parallel. The original queue is partitioned into two queues; pointers track the left and right edges of the cliff, and a further pointer tracks hill climbing.]
      • Algorithm 1: incrementally optimizes memory across queues
          – Across slab classes
          – Across applications
      • Algorithm 2: scales performance cliffs
  • 9. Cliffhanger: Scaling Performance Cliffs in Web Memory Caches
      • The technique looks generally applicable. Unlike FairRide in the next talk, it does not take fairness into account.
      • Cliffhanger reduces misses and can save memory
          – Average misses reduced: 36.7%
          – Average potential memory savings: 45%
      • Cliffhanger outperforms default and optimized schemes
          – Average Cliffhanger hit rate increase: 1.2%
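
The hill climbing idea on slide 8 can be sketched as follows. This is a simplified illustration under stated assumptions, not Cliffhanger's implementation: the class name `HillClimber` is hypothetical, credits are applied to queue sizes immediately rather than accumulated before a resize, and the donor queue is picked at random among the other queues.

```python
import random

class HillClimber:
    """Shifts capacity toward queues whose shadow queues are hit most often."""

    def __init__(self, sizes, credit=1, rng=None):
        self.sizes = dict(sizes)           # queue name -> capacity in items
        self.credit = credit               # items moved per shadow-queue hit
        self.rng = rng or random.Random(0)

    def on_shadow_hit(self, queue):
        """A shadow hit means `queue` would have had a real hit with a bit more memory.

        Grow it by one credit and shrink a randomly chosen other queue, so the
        total amount of memory stays constant.
        """
        donors = [q for q in self.sizes
                  if q != queue and self.sizes[q] >= self.credit]
        if not donors:
            return None
        donor = self.rng.choice(donors)
        self.sizes[donor] -= self.credit
        self.sizes[queue] += self.credit
        return donor

# Queue 1 sees three shadow hits and Queue 2 sees one, so Queue 1 has the steeper
# local gradient and ends up with a net gain of two credits.
hc = HillClimber({"queue1": 100, "queue2": 100})
for q in ["queue1", "queue1", "queue2", "queue1"]:
    hc.on_shadow_hit(q)
print(hc.sizes)
```

The frequency of shadow-queue hits acts as an estimate of the local gradient of the hit-rate curve, so no full curve has to be profiled.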
  • 10. FairRide: Near-Optimal, Fair Cache Sharing
      • Who?: A graduate student at UCB AMPLab. Has presented at MobiCom13 and SIGCOMM15.
      • What?: A file cache policy that satisfies the isolation guarantee and strategy-proofness while keeping Pareto efficiency near-optimal.
      • [Figure: Statically allocated caches versus a globally shared cache, each in front of a backend (storage/network).]
      • What we want: isolation, strategy-proofness, higher utilization, shared data.
      • [Table: Isolation guarantee, strategy-proofness, and Pareto efficiency compared for max-min fairness, priority allocation, max-min rate, and static allocation. Static allocation: ✓, ✓, ✗. FairRide: ✓, ✓, near-optimal.]
      • SIP theorem: with file sharing, these three properties cannot all be satisfied at the same time.
  • 11. FairRide: Near-Optimal, Fair Cache Sharing
      • How?
          – Adds probabilistic blocking to the max-min policy to give users a disincentive to cheat.
          – Implemented on top of Alluxio (Tachyon) [SoCC14].
      • [Figure 3: Example with 2 users, 3 files and total cache size of 2. Numbers represent access frequencies. Legend: true access, free-ride, cheat, blocked. Panels: (a) max-min fairness; (b) the second user cheats; (c) free-riding accesses are blocked.]
      • [Excerpt from the paper, cropped on the slide] "… to get 1 hit/sec access rate for a unit file. To …mize over the utility, which is defined as the to… rate, a user's optimal strategy is not to cache th… that one has highest access frequencies, but the … with lowest cost/(hit/sec). Compare a file of 10… shared by 2 users and another file of 100MB, share… users. Even though a user access the former 10 tim… and the latter only 8 times/sec, it is overall eco… to cache the second file (comparing 5MB/(hit/se… 2.5MB/(hit/sec))."
      • Probabilistic blocking
          – FairRide blocks a user with probability p(n_j) = 1/(n_j + 1)
          – n_j is the number of other users caching file j
          – e.g., p(1) = 50%, p(4) = 20%
          – This is the best you can do in the general case: less blocking does not prevent cheating
  • 12. FairRide: Near-Optimal, Fair Cache Sharing
      • [Figure: Cheating under FairRide. Miss ratio (%) versus time (s) for user 1 and user 2, with markers where user 2 cheats and where user 1 cheats.] FairRide disincentivizes users from cheating.
      • [Figure: Facebook experiments. Average response time (ms).] FairRide outperforms max-min fairness by 29%.
      • [Figure: Reduction in median job time (%) by bin (number of tasks: 1-10, 11-50, 51-100, 101-500, 501-), max-min versus FairRide.]
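
The probabilistic blocking rule on slide 11 can be sketched as below. It is only an illustration of p(n_j) = 1/(n_j + 1): the function names and the set-based bookkeeping of which users cache a file are assumptions made for the example and do not reflect the Alluxio-based implementation.

```python
import random

def blocking_probability(n_other_cachers):
    """p(n_j) = 1 / (n_j + 1), where n_j is the number of other users caching file j."""
    return 1.0 / (n_other_cachers + 1)

def access(user, file_cachers, rng):
    """Outcome of `user` reading a cached file that `file_cachers` pay for.

    Users who cache the file always hit. A free-rider is blocked with
    probability 1 / (n_j + 1), n_j being the number of users caching the file.
    """
    if user in file_cachers:
        return "hit"
    p = blocking_probability(len(file_cachers))
    return "blocked" if rng.random() < p else "hit"

rng = random.Random(42)
trials = 10_000
# User "b" free-rides on a file cached only by user "a", so n_j = 1.
blocked = sum(access("b", {"a"}, rng) == "blocked" for _ in range(trials))
block_rate = blocked / trials
print(blocking_probability(1), blocking_probability(4), block_rate)
```

The probability shrinks as more users share the cost of a file, so widely cached files are rarely blocked, while free-riding on a single user's cache loses about half of its hits.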
  • 13. HUG: Multi-Resource Fairness for Correlated and Elastic Demands
      • Who?: An assistant professor at the University of Michigan and a UCB AMPLab alumnus. Specializes in networking (coflow-based networking, multi-resource allocation in datacenters, compute and storage for big data, network virtualization) and presents at SIGCOMM almost every year. This work builds on DRF [NSDI11] and FairCloud [SIGCOMM12].
      • What?: An optimization problem for allocating network bandwidth.
      • [Figure: Machines M1 ... MN attached to a congestion-less core through links L1 ... L2N, hosting tenant A's VMs and tenant B's VMs.]
      • How to share the links between multiple tenants to (1) provide optimal performance guarantees and (2) maximize utilization?
  • 14. HUG: Multi-Resource Fairness for Correlated and Elastic Demands
      • Highest Utilization with the Optimal Isolation Guarantee
      • [Figure: HUG in a cooperative setting. Design space with isolation guarantee (low to optimal) on one axis and utilization (low to work-conserving) on the other, locating PS-P, DRF, per-flow fairness, and HUG. HUG provides (1) the optimal isolation guarantee and (2) work conservation.]
      • [Figure: HUG in a non-cooperative setting. Same design space. HUG provides (1) the optimal isolation guarantee, (2) the highest utilization, and (3) strategy-proofness.]
      • [Excerpt from the paper] "Intuitively, we want to maximize the minimum progress over all tenants, i.e., maximize min_k M_k, where min_k M_k corresponds to the isolation guarantee of an allocation algorithm. We make three observations. First, when there is a single link in the system, this model trivially reduces to max-min fairness. Second, getting more aggregate bandwidth is not always better. For tenant-A in the example, ⟨50Mbps, 100Mbps⟩ is better than ⟨90Mbps, 90Mbps⟩ or ⟨25Mbps, 200Mbps⟩, even though the latter ones have more bandwidth in total. Third, simply applying max-min fairness to individual links is not enough. In our example, max-min fairness allocates equal resources to both tenants on both links, resulting in allocations ⟨1/2, 1/2⟩ on both links (Figure 1b). Corresponding progress (M_A = M_B = 1/2) result in a suboptimal isolation guarantee (min{M_A, M_B} = 1/2). Dominant Resource Fairness (DRF) [33] extends max-min fairness to multiple resources and prevents such sub-"
      • [Figure: Taxonomy of cloud network sharing]
          – Reservation (SecondNet, Oktopus, Pulsar, Silo): uses admission control
          – Dynamic sharing, flow-level (per-flow fairness): no isolation guarantee
          – Dynamic sharing, VM-level (Seawall, GateKeeper): no isolation guarantee
          – Dynamic sharing, tenant-/network-level, non-cooperative environments (require strategy-proofness): Low Utilization (DRF), with the optimal isolation guarantee; Highest Utilization for Optimal Isolation Guarantee (HUG)
          – Dynamic sharing, tenant-/network-level, cooperative environments (do not require strategy-proofness): Work-Conserving Optimal Isolation Guarantee (HUG); Suboptimal Isolation Guarantee (PS-P, EyeQ, NetShare), work-conserving
  • 15. HUG: Multi-Resource Fairness for Correlated and Elastic Demands
      • Experiment on 100 EC2 instances.
      • Three tenants
          – Tenants A and C: pairwise one-to-one communication
          – Tenant B: all-to-all communication
      • [Figure 10: [EC2] Bandwidth consumptions of three tenants arriving over time in a 100-machine EC2 cluster. Panels: (a) per-flow fairness (TCP); (b) HUG. Axes: total allocation (Gbps) versus time (seconds). Each tenant has 100 VMs, but each uses a different communication pattern (§5.1.1). We observe that (a) using TCP, tenant-B dominates the network by creating more flows; (b) HUG isolates tenants A and C from tenant B.]
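
The notion of progress in the paper excerpt on slide 14 can be made concrete with a small calculation. The sketch below assumes that a tenant's progress is the smallest ratio of allocation to demand across links. The demand vectors `DEMAND_A` = ⟨1/2, 1⟩ and `DEMAND_B` = ⟨1, 1/6⟩ are assumptions chosen to be consistent with the quoted numbers; they do not appear on the slides.

```python
def progress(allocation, demand):
    """Progress M_k: the smallest allocation-to-demand ratio across links."""
    return min(a / d for a, d in zip(allocation, demand) if d > 0)

# Assumed correlated demand of tenant A: link 2 needs twice the bandwidth of link 1.
DEMAND_A = (0.5, 1.0)
# Assumed demand of tenant B (hypothetical, used only for the per-link example).
DEMAND_B = (1.0, 1.0 / 6.0)

# Candidate allocations for tenant A in Mbps, taken from the quoted excerpt.
candidates = {"<50,100>": (50, 100), "<90,90>": (90, 90), "<25,200>": (25, 200)}
ranked = sorted(candidates,
                key=lambda name: progress(candidates[name], DEMAND_A),
                reverse=True)

# Per-link max-min fairness gives each tenant half of both links.
per_link_maxmin = {"A": (0.5, 0.5), "B": (0.5, 0.5)}
isolation_guarantee = min(progress(per_link_maxmin["A"], DEMAND_A),
                          progress(per_link_maxmin["B"], DEMAND_B))
print(ranked, isolation_guarantee)
```

Under these assumptions ⟨50, 100⟩ ranks first although it has the least total bandwidth, because extra capacity on a link that is not the bottleneck does not advance a tenant whose demands are correlated across links.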
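  • 16. Impressions
      • This session covers resource management inside the data center.
      • None of the papers presents a revolutionary idea, but many are model examples of research: they formulate the problem properly and build a practical system on that formulation. As expected of NSDI.
      • It is nice to be able to hear every talk in a single-track conference, but 20 minutes per talk is short (some points are hard to follow from the slides alone).
      • UCB AMPLab is strong.
      • I want the Facebook trace data.
      • All figures used in this material are taken from the proceedings and slides on the NSDI2016 website.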