Leading Solutions
for HPC and Data Centers
Group
RSC BasIS Platform: Orchestration for High Performance Composable Storage Architectures
Pavel Lavrenko, CBDO, RSC
November 19, 2020, DAOS User Group 2020
From Rackscale to Composable
[Diagram: Converged Infrastructure with Network, Compute, and Storage blocks]
Current DAOS status
DAOS is great but still has a lot of complications:
• DAOS requires specially-designed hardware platforms to deploy
• DAOS deployment is tricky
• DAOS doesn’t fit HPC Cloud
RSC is focused on addressing these points
DAOS Storage system models
• Pooled Storage Model
• Hyper-Converged Storage Model
• Disaggregated Storage Model (dedicated servers)
• Disaggregated Hyperconverged Model (all servers participate in DAOS)
[Diagram legend: Fabric; Nodes with PMEM & NVMe; Compute Nodes; Compute with NVMe; Compute with PMEM]
DAOS: Large Capacity Requires Dedicated Storage Servers
[Diagram: 2 x Xeon CPU (x6 each), 2 x NIC 100Gbit/s, 2 x (x6 8TB) drives, PCIe x16 and PCIe x4 links]
• 12 x 0.5TB = 6TB max PMEM capacity
• 6TB x 16 = 96TB max NVMe capacity at a 6% ratio
• 128 PCI lanes + 48 PCI lanes = 176 PCI lanes
• Bottleneck!
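The capacity figures on this slide can be re-derived in a few lines. This is a minimal sketch: the variable names are illustrative, and reading "x6 8TB" as 8TB drives with one PCIe x4 link each is an assumption based on the slide labels, not something the slide states outright.

```python
# Values read off the slide; names are illustrative assumptions.
PMEM_MODULES = 12        # "12x0.5TB"
PMEM_MODULE_TB = 0.5
NVME_PER_PMEM = 16       # "6TBx16"
NVME_DRIVE_TB = 8        # "x6 8TB"
LANES_PER_DRIVE = 4      # "PCIe x4", assumed to be per drive

pmem_tb = PMEM_MODULES * PMEM_MODULE_TB          # max PMEM capacity
nvme_tb = pmem_tb * NVME_PER_PMEM                # max NVMe capacity
pmem_ratio = pmem_tb / nvme_tb                   # PMEM share relative to NVMe
drives_needed = int(nvme_tb // NVME_DRIVE_TB)    # 8TB drives to reach that capacity
drive_lanes = drives_needed * LANES_PER_DRIVE    # PCI lanes consumed by the drives

print(f"PMEM: {pmem_tb} TB, NVMe: {nvme_tb} TB, ratio: {pmem_ratio:.2%}")
print(f"{drives_needed} drives x {LANES_PER_DRIVE} lanes = {drive_lanes} PCI lanes")
```

Under these assumptions the drive lanes alone account for the 48-lane group on the slide, and the quoted 6% ratio is 6.25% before rounding.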
Architecture with NVMe-over-Fabrics
[Diagram: 2 x Xeon CPU (x6 each); NICs on PCIe x16 links, ~10GB/s, full duplex; RemoteDrives reached over the fabric]
• 64 PCI lanes + 64 PCI lanes = 128 PCI lanes
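The lane budgets of the two designs can be compared directly. The sketch below only restates the lane groups quoted on the slides; the constant names are illustrative, and treating "~10GB/s" as the usable throughput of a 100Gbit/s NIC is an assumption.

```python
# Lane groups as quoted on the slides; names are illustrative.
DIRECT_ATTACHED_LANES = (128, 48)   # dedicated storage server with local drives
NVMEOF_LANES = (64, 64)             # NVMe-over-Fabrics design

direct_total = sum(DIRECT_ATTACHED_LANES)   # 128 + 48 = 176
nvmeof_total = sum(NVMEOF_LANES)            # 64 + 64 = 128
lanes_saved = direct_total - nvmeof_total

# Raw line rate of a 100Gbit/s NIC expressed in GB/s (assumed NIC speed).
nic_raw_gbyte_s = 100 / 8

print(f"direct-attached: {direct_total} lanes, NVMe-oF: {nvmeof_total} lanes")
print(f"lanes saved: {lanes_saved}, NIC raw rate: {nic_raw_gbyte_s} GB/s")
```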
Interconnect utilization NVMe-over-Fabrics
[Diagram: Client Nodes and a DAOS Node (NIC, CPU + PMEM)]
• Client writes to DAOS: DAOS objects traffic on the downlink (100Gbit/s), NVMeoF traffic on the uplink (100Gbit/s)
• Client reads from DAOS: NVMeoF traffic on the downlink (100Gbit/s), DAOS objects traffic on the uplink (100Gbit/s)
• Complete utilization of the full-duplex network: DAOS data and NVMeoF traffic always move in opposite directions
• Works well when the DAOS cluster uses SSDs from client nodes
• Extra NVMeoF latency doesn’t affect storage performance because of PMEM
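The traffic directions on this slide can be modelled with a small function. This is a simplified sketch, not RSC's model: it assumes every byte of DAOS object traffic is forwarded 1:1 to or from the remote drives, and the function and key names are hypothetical.

```python
LINK_GBIT = 100.0  # per-direction capacity of the DAOS node's NIC, from the slide


def daos_node_link_load(client_write_gbit: float, client_read_gbit: float) -> dict:
    """Split DAOS-node NIC load by direction and traffic type.

    Writes: objects arrive on the downlink, NVMeoF leaves on the uplink.
    Reads:  NVMeoF arrives on the downlink, objects leave on the uplink.
    """
    return {
        "downlink": {"daos_objects": client_write_gbit, "nvmeof": client_read_gbit},
        "uplink": {"nvmeof": client_write_gbit, "daos_objects": client_read_gbit},
    }


# Pure write workload at line rate: both directions are fully used.
load = daos_node_link_load(client_write_gbit=100.0, client_read_gbit=0.0)
for direction, traffic in load.items():
    total = sum(traffic.values())
    print(f"{direction}: {total} of {LINK_GBIT} Gbit/s {traffic}")
```

In this model a pure read or pure write stream never puts both traffic types on the same direction, which is the point the slide makes about full-duplex utilization.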
RSC BasIS Orchestration
Knowledge of objects
• Auto-discovery
• Inventory and classification
• Knowledge of topologies
• Dynamic selection based on a query language
Continuous configuration
• Repository of configuration
• Maintaining consistency
Group Commands Execution
• Human operator – Platform
• Agent to agent
Monitoring
• Dynamic status representation
• GUI for drill-down analysis
• Problem-oriented dashboards
Vertical integration of Hardware, Software and Infrastructure components
Microagent Mesh for Cluster Automation
Knowledge about all datacenter objects and their connections
App Repository
Messaging system
Agents
Agent Lifecycle
SDK
BasIS Storage Orchestrator
BasIS: DAOS with NVMeOF Pipeline
FILTER STORAGE NODES
CONNECT DRIVES TO SERVERS
RUN SERVICES
CHOOSE CLIENTS
n02p[001-029].nodes
FILTER NVMe Disks
CONNECT CLIENTS TO DAOS
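The client selector `n02p[001-029].nodes` looks like the bracketed range notation commonly used for cluster host lists. The sketch below expands such a pattern into individual names; it assumes the brackets denote an inclusive, zero-padded numeric range, and `expand_node_range` is a hypothetical helper, not part of BasIS.

```python
import re


def expand_node_range(pattern: str) -> list[str]:
    """Expand 'prefix[AAA-BBB]suffix' into names, preserving zero padding."""
    match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", pattern)
    if match is None:
        # No bracketed range: treat the pattern as a single node name.
        return [pattern]
    prefix, start, end, suffix = match.groups()
    width = len(start)  # keep the padding width of the first bound
    return [
        f"{prefix}{number:0{width}d}{suffix}"
        for number in range(int(start), int(end) + 1)
    ]


nodes = expand_node_range("n02p[001-029].nodes")
print(len(nodes), nodes[0], nodes[-1])
```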
Flexible PMEM-only server roles
PMEM-only Server
DAOS Server
In-Memory DBs
AI
Grid Systems
Storage Server with PMEM & NVMe
DAOS Server with NVMeOF Drives
• Less complex
• Cheaper
• More roles
IOR results with DFS API
Configuration                              BW write, MB/s   BW read, MB/s
2 IO instances and 4 local NVMe drives     2132             2008
2 IO instances and 4 NVMeOF drives         2253             2178
4 IO instances and 8 local NVMe drives     4679             3935
4 IO instances and 8 NVMeOF drives         4248             4268
NVMe drives: Intel P4510 2TB (write 2 GB/s, read 3.2 GB/s per spec)
DAOS: kdev (AIO Linux driver) was used for NVMeOF drives, 2 targets per disk, max pool size, service replica = 1, ofi+sockets provider over Intel Omni-Path
MPI: np=104 from 3 clients
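To make the comparison between local and NVMeOF drives easier to read, the relative change can be computed from the reported bandwidths. This is a small sketch using only the numbers in the table; the dictionary layout and function name are illustrative.

```python
# Bandwidths copied from the IOR table: (write MB/s, read MB/s).
ior_results = {
    2: {"local": (2132, 2008), "nvmeof": (2253, 2178)},
    4: {"local": (4679, 3935), "nvmeof": (4248, 4268)},
}


def relative_change_pct(nvmeof: float, local: float) -> float:
    """Percentage change of the NVMeOF result relative to the local result."""
    return (nvmeof - local) / local * 100.0


changes = {}
for instances, runs in ior_results.items():
    local_write, local_read = runs["local"]
    fabric_write, fabric_read = runs["nvmeof"]
    changes[instances] = (
        round(relative_change_pct(fabric_write, local_write), 1),
        round(relative_change_pct(fabric_read, local_read), 1),
    )
    write_pct, read_pct = changes[instances]
    print(f"{instances} IO instances: write {write_pct:+.1f}%, read {read_pct:+.1f}%")
```

On these figures the NVMeOF runs stay within roughly 10% of the local-drive runs in both directions.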
Conclusions
What have we achieved with RSC BasIS Storage Orchestration:
• DAOS requires specially-designed hardware platforms to deploy
  • You can have DAOS cheap - just buy PMEM and compose DAOS over a fabric
  • Existing servers can share their NVMe drives
• DAOS deployment is tricky
  • Software orchestration significantly simplifies DAOS deployment
• DAOS doesn’t fit HPC Cloud
  • Composable Disaggregated approach gives flexible ways to use PMEM nodes
  • DAOS can be dynamically assembled when needed
RSC Announces DAOS Support in its storage orchestration platform
https://www.hpcwire.com/off-the-wire/rsc-announces-intel-ice-lake-sp-and-daos-support-introduces-tornado-afs-storage/
Group
rscgroup.ru
hq@rsc-tech.ru

UiPath Community: Communication Mining from Zero to Hero
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 

Leading HPC Data Center Solutions Group Showcases DAOS Orchestration

  • 1. Leading Solutions for HPC and Data Centers Group
  • 2. RSC BasIS Platform: Orchestration for High Performance Composable Storage Architectures. Pavel Lavrenko, CBDO, RSC. November 19, 2020, DAOS User Group 2020
  • 3. From Rackscale to Composable. Diagram labels: Network; Compute; Storage; Storage; Compute; Converged Infrastructure
  • 4. Current DAOS status. DAOS is great but still has a lot of complications:
      • DAOS requires specially-designed hardware platforms to deploy
      • DAOS deployment is tricky
      • DAOS doesn’t fit HPC Cloud
    RSC is focused on addressing these points.
  • 5. DAOS Storage system models
      • Models shown: Pooled Storage Model; Hyper-Converged Storage Model; Disaggregated Storage Model (dedicated servers); Disaggregated Hyperconverged Model (all servers participate in DAOS)
      • Diagram labels: Fabric; Nodes with PMEM & NVMe; Compute Nodes; Compute with NVMe; Compute with PMEM
  • 6. DAOS: Large Capacity Requires Dedicated Storage Servers
      • Diagram labels (each shown twice): NIC 100 Gbit/s; Xeon CPU; x6; PCIe x16; PCIe x4; x4; x6 8TB
      • 12 x 0.5 TB = 6 TB max PMEM capacity
      • 6 TB x 16 = 96 TB max NVMe capacity, at a 6% PMEM-to-NVMe ratio
      • 48 PCI lanes and 128 PCI lanes: 128 + 48 = 176 PCI lanes. Bottleneck!
  • 8. Interconnect utilization NVMe-over-Fabrics
      • Diagram labels: Client Node (twice); DAOS Node (twice), each with NIC and CPU + PMEM; Client Writes to DAOS; Client Reads from DAOS; DAOS objects traffic 100 Gbit/s and NVMeoF traffic 100 Gbit/s on the uplink and downlink
      • Complete utilization of full duplex network: DAOS data and NVMeoF always move in opposite directions
      • Works well when the DAOS cluster uses SSDs from client nodes
      • Extra NVMeoF latency doesn’t affect storage performance because of PMEM
  • 9. RSC BasIS Orchestration
      • Knowledge of objects: auto-discovery; inventory and classification; knowledge of topologies; dynamic selection based on a query language
      • Continuous configuration: repository of configuration; maintaining consistency
      • Group commands execution: human operator to platform; agent to agent
      • Monitoring: dynamic status representation; GUI for drill-down analysis; problem-oriented dashboards
      • Vertical integration of hardware, software and infrastructure components
      • Microagent Mesh for Cluster Automation
      • Knowledge about all datacenter objects and their connections
      • Diagram labels: App Repository; Messaging system; Agents; Agent Lifecycle; SDK
  • 12. BasIS: DAOS with NVMeOF Pipeline
      • Stage labels: FILTER STORAGE NODES; CONNECT DRIVES TO SERVERS; RUN SERVICES; CHOOSE CLIENTS; FILTER NVMe Disks; CONNECT CLIENTS TO DAOS
      • Node selector shown: n02p[001-029].nodes
  • 13. Flexible PMEM-only server roles
      • Diagram labels: PMEM-only Server; DAOS Server; In-Memory DBs; AI Grid Systems; Storage Server with PMEM & NVMe; DAOS Server with NVMeOF Drives
      • Less complex
      • Cheaper
      • More roles
  • 14. IOR results with DFS API
      Configuration | BW write (MB/s) | BW read (MB/s)
      2 IO instances and 4 local NVMe drives | 2132 | 2008
      2 IO instances and 4 NVMeOF drives | 2253 | 2178
      4 IO instances and 8 local NVMe drives | 4679 | 3935
      4 IO instances and 8 NVMeOF drives | 4248 | 4268
      • NVMe drives: Intel P4510 2TB, W: 2 GB/s and R: 3.2 GB/s by specs
      • DAOS: kdev (AIO Linux driver) was used for NVMeOF drives, 2 targets per disk, max pool size, service replica = 1, ofi+sockets provider through Intel Omni-Path
      • MPI: np=104 from 3 clients
  • 15. Conclusions. What have we achieved with RSC BasIS Storage Orchestration:
      • DAOS requires specially-designed hardware platforms to deploy
          • You can have DAOS cheap: just buy PMEM and compose DAOS over a fabric
          • Existing servers can share their NVMe drives
      • DAOS deployment is tricky
          • Software orchestration significantly simplifies DAOS deployment
      • DAOS doesn’t fit HPC Cloud
          • Composable Disaggregated approach gives flexible ways to use PMEM nodes
          • DAOS can be dynamically assembled when needed
    RSC Announces DAOS Support in its storage orchestration platform: https://www.hpcwire.com/off-the-wire/rsc-announces-intel-ice-lake-sp-and-daos-support-introduces-tornado-afs-storage/
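
The capacity and lane figures on slide 6 and the IOR bandwidths on slide 14 can be cross-checked with a short script. What follows is a minimal Python sketch and not part of the original deck: the variable and function names (`pmem_tb`, `ior`, `nvmeof_vs_local`) are illustrative assumptions, and the only inputs are numbers transcribed from the slides.

```python
# Figures transcribed from slides 6 and 14. All names here are illustrative
# assumptions; none of them come from RSC BasIS or DAOS.

# Slide 6: capacity and PCIe lane budget of a dedicated DAOS storage server.
pmem_tb = 12 * 0.5              # 12 PMEM modules of 0.5 TB each
nvme_tb = pmem_tb * 16          # the slide sizes NVMe at 16x the PMEM capacity
pmem_ratio = pmem_tb / nvme_tb  # PMEM as a fraction of NVMe capacity
lanes_total = 128 + 48          # the two PCIe lane counts shown on the slide

# Slide 14: IOR bandwidth in MB/s as (write, read),
# keyed by (number of IO instances, drive attachment).
ior = {
    (2, "local"): (2132, 2008),
    (2, "nvmeof"): (2253, 2178),
    (4, "local"): (4679, 3935),
    (4, "nvmeof"): (4248, 4268),
}


def nvmeof_vs_local(instances):
    """Percent change of NVMeOF bandwidth relative to local NVMe: (write, read)."""
    local = ior[(instances, "local")]
    remote = ior[(instances, "nvmeof")]
    return tuple(round(100.0 * (r - l) / l, 1) for l, r in zip(local, remote))


if __name__ == "__main__":
    print(f"PMEM {pmem_tb} TB, NVMe {nvme_tb} TB, "
          f"ratio {pmem_ratio:.2%}, lanes {lanes_total}")
    for n in (2, 4):
        write_pct, read_pct = nvmeof_vs_local(n)
        print(f"{n} IO instances: write {write_pct:+.1f}%, read {read_pct:+.1f}%")
```

Run as a script, this reports a PMEM-to-NVMe ratio of 6.25% and a total of 176 PCIe lanes. For the IOR runs, NVMeOF write/read bandwidth comes out at +5.7%/+8.5% relative to local NVMe with 2 IO instances and -9.2%/+8.5% with 4 IO instances, so on these measurements remote drives stay within about 10% of local ones.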