SlideShare ist ein Scribd-Unternehmen logo
1 von 8
Downloaden Sie, um offline zu lesen
HPE’S DAOS SOLUTION PLANS
Lance Evans
Chief Storage Architect
November 19, 2020
2
RESEARCH PHASE (WRAPPED UP C2020Q3)
Prepare for emerging storage hardware, with HPC-capable data path software
Evaluate DAOS architecture, code quality, maturity, development team, processes
Consider related IP that competes/complements/contributes
Recommend a path to productization for next-gen HPC data infrastructure
Outcome: Proceed with solution prototyping and technology transfer to Storage R&D
• Middleware
• Adapts core services to specific use cases
• No common baggage that all apps must endure
• All user address space, all open-source
• Traditional HPC apps
• Large shared file with no interference
• File-per-process on shared devices at speed
• Simulation checkpoints – burst buffer
• Non-traditional HPC apps
• Machine Learning training and inference
• Scaled high-level languages
• Multi-tenant workloads
• Potential apps
• Graph analytics
• Database
• Streams processing
• Computational storage
3
DAOS SOLUTION USE CASES
DAOS Storage Engine
NVMe SSD HW
Bulk Block Data
Persistent Memory HW
Metadata & Data <4KiB
DAOS Storage Engine
NVMe SSD HW
Bulk Block Data
Persistent Memory HW
Metadata & Data <4KiB
DAOS Storage Engine
NVMe SSD HW
Bulk Block Data
Persistent Memory HW
Metadata & Data <4KiB
Horizontally Scalable DAOS Storage Engine
NVMe SSD HW
Bulk Block Data 4KiB+
Persistent Memory HW
Metadata & Data <4KiB
AI/Analytics/Database/Simulation Application
DAOS Client Library With Sharding & Erasure Coding
POSIX MPI HDF5 Python Spark Etc
AI/Analytics/Database/Simulation Application
DAOS Client Library With Sharding & Erasure Coding
POSIX MPI HDF5 Python Spark Etc
AI/Analytics/Database/Simulation Application
DAOS Client Library With Sharding & Erasure Coding
POSIX MPI HDF5 Python Spark Etc
App: AI/Analytics/Database/Simulation
DAOS Client Library With Sharding & Erasure Coding
POSIX MPIIO HDF5 Python Spark Etc
Min 6%, max 100%
One To Thousands
One To Tens Of Thousands
MySQL?
TensorFlow?
IB, Slingshot
(libfabric-based)
SW Latency ~20us
4
REFERENCE IMPLEMENTATION GOALS
Harden DAOS data path software, targeting version 2.0
Bundle with released HPE hardware and customized Cray management software
Enable initial customer deployments with repeatability and low risk
Prepare HPE for productization path within the Cray ClusterStor product line
5
REFERENCE IMPLEMENTATION COMPONENTS
HPE PROLIANT DAOS SERVERS
INTEL
ICE LAKE,
DCPMM,
NVME
HPE PROLIANT MANAGEMENT SERVERS
HPE ARUBA MANAGMEMENT SWITCHES
MELLANOX 200GB INFINIBAND
HPE SLINGSHOT 200GB ETHERNET
• Single-Rack Solution with Maximums:
• 16x Servers; 128TB DCPMM, 2PB flash
• 4x 200Gb Switches (Mellanox or HPE Slingshot)
• ~1,400GBps/700GBps est raw read/write throughput
• ~100M/50M est raw read/write OPS
• Unbundled Repeatable Solution Delivery Method
• Qualified hardware and software BOM
• Installation / configuration framework and scripting
• Reference doc set for field or factory integration
• Customer system administration skills needed
• Availability
• Seeking early adopters for package trials C2021Q3
• Individual elements sold and supported individually
• DAOS, solution scripts, docs via engineering collaboration
• Full HPE DAOS product pacing is TBD
6
REFERENCE IMPLEMENTATION CHARACTERISTICS
4x 200Gb Switches (optional):
• HPE Slingshot (80 uplinks max)
• Mellanox QM8700 (72 uplinks max)
16x HPE DAOS Servers:
• 2-Socket Ice Lake
• Gen4 NVMe SSD up to 128TB
• Optane Memory 8TiB
3x HPE Proliant Management Servers
2x 1/10GbE Aruba Mgmt Switches
Max
41RU
• Our results so far lead us to believe DAOS is a viable data path for future HPC Storage designs.
• HPE’s HPC division (formerly Cray) will participate with Intel in delivering the Aurora supercomputer to
Argonne National Labs, one of the largest computers in the world, and containing over 200PB of DAOS.
• HPE’s HPC CTO lab will host a small number of potential customers who want to run POCs in our
testbed.
• We will collaborate with early adopters, integrating working DAOS-based storage solutions in their data
centers, collecting product requirements and feedback.
• Assuming this trial phase is successful, we intend to proceed toward full productization of DAOS in the
Sapphire Rapids timeframe.
• Combined with our scalable HPC system management, HPE server designs, and Slingshot networking,
HPE will continue to offer the most scalable and performant storage for the world’s largest computers.
7
WHAT’S NEXT?
CONTACT:
lance.evans@hpe.com

Weitere ähnliche Inhalte

Was ist angesagt?

Uncovering an Apache Spark 2 Benchmark - Configuration, Tuning and Test Results
Uncovering an Apache Spark 2 Benchmark - Configuration, Tuning and Test ResultsUncovering an Apache Spark 2 Benchmark - Configuration, Tuning and Test Results
Uncovering an Apache Spark 2 Benchmark - Configuration, Tuning and Test Results
DataWorks Summit
 

Was ist angesagt? (20)

Practice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China MobilePractice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China Mobile
 
Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub...
Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub...Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub...
Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub...
 
Graphene – Microsoft SCOPE on Tez
Graphene – Microsoft SCOPE on Tez Graphene – Microsoft SCOPE on Tez
Graphene – Microsoft SCOPE on Tez
 
Enterprise Grade Streaming under 2ms on Hadoop
Enterprise Grade Streaming under 2ms on HadoopEnterprise Grade Streaming under 2ms on Hadoop
Enterprise Grade Streaming under 2ms on Hadoop
 
Big data processing meets non-volatile memory: opportunities and challenges
Big data processing meets non-volatile memory: opportunities and challenges Big data processing meets non-volatile memory: opportunities and challenges
Big data processing meets non-volatile memory: opportunities and challenges
 
Uncovering an Apache Spark 2 Benchmark - Configuration, Tuning and Test Results
Uncovering an Apache Spark 2 Benchmark - Configuration, Tuning and Test ResultsUncovering an Apache Spark 2 Benchmark - Configuration, Tuning and Test Results
Uncovering an Apache Spark 2 Benchmark - Configuration, Tuning and Test Results
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Solving Real Problems with Apache Spark: Archiving, E-Discovery, and Supervis...
Solving Real Problems with Apache Spark: Archiving, E-Discovery, and Supervis...Solving Real Problems with Apache Spark: Archiving, E-Discovery, and Supervis...
Solving Real Problems with Apache Spark: Archiving, E-Discovery, and Supervis...
 
LLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in HiveLLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in Hive
 
Tame that Beast
Tame that BeastTame that Beast
Tame that Beast
 
Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...
 
HBaseConAsia2018 Track1-5: Improving HBase reliability at PInterest with geo ...
HBaseConAsia2018 Track1-5: Improving HBase reliability at PInterest with geo ...HBaseConAsia2018 Track1-5: Improving HBase reliability at PInterest with geo ...
HBaseConAsia2018 Track1-5: Improving HBase reliability at PInterest with geo ...
 
Building Big Data Applications using Spark, Hive, HBase and Kafka
Building Big Data Applications using Spark, Hive, HBase and KafkaBuilding Big Data Applications using Spark, Hive, HBase and Kafka
Building Big Data Applications using Spark, Hive, HBase and Kafka
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
LLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in HiveLLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in Hive
 
Accelerating Big Data Insights
Accelerating Big Data InsightsAccelerating Big Data Insights
Accelerating Big Data Insights
 
Securing data in hybrid environments using Apache Ranger
Securing data in hybrid environments using Apache RangerSecuring data in hybrid environments using Apache Ranger
Securing data in hybrid environments using Apache Ranger
 
Real time fraud detection at 1+M scale on hadoop stack
Real time fraud detection at 1+M scale on hadoop stackReal time fraud detection at 1+M scale on hadoop stack
Real time fraud detection at 1+M scale on hadoop stack
 
Scaling HDFS at Xiaomi
Scaling HDFS at XiaomiScaling HDFS at Xiaomi
Scaling HDFS at Xiaomi
 
Architecting a Fraud Detection Application with Hadoop
Architecting a Fraud Detection Application with HadoopArchitecting a Fraud Detection Application with Hadoop
Architecting a Fraud Detection Application with Hadoop
 

Ähnlich wie DUG'20: 13 - HPE’s DAOS Solution Plans

Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
MLconf
 
HDFS presented by VIJAY
HDFS presented by VIJAYHDFS presented by VIJAY
HDFS presented by VIJAY
thevijayps
 
Red hat storage el almacenamiento disruptivo
Red hat storage el almacenamiento disruptivoRed hat storage el almacenamiento disruptivo
Red hat storage el almacenamiento disruptivo
Nextel S.A.
 

Ähnlich wie DUG'20: 13 - HPE’s DAOS Solution Plans (20)

Gestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMF
Gestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMFGestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMF
Gestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMF
 
Hp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar Slides
 
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
 
Hortonworks.bdb
Hortonworks.bdbHortonworks.bdb
Hortonworks.bdb
 
Accelerate Big Data Processing with High-Performance Computing Technologies
Accelerate Big Data Processing with High-Performance Computing TechnologiesAccelerate Big Data Processing with High-Performance Computing Technologies
Accelerate Big Data Processing with High-Performance Computing Technologies
 
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
 
Key trends in Big Data and new reference architecture from Hewlett Packard En...
Key trends in Big Data and new reference architecture from Hewlett Packard En...Key trends in Big Data and new reference architecture from Hewlett Packard En...
Key trends in Big Data and new reference architecture from Hewlett Packard En...
 
HDFS presented by VIJAY
HDFS presented by VIJAYHDFS presented by VIJAY
HDFS presented by VIJAY
 
Red hat storage el almacenamiento disruptivo
Red hat storage el almacenamiento disruptivoRed hat storage el almacenamiento disruptivo
Red hat storage el almacenamiento disruptivo
 
The IBM Data Engine for NoSQL on IBM Power Systems™
The IBM Data Engine for NoSQL on IBM Power Systems™The IBM Data Engine for NoSQL on IBM Power Systems™
The IBM Data Engine for NoSQL on IBM Power Systems™
 
New Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference ArchitecturesNew Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference Architectures
 
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
 
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
 
Eric Baldeschwieler Keynote from Storage Developers Conference
Eric Baldeschwieler Keynote from Storage Developers ConferenceEric Baldeschwieler Keynote from Storage Developers Conference
Eric Baldeschwieler Keynote from Storage Developers Conference
 
TechTalkThai webinar SAP HANA
TechTalkThai webinar SAP HANATechTalkThai webinar SAP HANA
TechTalkThai webinar SAP HANA
 
Hadoop and Distributed Computing
Hadoop and Distributed ComputingHadoop and Distributed Computing
Hadoop and Distributed Computing
 
Trafodion overview
Trafodion overviewTrafodion overview
Trafodion overview
 
Apache hadoop technology : Beginners
Apache hadoop technology : BeginnersApache hadoop technology : Beginners
Apache hadoop technology : Beginners
 
Apache hadoop technology : Beginners
Apache hadoop technology : BeginnersApache hadoop technology : Beginners
Apache hadoop technology : Beginners
 
Apache hadoop technology : Beginners
Apache hadoop technology : BeginnersApache hadoop technology : Beginners
Apache hadoop technology : Beginners
 

Mehr von Andrey Kudryavtsev

Mehr von Andrey Kudryavtsev (13)

DUG'20: 12 - DAOS in Lenovo’s HPC Innovation Center
DUG'20: 12 - DAOS in Lenovo’s HPC Innovation CenterDUG'20: 12 - DAOS in Lenovo’s HPC Innovation Center
DUG'20: 12 - DAOS in Lenovo’s HPC Innovation Center
 
DUG'20: 11 - Platform Performance Evolution from bring-up to reaching link sa...
DUG'20: 11 - Platform Performance Evolution from bring-up to reaching link sa...DUG'20: 11 - Platform Performance Evolution from bring-up to reaching link sa...
DUG'20: 11 - Platform Performance Evolution from bring-up to reaching link sa...
 
DUG'20: 10 - Storage Orchestration for Composable Storage Architectures
DUG'20: 10 - Storage Orchestration for Composable Storage ArchitecturesDUG'20: 10 - Storage Orchestration for Composable Storage Architectures
DUG'20: 10 - Storage Orchestration for Composable Storage Architectures
 
DUG'20: 09 - DAOS Middleware Update
DUG'20: 09 - DAOS Middleware UpdateDUG'20: 09 - DAOS Middleware Update
DUG'20: 09 - DAOS Middleware Update
 
DUG'20: 08 - DAOS-SEGY Mapping
DUG'20: 08 - DAOS-SEGY MappingDUG'20: 08 - DAOS-SEGY Mapping
DUG'20: 08 - DAOS-SEGY Mapping
 
DUG'20: 07 - Storing High-Energy Physics data in DAOS
DUG'20: 07 - Storing High-Energy Physics data in DAOSDUG'20: 07 - Storing High-Energy Physics data in DAOS
DUG'20: 07 - Storing High-Energy Physics data in DAOS
 
DUG'20: 06 - DAOS Adventures at CERN Openlab
DUG'20: 06 - DAOS Adventures at CERN OpenlabDUG'20: 06 - DAOS Adventures at CERN Openlab
DUG'20: 06 - DAOS Adventures at CERN Openlab
 
DUG'20: 05 - Very Early Experiences with a 0.5 PByte DAOS Testbed
DUG'20: 05 - Very Early Experiences with a 0.5 PByte DAOS TestbedDUG'20: 05 - Very Early Experiences with a 0.5 PByte DAOS Testbed
DUG'20: 05 - Very Early Experiences with a 0.5 PByte DAOS Testbed
 
DUG'20: 04 - DAOS Feature Update
DUG'20: 04 - DAOS Feature UpdateDUG'20: 04 - DAOS Feature Update
DUG'20: 04 - DAOS Feature Update
 
DUG'20: 03 - Online compression with QAT in DAOS
DUG'20: 03 - Online compression with QAT in DAOSDUG'20: 03 - Online compression with QAT in DAOS
DUG'20: 03 - Online compression with QAT in DAOS
 
DUG'20: 02 - Accelerating apache spark with DAOS on Aurora
DUG'20: 02 - Accelerating apache spark with DAOS on AuroraDUG'20: 02 - Accelerating apache spark with DAOS on Aurora
DUG'20: 02 - Accelerating apache spark with DAOS on Aurora
 
DUG'20: 01 - Welcome & DAOS Update
DUG'20: 01 - Welcome & DAOS UpdateDUG'20: 01 - Welcome & DAOS Update
DUG'20: 01 - Welcome & DAOS Update
 
DAOS Middleware overview
DAOS Middleware overviewDAOS Middleware overview
DAOS Middleware overview
 

Kürzlich hochgeladen

Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Kürzlich hochgeladen (20)

Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 

DUG'20: 13 - HPE’s DAOS Solution Plans

  • 1. HPE’S DAOS SOLUTION PLANS Lance Evans Chief Storage Architect November 19, 2020
  • 2. 2 RESEARCH PHASE (WRAPPED UP C2020Q3) Prepare for emerging storage hardware, with HPC-capable data path software Evaluate DAOS architecture, code quality, maturity, development team, processes Consider related IP that competes/complements/contributes Recommend a path to productization for next-gen HPC data infrastructure Outcome: Proceed with solution prototyping and technology transfer to Storage R&D
  • 3. • Middleware • Adapts core services to specific use cases • No common baggage that all apps must endure • All user address space, all open-source • Traditional HPC apps • Large shared file with no interference • File-per-process on shared devices at speed • Simulation checkpoints – burst buffer • Non-traditional HPC apps • Machine Learning training and inference • Scaled high-level languages • Multi-tenant workloads • Potential apps • Graph analytics • Database • Streams processing • Computational storage 3 DAOS SOLUTION USE CASES DAOS Storage Engine NVMe SSD HW Bulk Block Data Persistent Memory HW Metadata & Data <4KiB DAOS Storage Engine NVMe SSD HW Bulk Block Data Persistent Memory HW Metadata & Data <4KiB DAOS Storage Engine NVMe SSD HW Bulk Block Data Persistent Memory HW Metadata & Data <4KiB Horizontally Scalable DAOS Storage Engine NVMe SSD HW Bulk Block Data 4KiB+ Persistent Memory HW Metadata & Data <4KiB AI/Analytics/Database/Simulation Application DAOS Client Library With Sharding & Erasure Coding POSIX MPI HDF5 Python Spark Etc AI/Analytics/Database/Simulation Application DAOS Client Library With Sharding & Erasure Coding POSIX MPI HDF5 Python Spark Etc AI/Analytics/Database/Simulation Application DAOS Client Library With Sharding & Erasure Coding POSIX MPI HDF5 Python Spark Etc App: AI/Analytics/Database/Simulation DAOS Client Library With Sharding & Erasure Coding POSIX MPIIO HDF5 Python Spark Etc Min 6%, max 100% One To Thousands One To Tens Of Thousands MySQL? TensorFlow? IB, Slingshot (libfabric-based) SW Latency ~20us
  • 4. 4 REFERENCE IMPLEMENTATION GOALS Harden DAOS data path software, targeting version 2.0 Bundle with released HPE hardware and customized Cray management software Enable initial customer deployments with repeatability and low risk Prepare HPE for productization path within the Cray ClusterStor product line
  • 5. 5 REFERENCE IMPLEMENTATION COMPONENTS HPE PROLIANT DAOS SERVERS INTEL ICE LAKE, DCPMM, NVME HPE PROLIANT MANAGEMENT SERVERS HPE ARUBA MANAGMEMENT SWITCHES MELLANOX 200GB INFINIBAND HPE SLINGSHOT 200GB ETHERNET
  • 6. • Single-Rack Solution with Maximums: • 16x Servers; 128TB DCPMM, 2PB flash • 4x 200Gb Switches (Mellanox or HPE Slingshot) • ~1,400GBps/700GBps est raw read/write throughput • ~100M/50M est raw read/write OPS • Unbundled Repeatable Solution Delivery Method • Qualified hardware and software BOM • Installation / configuration framework and scripting • Reference doc set for field or factory integration • Customer system administration skills needed • Availability • Seeking early adopters for package trials C2021Q3 • Individual elements sold and supported individually • DAOS, solution scripts, docs via engineering collaboration • Full HPE DAOS product pacing is TBD 6 REFERENCE IMPLEMENTATION CHARACTERISTICS 4x 200Gb Switches (optional): • HPE Slingshot (80 uplinks max) • Mellanox QM8700 (72 uplinks max) 16x HPE DAOS Servers: • 2-Socket Ice Lake • Gen4 NVMe SSD up to 128TB • Optane Memory 8TiB 3x HPE Proliant Management Servers 2x 1/10GbE Aruba Mgmt Switches Max 41RU
  • 7. • Our results so far lead us to believe DAOS is a viable data path for future HPC Storage designs. • HPE’s HPC division (formerly Cray) will participate with Intel in delivering the Aurora supercomputer to Argonne National Labs, one of the largest computers in the world, and containing over 200PB of DAOS. • HPE’s HPC CTO lab will host a small number of potential customers who want to run POCs in our testbed. • We will collaborate with early adopters, integrating working DAOS-based storage solutions in their data centers, collecting product requirements and feedback. • Assuming this trial phase is successful, we intend to proceed toward full productization of DAOS in the Sapphire Rapids timeframe. • Combined with our scalable HPC system management, HPE server designs, and Slingshot networking, HPE will continue to offer the most scalable and performant storage for the world’s largest computers. 7 WHAT’S NEXT?