QsNetIII: an Adaptively Routed Network for
           High Performance Computing

               Duncan Roweth, Quadrics Ltd
               Hot Interconnects August 2008




28/8/2008                Quadrics Ltd             1
Quadrics Background


• Develops interconnect products for the HPC market
    – HPC Linux systems
    – AlphaServer SC systems
• Quadrics is owned by the Finmeccanica group
• Quadrics was 12 years old in July




QsNet Networks


• Multi-stage switch network
• Components
    –   Adapter: Elan
    –   Router: Elite
    –   Switches, cables
    –   Firmware, drivers, libraries
    –   Diagnostics, documentation
• HPC specific features
    – Adaptive routing
    – Hardware barrier & broadcast




Communication Model
[Figure: communication model; surviving labels are "Virtual Address" and "Process"]


Quadrics Networks


• Elan1 / Elite1, 1994, Meiko Computing Surface 2
    – Source chooses between pre-defined routes
• Elan3 / Elite3, 2000, first Quadrics product, QsNet
    – First use of packet-by-packet adaptive routing
    – Crosspoint router, x8
• Elan4 / Elite4, 2004, QsNetII
    – Reduced latency, increased bandwidth
    – Increased support for offloading collectives
• Elan5 / Elite5, 2008, QsNetIII
    – General purpose crosspoint router, increased radix, x32
    – Highly programmable adapter




What is Adaptive Routing?


• Switch networks typically provide many
  paths between any two points
• In an adaptively routed network
  routers make packet by packet decisions
  on the route to use based on
    –   Queue occupancy
    –   Channel usage
    –   Error rates and state
    –   Class of traffic




Why is Adaptive Routing Important?


• Most HPC networks are statically routed
    – They use pre-determined paths between nodes
• Static routing can work well
    –   If traffic pattern is known in advance
    –   If traffic pattern is persistent
    –   If traffic pattern is uniform (i.e. application is load balanced)
    –   If there are no errors
• These conditions are not met by real codes on production
  HPC systems (see the LLNL and Sandia results)
• Adaptive routing solves these problems
    – Delivering significantly better aggregate bandwidths and worst
      case latencies on real systems running real codes
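
A toy illustration of the argument above, not a model of any real network. It assumes a set of equivalent paths, a hypothetical static rule that maps each destination to a fixed path (`dest % num_paths`), and an adaptive rule that picks the least-loaded path for each packet. All names are invented for this sketch.

```python
from collections import Counter

def static_route(dest: int, num_paths: int) -> int:
    """Pre-determined path: depends only on the destination (assumed rule)."""
    return dest % num_paths

def max_path_load(dests, num_paths, adaptive):
    """Return the packet count on the busiest path for a batch of packets."""
    load = Counter({p: 0 for p in range(num_paths)})
    for dest in dests:
        if adaptive:
            # Pick the currently least-loaded path (ties go to the lowest index).
            path = min(range(num_paths), key=lambda p: (load[p], p))
        else:
            path = static_route(dest, num_paths)
        load[path] += 1
    return max(load.values())

# A non-uniform pattern: every destination maps to path 0 under the static rule.
dests = [0, 4, 8, 12, 16, 20, 24, 28]
print(max_path_load(dests, 4, adaptive=False))  # 8
print(max_path_load(dests, 4, adaptive=True))   # 2
```

With a uniform pattern such as `range(8)` both rules give the same busiest-path load, which matches the slide's point that static routing works well only when the traffic pattern cooperates.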



Benefits of Adaptive Routing


• Bandwidth achieved
  when 1024 nodes all
  communicate at the
  same time

• Plots show the
  distribution of
  measured bandwidths


       System     Interconnect    Min    Max    Average
       Atlas      InfiniBand       95    762      263
       Thunder    QsNetII         248    403      369
              Data from Lawrence Livermore National Lab, published at the Sonoma OpenFabrics workshop April 2007
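
One way to read the table is to compare the spread of the measured bandwidths rather than the averages. The short sketch below computes two simple ratios from the numbers on the slide; the choice of ratios and the helper name `spread` are assumptions made for illustration.

```python
# Values copied from the table above (units as in the original plot).
results = {
    "Atlas (InfiniBand)": {"min": 95, "max": 762, "avg": 263},
    "Thunder (QsNetII)": {"min": 248, "max": 403, "avg": 369},
}

def spread(stats):
    """Return (min/avg, max/min) as simple measures of bandwidth spread."""
    return stats["min"] / stats["avg"], stats["max"] / stats["min"]

for system, stats in results.items():
    min_vs_avg, max_vs_min = spread(stats)
    print(f"{system}: min/avg = {min_vs_avg:.3f}, max/min = {max_vs_min:.3f}")
```

A min/avg ratio closer to 1 means the slowest node pair is closer to the typical case.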


Benefits of Adaptive Routing


• Classic QsNetII all-to-all bandwidth scaling graph




Ordering Considerations


• Adaptively routed packets can arrive out of order
    – Problems for stream devices, e.g. multipath Ethernet
• Message ordering is required in HPC
    – But within a message we are free to deliver the bulk data in
      arbitrary order: "get it there as fast as possible, then tell me
      that it is done"
• QsNet ordering
    – Packets contain the destination virtual address at which to write
      the data
    – Bulk data transfers can arrive out of order and can be replayed
    – Atomic transactions are sequenced
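
The reason address-carrying packets tolerate reordering and replay can be shown with a minimal sketch. It assumes each packet is a hypothetical `(address, payload)` pair and uses a `bytearray` as a stand-in for the destination's virtual memory; it is not the QsNet packet format.

```python
import random

def deliver(packets, memory_size):
    """Write each packet's payload at the address the packet carries."""
    memory = bytearray(memory_size)
    for addr, payload in packets:
        memory[addr:addr + len(payload)] = payload
    return bytes(memory)

message = bytes(range(64))
chunk = 16
packets = [(off, message[off:off + chunk]) for off in range(0, len(message), chunk)]

in_order = deliver(packets, len(message))

shuffled = packets[:]
random.Random(1).shuffle(shuffled)
# Replay one packet to mimic a retransmission; the repeated write is idempotent.
out_of_order = deliver(shuffled + [shuffled[0]], len(message))

assert in_order == out_of_order == message
```

Because every write lands at a fixed address, the final memory contents do not depend on arrival order. Operations whose result depends on order, such as atomics, need the sequencing mentioned on the slide.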




Adaptive Routing in QsNetIII


• More flexible than QsNetII
    – Operates over arbitrary sets of links
    – More opportunities to use the technique
    – Higher radix switches


• Select a subset of lightly loaded output ports based on:
    – Destination
    – Link state, errors etc
    – Number of pending acks (programmable threshold)
• Programmable algorithm for selecting from this subset:
    – First free, next free, random
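
The two-stage selection described above can be sketched as follows. This is an interpretation under stated assumptions, not the Elite5 logic: the `Port` fields, the "pending acks below threshold" test, and the exact meaning of "first free" and "next free" are all guesses made for illustration.

```python
import random
from dataclasses import dataclass

@dataclass
class Port:
    index: int
    link_up: bool = True
    pending_acks: int = 0

def candidate_ports(ports, allowed, ack_threshold):
    """Stage 1: ports that reach the destination, are up, and are lightly loaded."""
    return [p for p in ports
            if p.index in allowed and p.link_up and p.pending_acks < ack_threshold]

def select_port(candidates, policy, last=-1, rng=random):
    """Stage 2: choose one candidate using a programmable policy."""
    if not candidates:
        return None
    if policy == "first_free":
        return candidates[0]
    if policy == "next_free":
        # First candidate after the port used last time, wrapping around.
        later = [p for p in candidates if p.index > last]
        return (later or candidates)[0]
    if policy == "random":
        return rng.choice(candidates)
    raise ValueError(f"unknown policy: {policy}")

# Port 0 is busy, port 2 has a failed link, so ports 1 and 3 remain.
ports = [Port(0, pending_acks=5), Port(1), Port(2, link_up=False), Port(3)]
cands = candidate_ports(ports, allowed={0, 1, 2, 3}, ack_threshold=4)
print([p.index for p in cands])                      # [1, 3]
print(select_port(cands, "next_free", last=1).index)  # 3
```

Keeping the filter and the choice separate mirrors the slide: the filter encodes destination, link state and load, while the policy is a small interchangeable rule.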




Adaptive Routing: standard case


   – All top switches are equivalent, select one
   – Adaptive routing selects a lightly loaded path




Implementation of Fat Tree Networks


• Connect M×N-way node switches by N×M-way top switches
• In this case M = 16, N = 4
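
A small sketch of the wiring rule, assuming the reading "M node switches with N uplinks each, joined by N top switches with M ports each". The function name and the tuple layout are hypothetical.

```python
def fat_tree_links(m: int, n: int):
    """Wire m node switches (n uplinks each) to n top switches (m ports each).

    Returns (node_switch, uplink, top_switch, top_port) tuples: uplink j of
    node switch i goes to port i of top switch j.
    """
    return [(node, up, up, node) for node in range(m) for up in range(n)]

links = fat_tree_links(16, 4)
print(len(links))  # 64
```

Under this rule every node switch has exactly one link to every top switch, which is what makes the top switches interchangeable for adaptive routing.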




Adaptive Routing in the Top Switch


• If top switch radix ≤ router radix / 2
    – i.e. 16 for Elite5, 2048-way networks
• Router provides multiple top switches
    – Select which to use based on load
• Example:
    – Traffic from A to B via routers 210 and
      300 is blocked by traffic between 300
      and 200.
    – The router providing 300, 301, 302 and
      303 can select a different path




Adaptive Routing on the Final Hop


• Multiple connections to a node
• Switch can select a free path
• Reduces end-point contention

• Simple case is not optimal
• Spreading the connections
    – Improves fault tolerance
    – Reduces network contention
• Routing decision is made higher
  in the network




Adaptive routing in the presence of errors


• In a production system with 1000s of links it is not uncommon for a
  small number to be broken until the next maintenance slot
• Adaptive routing minimises the impact
• Example:
    – Link between routers 10 and 20 is broken
    – Router 10 dynamically selects paths via 21, 22 and 23, spreading
      the load
    – Reverse case: avoid sending to 10 via 20, either by resetting 20's
      links or by updating switches 11, 12 and 13



Small Packet Support


• Aim to get as close to line rate as possible with small packets
• For example:
    – Small put
    – 32 byte packet




• Adapter has multiple packet engines
• Adapters support up to 64 outstanding packets per link
    – Doubles if we use both links
• Switches provide 32 virtual channels per output link
• Prioritisation – buffering on input to the router
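
A back-of-envelope sketch of why the number of outstanding packets matters for small transfers. It takes the 64 outstanding packets and 32 byte packet size from this slide and the 20 Gbit/s per direction link rate from the Elan5 overview, and it ignores headers and acknowledgements, which is a simplifying assumption. The function names are hypothetical.

```python
def bytes_in_flight(outstanding_packets: int, packet_bytes: int) -> int:
    """Data that can be unacknowledged on one link at any moment."""
    return outstanding_packets * packet_bytes

def covered_round_trip_ns(in_flight_bytes: int, link_gbit_per_s: float) -> float:
    """Longest round trip (ns) that this much in-flight data can hide."""
    bytes_per_ns = link_gbit_per_s / 8  # 1 Gbit/s = 0.125 bytes/ns
    return in_flight_bytes / bytes_per_ns

window = bytes_in_flight(64, 32)  # 2048 bytes
print(window, covered_round_trip_ns(window, 20.0))
```

If the round trip for an acknowledgement is longer than the value printed, the link idles while waiting, so more outstanding packets (or a second link) are needed to stay near line rate.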

Barrier & Broadcast Support



• Switches broadcast over
  a range of output links
• Combine Acks / Nacks
• Contiguous in QsNetII
• Sparse in QsNetIII
• Barrier implementation
    – Network conditional
    – Broadcast release
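
The barrier scheme can be sketched in a flat model with no switch tree. This is one reading of "network conditional" followed by "broadcast release", with combined acknowledgements modelled as a logical AND; the function names and the per-node counters are assumptions, not the QsNet interface.

```python
def combine_acks(responses):
    """A switch passes an ACK upstream only if every output link ACKed."""
    return all(responses)

def network_conditional(arrived, epoch):
    """Broadcast the test 'have you reached barrier `epoch`?' and combine replies."""
    return combine_acks(count >= epoch for count in arrived)

def try_barrier(arrived, released, epoch):
    """One polling round: release every node only if the conditional succeeds."""
    if not network_conditional(arrived, epoch):
        return False
    for node in range(len(released)):
        released[node] = epoch  # broadcast release
    return True

arrived = [1, 1, 0, 1]   # node 2 has not reached the barrier yet
released = [0, 0, 0, 0]
print(try_barrier(arrived, released, 1))  # False
arrived[2] = 1
print(try_barrier(arrived, released, 1))  # True
```

A single NACK from any node fails the whole round, so no node is released early.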




Elan5 – Device Overview


• 2× QsNetIII links
    – 20 Gbit/s/direction after protocol
• PCIe, PCIe2 host interface
• Multiple packet engines
• 512KB of high bandwidth on-chip local memory
• SDRAM interface to optional local memory
• Buffer manager, object cache
• Details in ISC Dresden paper

[Block diagram: Elan5 Adapter. Labels: two CX4/QsNetIII links; seven packet
engines, each with 16K inst cache and 9K data buffers; Fabric (x8); Host I/F
with TLB, Cmd Launch, PCIe SERDES, PCIe 16 lanes; Local Memory, 16K x 8 x 8
banks = 1MB ECC RAM; Local Functions with Buffer Manager, Free List, Object
Cache Tags, External cache, SDRAM i/f, Ext i/f; Bridge; DDRII; External
EEPROM; PLL; Clocks]

Elite5 – Device Overview


• 64 × 32 crosspoint router
       – Direct & buffered input from each link
       – 8K of input buffering per link
•     32 virtual channels per link
•     Physical layer DDR XAUI (6.25GHz)
•     Adaptive routing
•     Hardware barrier and broadcast
•     Memory mapped stats & error
      counters accessed out-of-band




QsNetIII Device Overview




                       Elan         Elite
  Type                 Semi custom ASIC (both)
  Manufacturing        Partners LSI / TSMC, G90 process (both)
  Clock                500 MHz      312 MHz
  Package              High performance BGA (both)
  Pins                 672          982
  Power                < 17W        < 18W




QsNetIII Implementation


[Figures: QsNetIII switch logical design; QsNetIII switch implementation]

• Node switch chassis
    – 128 links down to the nodes
    – 128 links up to the top switches
    – Backplane connects 2 sets of cards
• Top switches
    – 256 links down to the node switches
    – Range of system sizes:

             Ports   Radix   Per Chassis
             512      4          64
             1024     8          32
             2048     16         16
             4096     32         8
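
The rows of the table follow from the chassis link counts given on the slide. The sketch below reproduces them; the two formulas are inferred from the numbers rather than stated in the slides, and the names are hypothetical.

```python
NODE_SWITCH_DOWN_LINKS = 128  # links down to the nodes per node switch chassis
TOP_CHASSIS_DOWN_LINKS = 256  # links down to node switches per top switch chassis

def system_size(radix: int):
    """Return (ports, top switches per chassis) for a given top switch radix."""
    ports = NODE_SWITCH_DOWN_LINKS * radix          # one node switch per top-switch port
    per_chassis = TOP_CHASSIS_DOWN_LINKS // radix   # top switches that fit in a chassis
    return ports, per_chassis

for radix in (4, 8, 16, 32):
    ports, per_chassis = system_size(radix)
    print(f"{ports:5d} {radix:3d} {per_chassis:3d}")
```

Doubling the top switch radix doubles the system size and halves the number of top switches that fit in one chassis.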




QsNetIII Network 1024–way


• Fat tree, constructed from 8 × 128-way node switches connected by
  128 × 8-way top switches




QsNetIII Implementation – Cables


•     QSFP connectors throughout
•     Copper cables (e.g. Gore) 1-10m
•     Active copper cables (e.g. Gore), 8-20m
•     Optical cables (e.g. Luxtera), 5-300m
       – PVDF Plenum rated
       – LSZH available as an option
• No longer Quadrics proprietary

• Likely usage:
       – Short copper cables from nodes
       – Optical cables between switches



QsNetIII Fault Tolerance


• All of the QsNetII Features
    –   CRCs on every packet
    –   Automatic retransmission
    –   Redundant routes
    –   Adaptive routing avoids failed links
    –   Redundant, hot-pluggable PSUs and fans


• Plus line rate testing of each link as it comes up
    – Switches generate CRPAT, CJPAT or PRBS packets
    – Links are only added to the route tables when they are (a) up, (b)
      connect to the right place, and (c) can transfer data at full line rate
      without error.
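
The three admission conditions can be written as a single predicate. This is a sketch only: the `LinkTest` record and its fields are hypothetical, and the real switch firmware is not described in the slides.

```python
from dataclasses import dataclass

@dataclass
class LinkTest:
    is_up: bool
    expected_peer: str
    observed_peer: str
    packets_sent: int
    packets_ok: int
    fraction_of_line_rate: float  # 1.0 means full line rate was sustained

def admit_to_route_table(t: LinkTest) -> bool:
    """Admit a link only if it is (a) up, (b) cabled correctly, (c) clean at line rate."""
    return (t.is_up
            and t.observed_peer == t.expected_peer
            and t.packets_sent > 0
            and t.packets_ok == t.packets_sent
            and t.fraction_of_line_rate >= 1.0)

good = LinkTest(True, "top3:p7", "top3:p7", 100000, 100000, 1.0)
miswired = LinkTest(True, "top3:p7", "top2:p7", 100000, 100000, 1.0)
print(admit_to_route_table(good), admit_to_route_table(miswired))  # True False
```

Treating a miswired or marginal link the same as a down link keeps it out of the adaptive routing choices until it is fixed.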



QsNetIII Implementation – HP BladeSystem


• Elan5 mezzanine adapter
    – 2 QsNet links, PCI-E x8 Gen2
    – 128 MB of memory
• Elite5 switch module
    – Full bandwidth
    – 16 links to the blades (via backplane)
    – 16 links to back of the module




Current Status



• Elite5 silicon in Bristol
• Elan5 at TSMC, first parts expected
  in 3-4 weeks
• Switch PCBs, chassis, backplane,
  controllers are working
• First adapter PCBs are ready
    – PCI-Express x16, HP Blade,
      ExpressModule (Sun Blade)
• We are porting the QsNetII software
• Components at SC08 in Austin
• First customer shipment in Q1 of 2009

Future Work


• QsNetIII hardware
    – Low cost 32-way switch
    – 1024-way single chassis switch


• QsNetIII Software
    – General framework for optimised collectives
    – Support for “multiport” networks - “fat” nodes have multiple
      connections to the same rail
    – Ethernet firmware for the network adapter




Conclusions


• Adaptive routing underwrites the scalability of HPC systems
  designed to run a single large application
• Adaptive routing has been a feature of QsNet systems since 2000
• QsNetIII offers significant enhancements over both QsNetII and
  competing products




Thank you for listening




Additional Material




Packet Format


• Packets of up to 4K, made up of a 256 byte packet segment plus
  continuations; 8 byte ACK
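
The segment arithmetic can be sketched as below. It reads "4K" as 4096 bytes and assumes the limit applies to the payload alone; whether headers count towards it is not stated on the slide. The function names are hypothetical.

```python
MAX_PACKET_BYTES = 4096  # "up to 4K", read here as 4096 bytes
SEGMENT_BYTES = 256

def segments_per_packet(payload_bytes: int) -> int:
    """Number of 256 byte segments (first segment plus continuations)."""
    if not 0 < payload_bytes <= MAX_PACKET_BYTES:
        raise ValueError("payload must be 1..4096 bytes")
    return -(-payload_bytes // SEGMENT_BYTES)  # ceiling division

def packetise(message_bytes: int):
    """Split a message into packet payload sizes of at most MAX_PACKET_BYTES."""
    full, rest = divmod(message_bytes, MAX_PACKET_BYTES)
    return [MAX_PACKET_BYTES] * full + ([rest] if rest else [])

print(segments_per_packet(4096))  # 16
print(packetise(10000))           # [4096, 4096, 1808]
```

A 32 byte put therefore occupies a single segment, while a maximum-size packet is one segment followed by 15 continuations.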




Impact of static routing on latency




 Data from Thunderbird cluster, Sandia National Lab
 Big increases in worst case latency with number of nodes




Impact of static routing on latency




 Data from Thunderbird cluster, Sandia National Lab
 Big variation in worst case latency across a large job



Software Model – Firmware & Drivers


• Base firmware in the ROMs
• Firmware modules loadable with the device driver
    – Elan, OpenFabrics, 10 Gigabit Ethernet, …
• Kernel modules
    – elan5, elan, rms
• Device dependent library (libelan5)
• Device independent library (libelan)
• User libraries




Software Model – Elan Libraries


• Point-to-point message passing
• One-sided put/get
• Transparent rail striping
• Optimised collectives
• Locks and atomic ops
• Global memory allocation




QsNetIII Performance Summary


• Similar latencies to QsNetII
    – The 1.3 to 2 microseconds of latency is mostly in the host PCI and
      memory system
• Higher issue rates
    – Improved link utilisation on small transfers
• Higher bandwidths
    – 1.5 to 2.25 GB/sec/link depending on host interface
• Bi-directional host interface
    – 2 x improvement over QsNetII
• Broadcast and barrier in hardware
• Continued development of adaptive routing underwrites scaling
  to high node counts



Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 


QsNetIII Adaptively Routed Network For HPC

  • 1. QsNetIII an Adaptively Routed Network for High Performance Computing Duncan Roweth, Quadrics Ltd Hot Interconnects August 2008 28/8/2008 Quadrics Ltd 1
  • 2. Quadrics Background • Develops interconnect products for the HPC market – HPC Linux systems – AlphaServer SC systems • Quadrics is owned by the Finmeccanica group • Quadrics was 12 years old in July
  • 3. QsNet Networks • Multi-stage switch network • Components – Adapter: Elan – Router: Elite – Switches, cables – Firmware, drivers, libraries – Diagnostics, documentation • HPC specific features – Adaptive routing – Hardware barrier & broadcast
  • 4. Communication Model [Figure; labels: Virtual Address, Process]
  • 5. Quadrics Networks • Elan1 / Elite1, 1994, Meiko Computing Surface 2 – Source chooses between pre-defined routes • Elan3 / Elite3, 2000, first Quadrics product, QsNet – First use of packet-by-packet adaptive routing – Crosspoint router, x8 • Elan4 / Elite4, 2004, QsNetII – Reduced latency, increased bandwidth – Increased support for offloading collectives • Elan5 / Elite5, 2008, QsNetIII – General purpose crosspoint router, increased radix, x32 – Highly programmable adapter
  • 6. What is Adaptive Routing? • Switch networks typically provide many paths between any two points • In an adaptively routed network routers make packet by packet decisions on the route to use based on – Queue occupancy – Channel usage – Error rates and state – Class of traffic
  • 7. Why is Adaptive Routing Important? • Most HPC networks are statically routed – They use pre-determined paths between nodes • Static routing can work well – If traffic pattern is known in advance – If traffic pattern is persistent – If traffic pattern is uniform (i.e. application is load balanced) – If there are no errors • These conditions are not met by real codes on production HPC systems (see LLNL and Sandia results) • Adaptive routing solves these problems – Delivering significantly better aggregate bandwidths and worst case latencies on real systems running real codes
  • 8. Benefits of Adaptive Routing • Bandwidth achieved when 1024 nodes all communicate at the same time • Plots show the distribution of measured bandwidths
      System    Interconnect   Min   Max   Average
      Atlas     InfiniBand      95   762   263
      Thunder   QsNetII        248   403   369
      Data from Lawrence Livermore National Lab, published at the Sonoma OpenFabrics workshop April 2007
  • 9. Benefits of Adaptive Routing • Classic QsNetII all-to-all bandwidth scaling graph
  • 10. Ordering Considerations • Adaptively routed packets can arrive out of order – Problems for stream devices, e.g. multipath Ethernet • Message ordering is required in HPC – But within a message we are free to deliver the bulk data in arbitrary order: "Get it there as fast as possible then tell me that it is done" • QsNet ordering – Packets contain the destination virtual address at which to write the data – Bulk data transfers can arrive out of order and can be replayed – Atomic transactions are sequenced
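The ordering rule on slide 10 can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `deliver`, the use of byte offsets in place of destination virtual addresses, and the shuffle used to model reordering are all hypothetical and are not taken from the QsNet software.

```python
import random


def deliver(message: bytes, packet_size: int, seed: int = 0) -> bytes:
    """Split a message into addressed packets, deliver them shuffled, rebuild it."""
    # Each packet carries the destination offset at which its payload is written
    # (a stand-in for the destination virtual address mentioned on the slide).
    packets = [
        (offset, message[offset:offset + packet_size])
        for offset in range(0, len(message), packet_size)
    ]
    if packets:
        # A replayed packet is harmless: same address, same data.
        packets.append(packets[0])
    random.Random(seed).shuffle(packets)  # model reordering by adaptive routing

    destination = bytearray(len(message))
    for offset, payload in packets:
        destination[offset:offset + len(payload)] = payload
    # Only once every packet has landed would completion be signalled.
    return bytes(destination)
```

Because every write is addressed, arrival order does not affect the final buffer, which is why only the completion notification needs to be ordered.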
  • 11. Adaptive Routing in QsNetIII • More flexible than QsNetII – Operates over arbitrary sets of links – More opportunities to use the technique – Higher radix switches • Select a subset of lightly loaded output ports based on: – Destination – Link state, errors etc. – Number of pending acks (programmable threshold) • Programmable algorithm for selecting from this subset: – First free, next free, random
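The two-step selection on slide 11 (filter to a lightly loaded subset, then pick one port) might be sketched as follows. The `Port` fields, the function names, and the exact meaning given to "first free" and "next free" (lowest-numbered candidate and round-robin after the last port used) are assumptions made for illustration; the slides do not define them.

```python
import random
from dataclasses import dataclass


@dataclass
class Port:
    """Hypothetical per-output-port state; field names are illustrative."""
    index: int
    link_up: bool = True
    pending_acks: int = 0


def candidate_ports(ports, reachable, ack_threshold):
    """Ports that reach the destination, are healthy, and are lightly loaded."""
    return [
        p for p in ports
        if p.index in reachable and p.link_up and p.pending_acks < ack_threshold
    ]


def select_port(candidates, policy, last_used=-1, rng=random):
    """Pick one port from the candidate subset using the named policy."""
    if not candidates:
        return None  # nothing lightly loaded: caller must wait or fall back
    if policy == "first_free":
        return candidates[0]
    if policy == "next_free":
        # Round-robin: first candidate after the last port used, wrapping around.
        for p in candidates:
            if p.index > last_used:
                return p
        return candidates[0]
    if policy == "random":
        return rng.choice(candidates)
    raise ValueError(f"unknown policy: {policy}")
```

Keeping the filter separate from the policy mirrors the slide's split between the programmable threshold and the programmable selection algorithm.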
  • 12. Adaptive Routing: standard case – All top switches are equivalent, select one – Adaptive routing selects a lightly loaded path
  • 13. Implementation of Fat Tree Networks • Connect M × N-way node switches by N × M-way top switches • In this case M = 16, N = 4
  • 14. Adaptive Routing in the Top Switch • If top switch radix ≤ router radix / 2 – i.e. 16 for Elite5, 2048-way networks • Router provides multiple top switches – Select which to use based on load • Example: – Traffic from A to B via routers 210 and 300 is blocked by traffic between 300 and 200. – The router providing 300, 301, 302 and 303 can select a different path
  • 15. Adaptive Routing on the Final Hop • Multiple connections to a node • Switch can select a free path • Reduces end-point contention • Simple case is not optimal • Spreading the connections – Improves fault tolerance – Reduces network contention • Routing decision is made higher in the network
  • 16. Adaptive routing in the presence of errors • In a production system with 1000s of links it is not uncommon for a small number to be broken – until the next maintenance slot • Adaptive routing minimises the impact • Example: – Link between routers 10 and 20 is broken – Router 10 dynamically selects paths via 21, 22, 23 spreading the load. – Reverse case, avoid sending to 10 via 20. Reset 20’s links or update switches 11, 12, 13.
  • 17. Small Packet Support • Aim to get as close to line rate as possible with small packets • For example: – Small put – 32 byte packet • Adapter has multiple packet engines • Adapters support up to 64 outstanding packets per link – Doubles if we use both links • Switches provide 32 virtual channels per output link • Prioritisation – buffering on input to the router
  • 18. Barrier & Broadcast Support • Switches broadcast over a range of output links • Combine Acks / Nacks • Contiguous in QsNetII • Sparse in QsNetIII • Barrier implementation – Network conditional – Broadcast release
  • 19. Elan5 – Device Overview • 2× QsNetIII links – 20Gbit/s/direction after protocol • PCIe, PCIe2 host interface • Multiple packet engines • 512KB of high bandwidth on chip local memory • SDRAM interface to optional local memory • Buffer manager, object cache • Details in ISC Dresden Paper
      [Block diagram of the Elan5 adapter; labels: two CX4/QsNetIII links, seven packet engines (each with 16K inst cache and 9K data buffers), Fabric, x8, Bridge, Host I/F, Local Memory, Local Functions, Object Cache Tags, TLB, Buffer Manager, External cache, Cmd Launch, SDRAM i/f, Ext i/f, Free List, PCIe, 16K x 8 x 8 banks = 1MB ECC RAM, PLL, SERDES, External EEPROM, Clocks, PCIe 16 Lanes, DDRII]
  • 20. Elite5 – Device Overview • 64 × 32 crosspoint router – Direct & buffered input from each link – 8K of input buffering per link • 32 virtual channels per link • Physical layer DDR XAUI (6.25GHz) • Adaptive routing • Hardware barrier and broadcast • Memory mapped stats & error counters accessed out-of-band
  • 21. QsNetIII Device Overview • Semi custom ASIC • Manufacturing partners LSI / TSMC • G90 process
                                       Elan      Elite
      Clock                            500 MHz   312 MHz
      High performance BGA package     672 pin   982 pin
      Power                            < 17W     < 18W
  • 22. QsNetIII Implementation • Node switch chassis – 128 links down to the nodes – 128 links up to the top switches – Backplane connects 2 sets of cards • Top switches – 256 links down to the node switches – Range of system sizes:
      Ports   Radix   Per Chassis
      512     4       64
      1024    8       32
      2048    16      16
      4096    32      8
      [Figures: QsNetIII switch logical design; QsNetIII switch implementation]
  • 23. QsNetIII Network 1024-way • Fat tree, constructed from 8 × 128-way node switches connected by 128 × 8-way top switches
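The fat tree arithmetic on slides 13, 22 and 23 can be checked with a short calculation. This is a sketch: the function names are hypothetical, and it assumes that "M × N-way node switches" means M node switches with N links down and N links up each, which is the reading that agrees with the 1024-way example and with the system-size table.

```python
def fat_tree(node_switches, node_switch_ways):
    """Two-stage fat tree: M node switches of N ways, N top switches of M ways."""
    m, n = node_switches, node_switch_ways
    return {
        "nodes": m * n,             # each node switch serves N nodes
        "top_switches": n,          # one per up-link of a node switch
        "top_switch_radix": m,      # each top switch reaches every node switch
        "inter_switch_links": m * n,
    }


def top_switch_chassis(radix, links_per_chassis=256, node_switch_ways=128):
    """Size a system built from top switches of the given radix."""
    return {
        # 'radix' node switches, each 128-way, hang off every top switch.
        "ports": node_switch_ways * radix,
        # A chassis with 256 down-links holds this many top switches.
        "top_switches_per_chassis": links_per_chassis // radix,
    }
```

For example, `fat_tree(8, 128)` gives 1024 nodes and 128 top switches of radix 8, and `top_switch_chassis(8)` gives 1024 ports with 32 top switches per chassis, matching the second row of the table.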
  • 24. QsNetIII Implementation – Cables • QSFP connectors throughout • Copper cables (e.g. Gore) 1-10m • Active copper cables (e.g. Gore), 8-20m • Optical cables (e.g. Luxtera), 5-300m – PVDF Plenum rated – LSZH available as an option • No longer Quadrics proprietary • Likely usage: – Short copper cables from nodes – Optical cables between switches
  • 25. QsNetIII Fault Tolerance • All of the QsNetII Features – CRCs on every packet – Automatic retransmission – Redundant routes – Adaptive routing avoids failed links – Redundant, hot pluggable PSUs and fans + Line rate testing of each link as it comes up – Switches generate CRPAT, CJPAT or PRBS packets – Links are only added to the route tables when they are (a) up, (b) connect to the right place, and (c) can transfer data at full line rate without error.
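The three admission conditions on slide 25 amount to a simple predicate. The sketch below is illustrative only: the `LinkTest` record and its field names are hypothetical, and counting errored test packets is just one assumed way of expressing "full line rate without error".

```python
from dataclasses import dataclass


@dataclass
class LinkTest:
    """Hypothetical result of bringing up one link; field names are illustrative."""
    up: bool
    expected_peer: str
    observed_peer: str
    test_packets_sent: int
    test_packets_in_error: int


def admit_to_route_table(link: LinkTest) -> bool:
    """Apply the three conditions: up, correctly cabled, error-free at line rate."""
    return (
        link.up
        and link.observed_peer == link.expected_peer
        and link.test_packets_sent > 0      # the line rate test actually ran
        and link.test_packets_in_error == 0
    )
```

A link that fails any one check simply stays out of the route tables, so adaptive routing never considers it.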
  • 26. QsNetIII Implementation – HP BladeSystem • Elan5 mezzanine adapter – 2 QsNet links, PCI-E x8 Gen2 – 128 MB of memory • Elite5 switch module – Full bandwidth – 16 links to the blades (via backplane) – 16 links to back of the module
  • 27. Current Status • Elite5 silicon in Bristol • Elan5 at TSMC, first parts expected in 3-4 weeks • Switch PCBs, chassis, backplane, controllers are working • First adapter PCBs are ready – PCI-Express x16, HP Blade, ExpressModule (Sun Blade) • We are porting the QsNetII software • Components at SC08 in Austin • First customer shipment in Q1 of 2009
  • 28. Future Work • QsNetIII hardware – Low cost 32-way switch – 1024-way single chassis switch • QsNetIII Software – General framework for optimised collectives – Support for “multiport” networks: “fat” nodes have multiple connections to the same rail – Ethernet firmware for the network adapter
  • 29. Conclusions • Adaptive routing underwrites the scalability of HPC systems designed to run a single large application • Adaptive routing has been a feature of QsNet systems since 2000 • QsNetIII offers significant enhancements over both QsNetII and competing products
  • 30. Thank you for listening
  • 31. Additional Material
  • 32. Packet Format • Packet size of up to 4K made up of 256 byte packet segment and continuations, 8 byte ACK
  • 33. Impact of static routing on latency • Data from Thunderbird cluster, Sandia National Lab • Big increases in worst case latency with number of nodes
  • 34. Impact of static routing on latency • Data from Thunderbird cluster, Sandia National Lab • Big variation in worst case latency across a large job
  • 35. Software Model – Firmware & Drivers • Base firmware in the ROMs • Firmware modules loadable with the device driver – Elan, OpenFabrics, 10GE Ethernet, … • Kernel modules – elan5, elan, rms • Device dependent library (libelan5) • Device independent library (libelan) • User libraries
  • 36. Software Model – Elan Libraries • Point-to-point message passing • One-sided put/get • Optimised collectives • Locks and atomic ops • Global memory allocation • Transparent rail striping
  • 37. QsNetIII Performance Summary • Similar latencies to QsNetII – The 1.3 to 2 microsecs of latency is mostly in the host PCI and memory system • Higher issue rates – Improved link utilisation on small transfers • Higher bandwidths – 1.5 to 2.25 GB/sec/link depending on host interface • Bi-directional host interface – 2× improvement over QsNetII • Broadcast and barrier in hardware • Continued development of adaptive routing underwrites scaling to high node counts