Ioannis.Charalampidis@cern.ch
Lazaros.Lazaridis@cern.ch
CERN, June 2016
Figure labels: libFabric, ofi:// Transport, nanomsg, ØMQ, ALFA, usNIC, FairMQ
Introduction
For the ALICE O2 upgrade, the simulation and
reconstruction software for the ALICE
experiment uses the ALFA [1] framework.
Among other abstractions, the ALFA/FairROOT
framework provides a message-queue
abstraction library, FairMQ [2], which is a
lightweight wrapper around the ØMQ and NanoMsg
libraries.
Since the project's goal is to implement a new
transport for FairMQ, we decided to extend the
functionality of one of these libraries.
We chose NanoMsg [3] because of its
clean and modular internals.
Figure: RDMA* transfer between two registered memory regions.
Sender: fi_send( endpoint, buffer, len, mr_desc, context );
Receiver: fi_recv( endpoint, buffer, len, mr_desc, context );
Each side has a Tx CQ and an Rx CQ, read with fi_cq_read( &event ); the arrows between the two sides are labelled SEND and ACK.
* libfabric has custom event polling functions
usNIC + Kernel Bypass
One of the powerful features of the usNIC
fabric is that user-space applications can
bypass the Linux kernel when using the
libFabric [4] library.
This relieves the kernel of the IP stack
overhead, reclaiming its CPU time for more
useful operations.
The ofi:// Transport
The project is implemented as a patch to the
NanoMsg sources [5] that introduces the
OpenFabrics Interfaces (OFI) transport.
The transport translates the POSIX-like API of
NanoMsg into an RDMA-like API for libFabric,
transparently to the user. To achieve this, it
uses a dynamic memory registration (MR)
mechanism that tries to reduce the number of
memory registrations performed, while remaining
agnostic of the user's intentions.
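One way to reduce the number of registrations is to cache them and reuse a cached region whenever a new buffer falls inside it. The following is a minimal sketch of that idea only, not the transport's actual mechanism: `mr_cache`, `mr_cache_get`, and the stub `fake_register` are hypothetical names, the round-robin eviction policy is an assumption, and the stub stands in for a real registration call such as libfabric's `fi_mr_reg`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MR_CACHE_SLOTS 4

/* One cached registration: the covered address range and its descriptor. */
struct mr_entry {
    uintptr_t base;
    size_t    len;
    int       desc;   /* hypothetical stand-in for a real MR descriptor */
    int       used;
};

struct mr_cache {
    struct mr_entry slots[MR_CACHE_SLOTS];
    size_t next_victim;    /* round-robin eviction cursor (assumed policy) */
    int    registrations;  /* number of real registrations performed */
};

static void mr_cache_init(struct mr_cache *c)
{
    size_t i;
    for (i = 0; i < MR_CACHE_SLOTS; i++)
        c->slots[i].used = 0;
    c->next_victim = 0;
    c->registrations = 0;
}

/* Stub: a real transport would register the buffer with the fabric here. */
static int fake_register(struct mr_cache *c)
{
    c->registrations++;
    return c->registrations;  /* use the running count as the descriptor */
}

/* Return a descriptor covering [buf, buf + len), registering only on a miss. */
static int mr_cache_get(struct mr_cache *c, const void *buf, size_t len)
{
    uintptr_t start = (uintptr_t)buf;
    struct mr_entry *e;
    size_t i;

    assert(len > 0);

    /* Hit: the requested range lies entirely inside a cached region. */
    for (i = 0; i < MR_CACHE_SLOTS; i++) {
        e = &c->slots[i];
        if (e->used && start >= e->base && len <= e->len &&
            start - e->base <= e->len - len)
            return e->desc;
    }

    /* Miss: overwrite the next victim slot. A real implementation would
     * also deregister the evicted region before reusing the slot. */
    e = &c->slots[c->next_victim];
    c->next_victim = (c->next_victim + 1) % MR_CACHE_SLOTS;
    e->base = start;
    e->len  = len;
    e->desc = fake_register(c);
    e->used = 1;
    return e->desc;
}
```

With this sketch, sending several slices of one large buffer costs a single registration, because every slice after the first is served from the cache.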
1.  Technical Design Report for the Upgrade of the Online–Offline Computing System, The ALICE Collaboration
2.  https://github.com/FairRootGroup/FairRoot/tree/master/fairmq
3.  http://nanomsg.org
4.  http://ofiwg.github.io/libfabric
5.  https://github.com/wavesoft/nanomsg-transport-ofi
6.  https://github.com/wavesoft/robob
7.  https://github.com/ofiwg/libfabric/issues?q=author%3Awavesoft
8.  https://github.com/nanomsg/nanomsg/pull/612
In a similar manner, it uses the high-level
libFabric polling API instead of the FD-based
NanoMsg polling API, making it possible to
support any fabric without modification.
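The polling pattern can be illustrated without a fabric. The sketch below does not use libfabric itself: `mock_cq`, `mock_cq_push`, `mock_cq_read`, and `poll_completion` are hypothetical names, and the bounded spin count is an assumption. It models a completion queue with a non-blocking read that reports "nothing yet" through `-EAGAIN`, so the caller polls the queue directly rather than waiting on a file descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define CQ_DEPTH 8

/* A completion event: which operation finished and how many bytes it moved. */
struct cq_event {
    void  *context;
    size_t len;
};

/* Mock completion queue implemented as a fixed-size ring buffer. */
struct mock_cq {
    struct cq_event ring[CQ_DEPTH];
    size_t head;  /* next slot to read */
    size_t tail;  /* next slot to write */
};

static void mock_cq_init(struct mock_cq *cq)
{
    cq->head = 0;
    cq->tail = 0;
}

/* Producer side (the "hardware"): returns 0, or -ENOSPC if the ring is full. */
static int mock_cq_push(struct mock_cq *cq, void *context, size_t len)
{
    if (cq->tail - cq->head == CQ_DEPTH)
        return -ENOSPC;
    cq->ring[cq->tail % CQ_DEPTH].context = context;
    cq->ring[cq->tail % CQ_DEPTH].len = len;
    cq->tail++;
    return 0;
}

/* Non-blocking read: returns 1 if an event was copied out, -EAGAIN if empty. */
static int mock_cq_read(struct mock_cq *cq, struct cq_event *out)
{
    if (cq->head == cq->tail)
        return -EAGAIN;
    *out = cq->ring[cq->head % CQ_DEPTH];
    cq->head++;
    return 1;
}

/* Poll the queue up to max_spins times; no file descriptor is involved. */
static int poll_completion(struct mock_cq *cq, struct cq_event *out,
                           unsigned max_spins)
{
    unsigned i;

    assert(max_spins > 0);
    for (i = 0; i < max_spins; i++) {
        int rc = mock_cq_read(cq, out);
        if (rc != -EAGAIN)
            return rc;
    }
    return -EAGAIN;  /* still empty after max_spins attempts */
}
```

Because the loop depends only on the queue's read call, the same code shape works for any provider that exposes a completion queue, which is the portability argument made above.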
True Zero-Copy
One of the implementation requirements of the
OFI transport was to ensure that no memcpy
operations take place between the user's
request and the transfer on the wire.
That's reasonable if you consider
that the message sizes vary
from 50Mb to 1Gb.
By-Products
A useful by-product of this project was the
development of robob [6], a fully automated benchmarking
utility for ensuring the quality of the measured values.
Outcomes
We frequently encountered roadblocks while
working on this project, since we were using
new products and open source components.
We had frequent interactions with NanoMsg
and Cisco developers [7], and we contributed
our own modifications [8].
Nonetheless, we managed to create a
prototype with which we demonstrated the
feasibility of the transport and its
performance.
Below we present some preliminary
measurements using the OFI transport
between two Intel Xeon E5-2690 machines
with UCSC-PCIE-C40Q NICs, connected
through a switch with a 40 Gbit copper cable.

Figure: Throughput (GB/s) for Different Message Sizes. Series: OFI [Gbit/s], TCP [Gbit/s], ØMQ [Gbit/s]. Vertical axis ticks: 0 to 35 in steps of 5. Horizontal axis (message size): 8192, 16384, 32768, 65536, 1048576, 2097152, 4194304, 8388608, 16777216, 33554432, 67108864, 134217728.
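For reference, a throughput figure in Gbit/s follows from the message size, the number of messages, and the elapsed time. The helper below is a minimal sketch: `throughput_gbit_per_s` is a hypothetical name, it is not taken from robob or from the transport, and it assumes decimal units (1 Gbit = 10^9 bits).

```c
#include <assert.h>
#include <stdint.h>

/* Convert a transfer (msg_count messages of msg_bytes each, completed in
 * elapsed_s seconds) into gigabits per second, using 1 Gbit = 1e9 bits. */
static double throughput_gbit_per_s(uint64_t msg_bytes, uint64_t msg_count,
                                    double elapsed_s)
{
    assert(elapsed_s > 0.0);
    return ((double)msg_bytes * (double)msg_count * 8.0) / (elapsed_s * 1e9);
}
```

For example, 1000 messages of 1048576 bytes delivered in one second correspond to about 8.39 Gbit/s.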
www.cern.ch/openlab
Poster by Ioannis Charalampidis. Special thanks
to our supervisor, Predrag Buncic, to Artur
Barczyk for his guidance, and to Mohammad
Al-Turany and Peter Hristov.

[9-6-2016] Openlab Poster-v3
