Project overview, use cases, specifications,
software development and experimental activities
RINA Workshop, Dublin, January 28th –29th 2014

Investigating RINA as an Alternative to TCP/IP
Agenda
• Project overview
• Use cases
– Basic scenarios (Phases 1 and 2)
– Advanced scenarios (Phases 2 and 3)

• Specifications
– Shim DIF over 802.1Q
– PDU Forwarding Table Generator
– Y2 plans

• Software development
  – High level software architecture
  – User-space
  – Kernel-space
  – Wrap-up

• Experimental activities
  – Intro, goals, Y1 experimentation use case
  – Testbed and results at i2CAT OFELIA island
  – Testbed and results at iMinds OFELIA island
  – Conclusions
2
Project at a glance
• What? Main goals
  – To advance the state of the art of RINA towards an architecture reference
    model and specifications that are closer to enabling implementations
    deployable in production scenarios.
  – The design and implementation of a RINA prototype on top of Ethernet
    will enable the experimentation and evaluation of RINA in comparison to
    TCP/IP.

• Who? 5 partners (from 2014)

• 5 activities:
   WP1: Project management
   WP2: Architecture, Use cases and Requirements
   WP3: Software Design and Implementation
   WP4: Deployment into OFELIA testbed, Experimentation and Validation
   WP5: Dissemination, Standardisation and Exploitation

• Budget
  – Total Cost: 1.126.660 €
  – EC Contribution: 870.000 €

• Duration: 2 years
• Start Date: 1st January 2013

• External Advisory Board: Juniper Networks, ATOS, Cisco Systems, Telecom Italia
3
Objectives (I)
• Enhancement of the RINA specifications
  – The specification of a shim DIF over Ethernet
  – The completion of the specifications that enable DIFs that provide a
    level of service similar to the current Internet (low security, best-effort)
  – The project use cases

• RINA Open Source Prototype for the Linux Operating System
  – Targeting both the user and kernel spaces, allowing RINA to be used
    on top of different technologies (Ethernet, TCP, UDP, etc.)
  – It will provide a solid baseline for further RINA work after the project.
    IRATI will set up an initial open source community around the
    prototype.

4
Objectives (II)
• Experimentation with RINA and comparison with TCP/IP
– IRATI will follow iterative cycles of research, design, implementation
  and experimentation, with the experimental results feeding back into the
  research of the next phase
– Experiments will collect and analyse data to compare RINA and
  TCP/IP in various aspects such as: application API, programmability, cost
  of supporting multi-homing, simplicity, etc.

• Interoperability with other RINA prototypes
– The achievement of interoperability between independent
implementations is a good sign that a specification is well done and
complete.
– Current RINA prototypes target different programming platforms
(middleware vs. OS kernel) and work over different underlying
technologies (UDP/IP vs. Ethernet) compared to the IRATI prototype.
5
Objectives (III)
• Provide feedback to OFELIA
– Apart from the feedback to the OFELIA facility in terms of bug reports
and suggestions of improvements, IRATI will actively contribute to
improving the toolset used to run the facility.
– Moreover, experimentation with a non-IP based solution is an
  interesting use case for the OFELIA facility, since IRATI will be the first to
  conduct this type of experiment in the OFELIA testbed.

6
Project Outcomes
• Enhanced RINA architecture reference model and specifications,
  contributed to the Pouzin Society for experimentation. IRATI will focus on
  advancing the RINA state of the art in the following areas:
  – DIFs over Ethernet
  – DIFs over TCP/UDP
  – DIFs for hypervisors
  – Routing
  – Data transfer

• Linux OS kernel implementation of the RINA prototype over Ethernet
  – By the end of the project an open source community will be set up in order to
    allow the research/industrial networking community to use the prototype
    and/or contribute to its development

• Experimental results of the RINA prototype, compared to TCP/IP

• DIF over TCP/UDP extensions, interoperable with existing RINA prototypes

7
Overview of the project structure

8
Agenda
• Project overview
• Use cases
– Basic scenarios (Phases 1 and 2)
– Advanced scenarios (Phases 2 and 3)

• Specifications
– Shim DIF over 802.1Q
– PDU Forwarding Table Generator
– Y2 plans

• Software development
  – High level software architecture
  – User-space
  – Kernel-space
  – Wrap-up

• Experimental activities
  – Intro, goals, Y1 experimentation use case
  – Testbed and results at i2CAT OFELIA island
  – Testbed and results at iMinds OFELIA island
  – Conclusions
9
BASIC SCENARIOS
PHASES 1 AND 2

10
Basic use cases
Shim DIF over Ethernet
• Goal: to ensure that the shim DIF over Ethernet provides the required
  functionality. The purpose of a shim DIF is to provide a RINA interface to
  the capability of a legacy technology, rather than give the legacy
  technology the full capability of a RINA DIF.

11
Basic use cases
Turing machine DIF
• Goal: to provide a testing scenario to check that a normal DIF complies with
  a minimal set of functionality (the "Turing machine" DIF).

12
ADVANCED SCENARIOS
PHASES 2 AND 3

13
Advanced use cases
Introduction

• RINA applied to a hybrid cloud/network provider
  – Mixed offering of connectivity (Ethernet VPN, MPLS IP VPN, Ethernet
    Private Line, Internet Access) + computing (Virtual Data Center)
  – Scope: datacenter design, access network, wide area network

14
Advanced use cases
Modeling

(Figure: modeling of the hybrid provider. Customer 1 (sites A, B, C) and
customer 2 (sites A, B, C) attach CE routers to PE routers on an MPLS
backbone; one PE acts as Internet gateway towards the public Internet and
the end users. Data centers 1 and 2 hang off the backbone through
top-of-rack (TOR) switches connecting hypervisors (HV) that host the VMs.)
15
Advanced use cases

Enterprise VPN over operator's network

Wide Area Network
• Logical separation of customers through: MPLS
  encapsulation, BGP-based MPLS VPNs and Virtual
  Routing and Forwarding (VRF)

Access network
• Use of Ethernet switching within
  metro-area networks
• Logical separation of traffic
  belonging to multiple customers
  implemented through IEEE 802.1Q

16
Advanced use cases

Enterprise VPN over operator’s network: Applying RINA

• Backbone DIF: provides the equivalent of the MPLS network. This DIF must be able to
  provide flows with "virtual circuit" characteristics, equivalent to MPLS LSPs.

• Provider top-level DIF: provides IPC services to the different customers, by
  connecting together the CE routers. The DIF may provide different levels of service,
  depending on the customer's requirements. There may be one or more of these DIFs
  (one per customer, one for all the provider's customers, etc.).

• Intra customer-site DIFs: DIFs whose scope is a single customer site. Their
  characteristics will depend on the size and needs of the customer (e.g. could be a
  campus network, an enterprise network, etc.).

• Customer A DIF: can provide connectivity to all the application processes within
  customer A's organization. More specialized DIFs targeting concrete application types
  (e.g. voice, file transfer) could be created on top.
17
Advanced use cases

Hypervisor integration: With TCP/IP
(Figure: the TCP/IP baseline. On each hypervisor machine, the VMs' eth0
interfaces reach virtual interfaces (vif1.0-vif3.0) over shared memory;
software bridges tie those to VLAN sub-interfaces of the physical NICs
(e.g. eth0.2 on VLAN 2, eth1.5 on VLAN 5) towards the top-of-rack switch
and out of the DC. Every VM carries its own IP configuration, e.g.
192.168.1.1-192.168.1.3.)

18
Advanced use cases
Hypervisor integration: With RINA

(Figure: with RINA, the VMs on both hypervisors join a green customer DIF
directly. Underneath, a shim DIF for HV connects each VM to its hypervisor,
and a shim DIF over 802.1Q connects the hypervisors through the TOR and out
of the DC, towards the customer VPN or Internet gateway.)

19
Advanced use cases

VDC + Enterprise VPNs over the Internet: With TCP/IP
(Figure: the TCP/IP baseline. Green and blue customer premises each connect
their customer machines through a switch to a border router doing
NAT/gateway duty towards the public Internet; the datacenter premises reach
the Internet through their own border router.)

20
Advanced use cases

VDC + Enterprise VPNs over the Internet: With RINA
(Figure: with RINA, the green customer DIF spans VMs, hypervisors and
servers in the datacenter (shim DIF for HV over shared memory, plus a shim
DIF over 802.1Q on VLAN 2 through the TOR), crosses the public Internet
between the DC border router and the customer border router over a shim DIF
over TCP/UDP, and continues inside the green customer premises over a shim
DIF over 802.1Q on VLAN 10 through a layer 2 switch.)

21
Agenda
• Project overview
• Use cases
– Basic scenarios (Phases 1 and 2)
– Advanced scenarios (Phases 2 and 3)

• Specifications
– Shim DIF over 802.1Q
– PDU Forwarding Table Generator
– Y2 plans

• Software development
  – High level software architecture
  – User-space
  – Kernel-space
  – Wrap-up

• Experimental activities
  – Intro, goals, Y1 experimentation use case
  – Testbed and results at i2CAT OFELIA island
  – Testbed and results at iMinds OFELIA island
  – Y2 plans
22
SHIM DIF OVER 802.1Q

23
Shim DIF over Ethernet
General requirements

• The task of a shim DIF is to put as small a veneer as possible over a
  legacy protocol to allow a RINA DIF to use it unchanged.

• The shim DIF should provide no more service or capability than
  the legacy protocol provides.

24
Examining the Ethernet Header
• Ethernet II: specification released by DEC, Intel,
Xerox (hence also called DIX Ethernet)
  Field                      Size
  -------------------------  -------------
  Preamble                   7 bytes
  MAC dest                   6 bytes
  MAC src                    6 bytes
  802.1Q header (optional)   4 bytes
  Ethertype                  2 bytes
  Payload                    42-1500 bytes
  FCS                        4 bytes
  Interframe gap             12 bytes

25
Ethertype
• Identifies the syntax of the encapsulated protocol
• Layers below need to know the syntax of the layer
above
• Layer violation!

26
Consequences of using an Ethertype
• Also means only one flow can be distinguished
between an address pair
• The MAC address doubles as the connection
endpoint-id

27
Shim DIF over Ethernet
Environment

(Figure: the shim DIF over Ethernet environment.)

Investigating RINA as an Alternative to TCP/IP

28
Address Resolution Protocol
• Resolves a network address to a hardware address
– Most ARP implementations do not conform to the standard
– Shim IPC process assumes RFC826 compliant implementation

30
Usage of ARP
• Maps the application process name to a shim IPC
  process address (MAC address)
  – The application process name is transformed into a network
    protocol address by concatenating its components, e.g.

      Process name:     My_IPC_Process
      Process instance: 1               →  My_IPC_Process/1/Management/2
      Entity name:      Management
      Entity instance:  2

    (a toy sketch of this mapping follows the slide)
  – Application registration adds an entry in the local ARP cache

• A flow allocation request results in an ARP request/reply
  – Instantiates a MAC protocol machine equivalent of DTP (cf.
    Flow Allocator)
IRATI - Investigating RINA as an Alternative to TCP/IP
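A toy sketch of the concatenation above (not project code; the real librina
naming classes differ — this only shows the string form the shim IPC process
resolves via ARP):

    #include <iostream>
    #include <sstream>
    #include <string>

    // Hypothetical helper: builds the network protocol address from the
    // four naming components shown on the slide.
    std::string toNetworkAddress(const std::string& apName,
                                 const std::string& apInstance,
                                 const std::string& aeName,
                                 const std::string& aeInstance) {
        std::ostringstream oss;
        oss << apName << '/' << apInstance << '/'
            << aeName << '/' << aeInstance;
        return oss.str();
    }

    int main() {
        std::cout << toNetworkAddress("My_IPC_Process", "1", "Management", "2")
                  << '\n';  // prints My_IPC_Process/1/Management/2
    }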
PDU FORWARDING TABLE
GENERATOR
32
PDU Forwarding Table Generator
Requirements and general choices

It's all policy!
• Every DIF can do it its own way
• We start with a link-state routing approach (a minimal sketch of the idea
  follows this slide)

33
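As a rough illustration of the link-state choice (not the IRATI
implementation): each IPC process learns the state of N-1 flows from its
neighbors, builds a graph, and runs a shortest-path computation whose next
hops seed the PDU forwarding table. A minimal Dijkstra sketch, with invented
types:

    #include <functional>
    #include <map>
    #include <queue>
    #include <utility>
    #include <vector>

    using Addr  = unsigned int;
    // graph[a] = (neighbor, cost) edges learned from link-state updates
    using Graph = std::map<Addr, std::vector<std::pair<Addr, int>>>;

    // Next hop per destination: the essence of a PDU forwarding table
    // (real entries would map to output N-1 ports, QoS cubes, etc.).
    std::map<Addr, Addr> computeNextHops(const Graph& g, Addr self) {
        std::map<Addr, int>  dist;
        std::map<Addr, Addr> nextHop;
        using Item = std::pair<int, std::pair<Addr, Addr>>; // (cost, (node, 1st hop))
        std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
        pq.push({0, {self, self}});
        while (!pq.empty()) {
            auto [d, nodeHop] = pq.top(); pq.pop();
            auto [node, hop]  = nodeHop;
            if (dist.count(node)) continue;          // already settled
            dist[node] = d;
            if (node != self) nextHop[node] = hop;
            auto it = g.find(node);
            if (it == g.end()) continue;
            for (auto [nbr, cost] : it->second) {
                Addr firstHop = (node == self) ? nbr : hop;
                if (!dist.count(nbr)) pq.push({d + cost, {nbr, firstHop}});
            }
        }
        return nextHop;
    }

    int main() {
        Graph g = { {1, {{2, 1}, {3, 4}}}, {2, {{3, 1}}}, {3, {}} };
        return computeNextHops(g, 1).at(3) == 2 ? 0 : 1;  // 1→3 goes via 2
    }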
PDU Forwarding Table Generator

High-level view and relationship to other IPC Process components
(Figure: the PDU Forwarding Table Generator inside the IPC process. It
reacts to two event sources: from the Resource Allocator, N-1 flow events —
allocated, deallocated, down, up — that update its knowledge of local N-1
flow state; and from the RIB Daemon, events such as "enrollment completed
successfully" or "neighbor B invoked a write operation on object X", carried
by CDAP messages exchanged with neighbor IPC processes over the N-1
layer-management flows. On changes it propagates its knowledge — invoking
write operations on objects at its neighbors — and recomputes the PDU
forwarding table, which the Relaying and Multiplexing Task looks up to
select the output N-1 data-transfer flow for each PDU.)

34
Plans for Year 2
• Shim DIF for Hypervisors
  – Enable communications between VMs in the same physical
    machine without using the networking subsystem

• Updated shim DIF over TCP/UDP
  – The current version requires manual configuration of the mappings of app
    names to IP addresses and TCP/UDP ports; investigate the use of DNS

• Updated PDU Forwarding Table Generator
  – Based on lessons learned from implementation and experimentation

• Feedback to EFCP
  – Based on implementation and experimentation experience

• Faux sockets API
35
Agenda
• Project overview
• Use cases
– Basic scenarios (Phases 1 and 2)
– Advanced scenarios (Phases 2 and 3)

• Specifications
– Shim DIF over 802.1Q
– PDU Forwarding Table Generator
– Y2 plans

• Software development
  – High level software architecture
  – User-space
  – Kernel-space
  – Wrap-up

• Experimental activities
  – Intro, goals, Y1 experimentation use case
  – Testbed and results at i2CAT OFELIA island
  – Testbed and results at iMinds OFELIA island
  – Y2 plans
36
INTRODUCTION

37
Project targets and timeline (SW)

• IRATI SW goals:
  • Release 3 SW prototypes in 2 years
    • Each prototype provides incremental functionalities
    • 1st prototype: basic functionalities (unreliable flows)
      • Comparable to UDP/IP
    • 2nd prototype: "complete" stack (reliable flows + routing)
      • Comparable to TCP/IP
    • 3rd prototype: enhancements (hardened proto + RINA over IP + …)
      • More product-like than prototype-like
      • Glancing at extensibility, portability, performance & usability
  • The SW components live in both kernel & user space
Investigating RINA as an Alternative to TCP/IP

38
Problems …
• Problems are mostly SW-engineering related
  – Time constrained:
    1. Ref-specs → HL arch
    2. HL arch → detailed design
    3. Detailed design → implementation, debug, integration …

• Since the IRATI stack spans user and kernel spaces …
• User-space problems (as usual):
  – Memory (e.g. corruptions, leaks)
  – Bad logic (e.g. faults)
  – Concurrency (e.g. deadlocks, starvation)
  – …
  – Nothing that special (but … time consuming for sure)

Investigating RINA as an Alternative to TCP/IP

39
… and problems
• Kernel space problems are the user-space ones PLUS:
  – A harsher environment, e.g.
    • The develop, install & test cycle is (a lot) slower
      – Huge code-base (takes a lot to compile)
      – Faults in the kernel code may bring the whole host down
      – Reboots are usually required to test a new "version" (at early stages)
    • C is "the" language → less expressive than others in userland
    • No "external libraries" …
  – The kernel is "cooperative", e.g.
    • Stack & heap handling must be "careful", e.g.
      – Memory corruptions could propagate everywhere
  – Different mechanics, e.g.
    • Mutexes, semaphores, spinlocks, RCUs … coupled with un-interruptible
      sleeps
      – Syscalls may sleep … but spinlocks can't be held while "sleeping"
    • No recursive locking
    • Memory allocation comes in different flavours: NOWAIT, NOIO, NOFS …
  – …

Investigating RINA as an Alternative to TCP/IP

40
Outline
• Introduction
• High level software architecture
• Detailed software architecture
– Kernel space
– User space

• Wrap-up

Investigating RINA as an Alternative to TCP/IP

41
Splitting the spaces: user vs kernel
Fast/slow paths → user vs kernel
• We split the "design" in different "lanes" and placed SW
  components there, depending on their timing requirements
  – Fast-path → stringent timings → kernel-space
  – Slow-path → loose timings → user-space

• … looking for our optimum
  – fiddling with time/easiness/cost/problems/schedule/final-solution etc.

(Figure: components straddling the user/kernel boundary in both directions.)
Investigating RINA as an Alternative to TCP/IP

43
API & kernel
• OS processes request services from the kernel with
  syscalls
  – User originated (user → kernel)
  – Unicast

• Modern *NIX systems extend the user/kernel
  communication mechanisms
  – Netlink, uevent, devfs, procfs, sysfs etc.
  – User OR kernel originated
  – Multicast/broadcast

• We wanted a "bus-like" mechanism: 1:1/N:1,
  user/kernel & user/user

• We adopted syscalls and Netlink
  – Syscalls (fast-path):
    • Bootstrapping (*) & SDU R/W
  – Netlink (mostly slow-path):
    • We introduced a RINA "family" and its related
      messages

(Figure: applications, the IPC Process Daemons and the IPC Manager Daemon in
user space talk to the kernel N:1 over Netlink and 1:1 via syscalls.)

(*) Bootstrapping needs: syscalls create kernel components
which will be using Netlink functionalities later on

Investigating RINA as an Alternative to TCP/IP

44
Introducing librina
• Syscalls are "wrapped" by libc (kernel abstraction)
  – i.e. syscall(SYS_write, …) → write(…)
  – glibc on GNU/Linux

• Changes to the syscalls → changes to glibc
  – Breaking glibc could break the whole host
    • Sandboxed environments are necessary
  – Dependency invalidation → time consuming compilations
  – That sort of change is really hard to get approved
    upstream
  – etc.

• We introduced librina as the initial way to overcome
  these problems …
  – … use IRATI on a host without breaking the whole system
Investigating RINA as an Alternative to TCP/IP

45
librina
• It is more a framework/middleware than a library
  – It has explicit memory allocation (no garbage collection)
  – It's event-based
  – It's threaded

• Completely abstracts the interactions with the kernel
  – syscalls and Netlink

• Adds functionalities upon them
• Provides them to userland (apps & daemons)
  – Static/dynamic linking (i.e. for C/C++ programs)
  – Scripting language extensions (e.g. Java)

Investigating RINA as an Alternative to TCP/IP

46
librina interface
• librina contains a set of "components":
  – Internal components
  – External components

• And a portable framework to build components on
  top, e.g.:
  – Patterns: e.g. singletons, observers, factories, reactors
  – Concurrency: e.g. threads, mutexes, semaphores, condition
    variables
  – High level "objects" in its core
    • FlowSpecification, QoSCube, RIBObject etc.

• Only the "external" components are "exported" as
  classes
Investigating RINA as an Alternative to TCP/IP

47
librina core (HL) SW architecture
APIs exposed per component:
• Application: allocate/deallocate flows; read/write SDUs to flows;
  register/unregister to one or more DIFs; eventPoll() / eventWait() /
  eventPost()
• IPC Manager: creation, deletion and configuration of IPC processes
• IPC Process: configure the PDU forwarding table; create/delete EFCP
  instances; allocate kernel resources to support a flow

(Figure: librina's core components — common, cdap, faux-sockets,
sdu-protection, ipc-process, ipc-manager, application — sit behind those
APIs on a shared framework with an event queue; a NetlinkManager keeps the
Netlink sessions (nl_send()/nl_recv() over libnl/libnl_genl) and syscall
wrappers (syscall(SYS_*)) issue the RINA syscalls towards the kernel. A toy
model of the eventPost()/eventWait() machinery follows this slide.)

Investigating RINA as an Alternative to TCP/IP

50
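A toy model of librina's event machinery (the eventPost()/eventWait() pair
above): kernel notifications arrive on a reader thread and are posted to a
queue that the application drains. Names mirror the slides; the real classes
and signatures differ:

    #include <condition_variable>
    #include <mutex>
    #include <queue>
    #include <utility>

    template <typename Event>
    class EventProducer {
        std::queue<Event>       q_;
        std::mutex              m_;
        std::condition_variable cv_;
    public:
        void eventPost(Event e) {          // called by the message reader
            { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(e)); }
            cv_.notify_one();
        }
        Event eventWait() {                // called by the application
            std::unique_lock<std::mutex> lk(m_);
            cv_.wait(lk, [this] { return !q_.empty(); });
            Event e = std::move(q_.front()); q_.pop();
            return e;
        }
    };

    int main() {
        EventProducer<int> events;
        events.eventPost(42);              // e.g. "flow allocated"
        return events.eventWait() == 42 ? 0 : 1;
    }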
How to RAD, effectively?
• OO was the "natural" way to represent the RINA entities
• We embraced C++ as the "core" language for librina:
  – Careful usage produces binaries comparable to C
  – The STL reduces the dependencies
    • in the plain C vs plain C++ case
  – Producing C bindings is possible
  – …

• There was the ALBA prototype already working …
• … and ALBA has RINABand …
• BUT that prototype is Java based …
Investigating RINA as an Alternative to TCP/IP

51
Interfacing librina to other languages
• We "adopted" SWIG: the Simplified Wrapper and Interface
  Generator
• SWIG "automatically" generates all the code needed to
  connect C/C++ programs to scripting languages
  – Such as Python, Java and many, many others …

(Figure: the SWIG tool-chain, for the inputs

    /* example.h */
    int fact(int n);

    /* example.c */
    #include "example.h"
    int fact(int n) { … }

    /* File: example.i */
    %module example
    %{
    #include "example.h"
    %}
    int fact(int n);

SWIG emits a high-level wrapper — e.g. example.py for Python — plus a
low-level wrapper, example_wrap.c, which GCC compiles into the native
interface libexample.so.)

Investigating RINA as an Alternative to TCP/IP

52
librina wrapping
• Wrapping "cost":
  – The wrappers (.i files) are small: ~480 LOCs
  – They produce ~13.5 KLOCs of bindings → ~1/28 ratio …

• The wrappers are the only thing needed to obtain the
  bindings for a scripting language
  – SWIG support varies with the target language, e.g.
    • Java: so-so (not all data-types mapped natively)
    • Python: good
    • …
  – Our wrappers contain only the missing data-type mappings for
    Java

• Java interface = C++ interface
• Bindings for other languages (e.g. Python) are expected to
  be straightforward
Investigating RINA as an Alternative to TCP/IP

53
High level software architecture
(Figure: the high level software architecture. Applications — RINABand and
third-party SW packages — reach librina either through the Java path (Java
"imports" → SWIG HL wrappers (Java) → JNI → SWIG LL wrappers (C++, for
Java)), through an equivalent path for a language X, or by static/dynamic
linking against librina's C and C++ APIs. The rinad daemons (ipcpd, ipcmd —
Java) use the same wrappers. librina's core (C++) talks to the kernel via
libnl / libnl-gen (Netlink) and syscalls.)

Investigating RINA as an Alternative to TCP/IP

54
DETAILED SOFTWARE ARCHITECTURE
KERNEL SPACE

55
The Linux object model
• Linux has its "generic" object abstraction: kobject, kref and kset

  /* References counting (explicit): garbage collection & sysfs integration */
  struct kref { atomic_t refcount; };

  /* Naming & sysfs; (dynamic, loosely typed) object [re-]parenting */
  struct kobject {
          const char            * name;
          struct list_head        entry;
          struct kobject        * parent;
          struct kset           * kset;
          struct kobj_type      * ktype;
          struct sysfs_dirent   * sd;
          struct kref             kref;
          unsigned int            state_initialized:1;
          unsigned int            state_in_sysfs:1;
          unsigned int            state_add_uevent_sent:1;
          unsigned int            state_remove_uevent_sent:1;
          unsigned int            uevent_suppress:1;
  };

  /* Objects grouping & sysfs integration */
  struct kset {
          struct list_head        list;
          spinlock_t              klist_lock;
          struct kobject          kobj;
          const struct kset_uevent_ops * uevent_ops;
  };

• Generic enough to be applied "everywhere"
  – E.g. FS, HW subsystems, device drivers

Investigating RINA as an Alternative to TCP/IP

56
kobjects, ksets and krefs in IRATI
• They are the way to go for embracing OOD/OOP kernel-wide

• If the design has a "limited scope" the code gets bloated by:
  – Ancillary functions & data structures
  – (unnecessary) Resources usage

• We don't need/want all these functionalities (everywhere):
  – Reduced (finite) number of classes
    • We don't have the needs of a "generic kernel"
  – Reduced concurrency (can be missing, depending on the object)
  – Object parenting is "fixed" (obj x is always bound to obj y)
    • E.g. DTP/DTCP are bound to EFCP …
  – Not all our objects have to be published into sysfs
  – We have different lookup requirements
    • No need to "look up by name" every object
  – Inter-object bindings shouldn't lose the object's type
  – …

Investigating RINA as an Alternative to TCP/IP

57
Our OOP/OOD approach
• We adopted a (slightly) different OOD/OOP approach
• (almost) Each "entity" in the stack is an "object"
• All our "objects" provide a basic common interface & behavior
• They have no implicit embedded locking semantics

  struct object_t { … };                      /* API opaque */

  struct obj_ops_t {                          /* vtable (if needed) */
          result_x_t (* method_1)(object_t * o, …);
          …
          result_y_t (* method_n)(object_t * o, …);
  };

  /* Static */
  int  obj_init(object_t * o, …);
  void obj_fini(object_t * o);

  /* Dynamic */
  object_t * obj_create(…);                   /* interruptible ctxt     */
  object_t * obj_create_ni(…);                /* non-interruptible ctxt */
  int        obj_destroy(object_t * o);

  /* vtable proxy (if needed) */
  int obj_<method_1>(object_t * o, …);
  ...
  int obj_<method_n>(object_t * o, …);

Investigating RINA as an Alternative to TCP/IP

58
OOD/OOP & the framework
• This approach:
  – Reduces the stack's (overall) bloating
    • no krefs, spinlocks, sysfs etc. where unnecessary
    • Only objects requiring sysfs, debugfs and/or uevents embed a kobject
  – (or it is comparable)
    • E.g. the same bloating related to _init, _fini, _create and _destroy
  – Speeds up development
  – Helps debugging
    • (re-)Parenting is constrained to specific objects
    • No loose typing → type-checking is maintained (no casts)
  – Decouples (mildly) from the underlying kernel

• With these assumptions we built our framework (a toy instance of one basic
  component follows this slide)
  – Basic components: robj, rmem, rqueue, rfifo, rref, rtimer, rwq, rmap,
    rbmp
  – OOP facilities/Patterns: factories, singletons, facades, observers,
    flyweights, publishers/subscribers, smart pointers, etc.
  – Ownership-passing + smart-pointing memory model

Investigating RINA as an Alternative to TCP/IP

59
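A toy instance of the _create/_destroy discipline above. rfifo is one of the
framework's named basic components, but this body is invented for
illustration (plain C++, no kernel types):

    #include <cstdlib>

    struct rfifo_t { int * data; int head, tail, size; };

    rfifo_t * rfifo_create(int size) {
        rfifo_t * f = static_cast<rfifo_t *>(std::malloc(sizeof(*f)));
        if (!f) return nullptr;
        f->data = static_cast<int *>(std::malloc(sizeof(int) * size));
        if (!f->data) { std::free(f); return nullptr; }
        f->head = f->tail = 0; f->size = size;
        return f;
    }

    int rfifo_destroy(rfifo_t * f) {
        if (!f) return -1;        /* the framework checks its inputs */
        std::free(f->data);
        std::free(f);
        return 0;
    }

    int rfifo_push(rfifo_t * f, int v) {
        if (!f || (f->tail + 1) % f->size == f->head) return -1; /* full */
        f->data[f->tail] = v; f->tail = (f->tail + 1) % f->size;
        return 0;
    }

    int main() {
        rfifo_t * f = rfifo_create(8);
        if (!f) return 1;
        int rc = rfifo_push(f, 42);
        rfifo_destroy(f);
        return rc;
    }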
The HL software architecture (Y1)
(Figure: the Y1 architecture, end to end. User space: RINABand and
third-party SW packages over the SWIG HL/LL wrappers (Java or language X);
rinad with ipcpd and ipcmd; librina with its framework, C/C++ APIs and core,
over libnl / libnl-gen and syscalls. Kernel space: a personality mux/demux
in front of the KIPCM core, RNL, the KFA, the IPCP factories and the kernel
framework; IPC process types are the normal IPC process (PFT, RMT, EFCP),
shim-eth-vlan (over RINA-ARP) and shim-dummy.)

Investigating RINA as an Alternative to TCP/IP

62
The API exposed to user-space:
KIPCM + RNL
• Kernel interface = syscalls + Netlink messages
• KIPCM:
  – Manages the syscalls
    • Syscalls: a small, well-defined set of calls (8):
      – IPCs: ipc_create and ipc_destroy
      – Flows: allocate_port and deallocate_port
      – SDUs: sdu_read, sdu_write, mgmt_sdu_read and mgmt_sdu_write

• RNL:
  – Manages the Netlink part
    • Abstracts message reception, sending, parsing & crafting
    • Netlink: 36 message types (with dynamic attributes):
      – assign_to_dif_req, assign_to_dif_resp, dif_reg_notif, dif_unreg_notif …

• Partitioning:
  – Syscalls → KIPCM → "fast-path" (read and write SDUs)
  – Netlink → RNL → "slow-path" (mostly conf and mgmt)

Investigating RINA as an Alternative to TCP/IP

63
KIPCM & KFA
• The KIPCM:
  – Counterpart of the IPC Manager in user space
  – Manages the lifecycle of the IPC processes and the KFA
  – Abstracts IPC process instances
    • Same API for all the IPC processes regardless of the type
    • maps: ipc-process-id → ipc-process-instance

• The KFA manages ports and flows
  – Ports
    • Flow handler and ID
    • Port ID Manager
  – Flows
    • maps: port-id → ipc-process-instance

• Both "bind" the kernel stack:
  – Top: the user interface (syscalls, Netlink)
  – Bottom: the IPC processes (maps)

• They are the initial point where "recursion" is
  transformed into "iteration" (a toy rendering follows this slide)
  – When the KIPCM calls the KFA to inject/get SDUs:
    • N-IPCP → EFCP → RMT → PDU-FWD → shim IPC process

(Figure: user space on top of the KIPCM/KFA pair; below, a normal IPCP
(EFCP, RMT, PDU-FWD-T) stacks over a shim IPCP on both the IN and OUT
paths.)
Investigating RINA as an Alternative to TCP/IP


64
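A toy rendering of "recursion becomes iteration" on the write path: each
normal IPCP hands its PDU to an N-1 port, and the KFA loops over its
port-id → ipc-process-instance map instead of recursing. All types and names
are invented:

    #include <vector>

    struct Ipcp { bool isShim; int lowerPort; };

    void kfaWrite(const std::vector<Ipcp>& byPort, int port, int sdu) {
        for (;;) {
            const Ipcp& ipcp = byPort[port];  // port-id → instance lookup
            if (ipcp.isShim) {
                (void)sdu;                    // shim_write(): hits the wire
                return;
            }
            // normal IPCP: EFCP/RMT processing chose this output N-1 flow
            port = ipcp.lowerPort;            // iterate down one layer
        }
    }

    int main() {
        // IPCP 0 = shim; IPCPs 1 and 2 = normal, stacked 2 → 1 → 0 as in
        // the write-operation figure later on
        std::vector<Ipcp> byPort = { {true, -1}, {false, 0}, {false, 1} };
        kfaWrite(byPort, 2, 0xCAFE);
        return 0;
    }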
The RINA Netlink Layer (RNL)
• Integrates Netlink into the SW framework
  – Hides all the configuration, generation and destruction of Netlink sockets and
    messages from the user
  – Defines a Generic Netlink family (NETLINK_RINA) and its messages

Investigating RINA as an Alternative to TCP/IP

66
The IPC Process Factories
• They are used by IPC processes to publish/unpublish their availability
  – Publish:
    • x = kipcm_ipcp_factory_register(…, char * name, …)
  – Unpublish:
    • kipcm_ipcp_factory_unregister(x)

• The factory name is the way the KIPCM can look for a
  specific IPC process type
  – It's published into sysfs too

• There are two "major" types of IPC processes:
  – Normal
  – Shims
Investigating RINA as an Alternative to TCP/IP

67
The IPC Process Factories Interface
• Factory operations are the same for both types

• Upon registration
  – A factory publishes its hooks:
      .init      → x_init
      .fini      → x_fini
      .create    → x_create
      .destroy   → x_destroy
      .configure → x_configure

• Upon user request (ipc_create)
  – The KIPCM creates a particular IPC process instance:
    1. Looks for the correct factory (by name)
    2. Calls the .create "method"
    3. The factory returns a "compliant" IPC process object
    4. Binds that object into its data model
  (a hedged sketch of this pattern follows the slide)

• Upon un-registration
  – The factory triggers the "destruction" of all the IPC processes
    it "owns"
Investigating RINA as an Alternative to TCP/IP

68
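A minimal sketch of the publish / look-up-by-name / create pattern just
described. The hook names follow the slides; the surrounding types, the
registry and the signatures are invented (the real
kipcm_ipcp_factory_register() takes kernel types and more parameters):

    #include <map>
    #include <string>

    struct ipcp_instance { /* opaque */ };

    struct ipcp_factory_ops {
        int             (* init)();
        void            (* fini)();
        ipcp_instance * (* create)();
        int             (* destroy)(ipcp_instance *);
    };

    static std::map<std::string, ipcp_factory_ops> factories; // toy registry

    bool factory_register(const std::string& name,
                          const ipcp_factory_ops& ops) {
        return factories.emplace(name, ops).second;  // publish the hooks
    }

    // The ipc_create path: find the factory by name, then call .create
    ipcp_instance * ipc_create(const std::string& type) {
        auto it = factories.find(type);              // e.g. "shim-eth-vlan"
        return it != factories.end() ? it->second.create() : nullptr;
    }

    static ipcp_instance   dummy;
    static ipcp_instance * dummy_create() { return &dummy; }

    int main() {
        factory_register("shim-dummy",
                         ipcp_factory_ops{nullptr, nullptr,
                                          dummy_create, nullptr});
        return ipc_create("shim-dummy") == &dummy ? 0 : 1;
    }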
IPC Process Instances
• The .create provided to the factories returns an IPC
  Process "object"
• There are two "major" types of IPC processes:
  – Normal
  – Shims

• Regardless of its type
  – The interface is the same
  – Each IPC process implements its "core" code:
    • Shim IPC process:
      – Each shim IPC process provides its own implementation
    • Normal IPC process:
      – The stack provides an implementation for all of them
Investigating RINA as an Alternative to TCP/IP

69
IPC Process Instances Interface
• The IPC Process "object" holds:
  • instance_data
  • instance_ops

• The IPC process interface is the same for all types,
  but each type decides which ops it will support
  – Some are specific to normal or shim, a few are common to both

instance_ops, e.g. for a shim IPC process:
  .application_register      = x_application_register
  .application_unregister    = x_application_unregister
  .assign_to_dif             = x_assign_to_dif
  .sdu_write                 = x_sdu_write
  .flow_allocate_request     = shim_allocate_request
  .flow_allocate_response    = shim_allocate_response
  .flow_deallocate           = shim_deallocate

instance_ops for the normal IPC process:
  .connection_create         = normal_connection_create
  .connection_update         = normal_connection_update
  .connection_destroy        = normal_connection_destroy
  .connection_create_arrived = normal_connection_arrived
  .pft_add                   = normal_pft_add
  .pft_remove                = normal_pft_remove
  .pft_dump                  = normal_pft_dump

  – They support similar functionalities (except the PFT's)
  – How they translate into ops depends on the type
Investigating RINA as an Alternative to TCP/IP

70
Write operation

(Figure: the write path, user space to wire. The application invokes
sys_sdu_write(sdu, app2); the KIPCM resolves port_id app2 and calls
kipcm_sdu_write(sdu, app2) → kfa_flow_sdu_write(sdu, app2). The KFA hands
the SDU to IPCP 2: normal_write(sdu, app2) → efcp_container_write(sdu, 2i) →
efcp_write(sdu) → dtp_write(sdu) → rmt_send(pdu); RMT 2 writes to the N-1
port with kfa_flow_sdu_write(sdu*, 21). The same chain repeats through
IPCP 1 (EFCPC 1, EFCP 1j, RMT 1) down to kfa_flow_sdu_write(sdu**, 10) on
port 10, where IPCP 0 — the shim — finally calls shim_write(sdu**, 21).)
Read operation

(Figure: the read path, wire to user space. The shim (IPCP 0) posts an
incoming SDU with kfa_sdu_post(sdu**, 10); the KFA delivers it to RMT 1 with
rmt_receive(sdu**, 10), then efcp_container_receive(pdu*, 1j) →
efcp_receive(pdu*) → dtp_receive(pdu*) → kfa_sdu_post(sdu*, 21). The chain
repeats through IPCP 2 (rmt_receive(sdu*, 21), EFCP 2i) until
kfa_sdu_post(sdu, app2) makes the SDU available on port_id app2, and the
application's sys_sdu_read(app2) returns it through kipcm_sdu_read(app2) →
kfa_flow_sdu_read(app2).)
Shim IPC Processes
• The shims are the "lowest" components in the kernel space
• They have two interfaces:
  – Northbound: the same for each shim, represented by hooks published
    into the KIPCM factories
  – Southbound: depends on the technology

• There are currently 2 shims:
  – shim-dummy:
    • Confined to a single host ("loopback")
    • Used for debugging & testing the stack
  – shim-eth-vlan:
    • As defined in the spec, runs over 802.1Q
Investigating RINA as an Alternative to TCP/IP

73
Shim-dummy

(Figure: the IPC Process and IPC Manager daemons in user space drive the
KIPCM/KFA, which reach the dummy shim IPC process through the RINA IPC API
via shim_dummy_create / shim_dummy_destroy.)
IRATI - Investigating RINA as an Alternative to TCP/IP
Shim-eth-vlan

(Figure: as above, but the shim IPC process over 802.1Q is created/destroyed
via shim_eth_create / shim_eth_destroy; it resolves addresses through RINARP
(rinarp_add, rinarp_remove, rinarp_resolve) and exchanges frames with the
devices layer through shim_eth_rcv / dev_queue_xmit.)
IRATI - Investigating RINA as an Alternative to TCP/IP
RINARP

(Figure: shim-eth-vlan sits on RINARP's API; RINARP keeps its maps and
tables over an ARP826 core with TX, RX and ARM parts, which talks to the
devices layer.)

IRATI - Investigating RINA as an Alternative to TCP/IP
DETAILED SOFTWARE ARCHITECTURE
USER SPACE

78
Introduction to the user space framework
(Figure: the user-space framework. The IPC Manager Daemon (main logic, IDD,
RIB & RIB Daemon, management agent), the IPC Process Daemons (layer
management: enrollment, flow allocation, resource allocation, PDU forwarding
table generation, RIB & RIB Daemon) and the applications all sit on librina,
talking to the kernel through Netlink sockets, system calls and sysfs.)

• IPC Manager Daemon: broker between apps & IPC processes, the central point
  of management in the system
• IPC Process Daemon: implements the layer management components of an IPC
  process
• librina: abstracts out the communication details between daemons and the
  kernel
79
Librina software architecture
(Figure: librina's internals. The C++ API exposes proxy and model classes
("perform action" / "get event"); an Event Producer serves event classes
from an events queue that a message reader thread fills, turning incoming
Netlink messages into events via the Netlink message parsers/formatters. The
core also holds concurrency classes (on libpthread), the Netlink Manager (on
libnl/libnl-gen), the syscall wrappers and a logging framework.)

80
The IPC Process and IPC Manager
Daemons
• IPC Manager Daemon
  – Manages the IPC process lifecycle
  – Broker between applications and IPC processes
  – Local management agent
  – DIF Allocator client (to search for applications not available through local DIFs)

• IPC Process Daemon
  – Layer management components of the IPC process:
    • RIB Daemon, RIB
    • CDAP parsers/generators
    • CACEP
    • Enrollment
    • Flow allocation
    • Resource allocation
    • PDU forwarding table generation
    • Security management
81
IPC Manager Daemon
(Figure: the IPC Manager Daemon (Java). A main event loop blocks on
EventProducer.eventWait() and calls operations on the IPC Manager core
classes — IPC Process Manager, Flow Manager, Application Registration
Manager — which in turn call the IPC Process Factory, the IPC Processes or
the Application Manager through the SWIG wrappers (high-level Java, JNI,
low-level C++) over librina (system calls + Netlink messages). A
bootstrapper reads the configuration file; a command line interface server
thread handles console sessions over a local TCP connection and reports
operation results back.)

83
IPC Process Daemon
(Figure: the IPC Process Daemon (Java). Supporting classes (delimiter, CDAP
parser, encoder) and layer management function classes (Enrollment Task,
Flow Allocator, Resource Allocator, Registration Manager, Forwarding Table
Generator) cluster around the RIB Daemon and the Resource Information Base
(RIB). A CDAP message reader thread feeds RIBDaemon.cdapMessageReceived()
from KernelIPCProcess.readMgmtSDU(); outgoing messages leave through
RIBDaemon.sendCDAPMessage() → KernelIPCProcess.writeMgmtSDU(). A main event
loop on EventProducer.eventWait() calls the IPCManager or KernelIPCProcess
through the SWIG/JNI wrappers over librina (system calls + Netlink
messages).)

85
Example workflow: IPC Process creation
• The IPC Manager reads a configuration file with instructions on the IPC
  processes it has to create at startup
  – Or the system administrator can request creation through the local console

• The configuration file also instructs the IPC Manager to register the IPC
  process in one or more N-1 DIFs, and to make it a member of a DIF

(Figure: the creation sequence between the IPC Manager Daemon, the new IPC
Process Daemon and the kernel:)
 1. Create IPC process (syscall)
 2. Fork (syscall)
 3. Initialize librina
 4. When completed, notify the IPC Manager (NL)
 5. IPC process initialized (NL)
 6. Register app request (NL)
 7. Register app response (NL)
 8. Notify IPC process registered (NL)
 9. Assign to DIF request (NL)
10. Update state and forward to the kernel (NL)
11. Assign to DIF request (NL)
12. Assign to DIF response (NL)
13. Assign to DIF response (NL)

86
Example workflow: Flow allocation
• An application requests a flow to another application, without
  specifying what DIF to use

(Figure: the allocation sequence between the application, the IPC Manager
Daemon, the IPC Process Daemon and the kernel:)
 1. Allocate flow request (NL), from the application
 2. Check app permissions
 3. Decide what DIF to use
 4. Forward the request to the adequate IPC Process Daemon
 5. Allocate flow request (NL)
 6. Request port-id (syscall)
 7. Create connection request (NL)
 8. On create connection response (NL), write a CDAP message to the N-1
    port (syscall)
 9. On getting an incoming CDAP message response (syscall), update the
    connection (NL)
10. On getting the update connection response (NL), reply to the IPC
    Manager (NL)
11. Allocate flow request result (NL)
12. Forward the response to the app
13. Allocate flow request result (NL)
14. Read data from the flow (syscall) or write data to the flow (syscall)

87
WRAP UP

88
Y1: Where we are / What do we have…
• 9 months, ~3700 commits and ~214 KLOCs later …
  – ~27 KLOCs in the kernel;
  – ~87 KLOCs in librina (hand-written);
  – ~35 KLOCs in librina (automatically generated);
  – ~65 KLOCs in rinad

• … the project released its 1st prototype (internal release):
  – User and kernel space components providing unreliable flow
    functionalities
  – We have the building/configuration/development frameworks
  – A testing framework
    • A testing application (RINABand, compile-time)
    • A regression framework (ad-hoc, run-time)

• We're actively working on the 2nd prototype
Investigating RINA as an Alternative to TCP/IP

89
Y2: Plans …
• Prototype 2:
  – Reliable flows support
  – Shim DIF for HV
    • Same schema as shim-dummy/shim-eth-vlan in prototype 1
  – Complete routing
  – Public release as FOSS (July 2014)

• Prototype 3:
  – Shim DIF over TCP/UDP
    • Same schema as prototype 2
  – Faux sockets API via:
    1. FI: function interposition (dynamic linking)
    2. SCI: system call interposition (static linking)

Investigating RINA as an Alternative to TCP/IP

90
Agenda
• Project overview
• Use cases
– Basic scenarios (Phases 1 and 2)
– Advanced scenarios (Phases 2 and 3)

• Specifications
– Shim DIF over 802.1Q
– PDU Forwarding Table Generator
– Y2 plans

• Software development
  – High level software architecture
  – User-space
  – Kernel-space
  – Wrap-up

• Experimental activities
  – Intro, goals, Y1 experimentation use case
  – Testbed and results at i2CAT OFELIA island
  – Testbed and results at iMinds OFELIA island
  – Conclusions
92
IRATI EXPERIMENTATION GOALS

Investigating RINA as an Alternative to TCP/IP

93
Experimentation goals
(Figure: the experimentation loop — the use cases and specifications drive
the RINA prototype, whose behaviour is compared against TCP/IP and UDP/IP.)

Investigating RINA as an Alternative to TCP/IP

94
IRATI experimentation in a nutshell
(Figure: experimentation testbeds per phase — combinations of the OFELIA,
iLab.t, EXPERIMENTA and PSOC facilities across Phases I, II and III.)

Investigating RINA as an Alternative to TCP/IP
95
PROTOTYPE STATUS AND TOOLS

Investigating RINA as an Alternative to TCP/IP

96
Available Tools
• RINABand
  – Test application for RINA
  – Java (user space)
  – Requires multiple flows between two application process instances:
    1 control flow, N data flows
  (Figure: a RINABand server and a RINABandClient, each with a control AE
  and a data AE, communicating over a DIF.)

• Echo server/client
  – Test parameters: the number and size of the SDUs to be sent
  – Ping-like operation
  – The test completes when either all the SDUs have been sent and
    received, or when more than a certain interval of time elapses
    without receiving an SDU (a toy rendering of this stop condition
    follows the slide)
  – Client and server report statistics:
    • the number of transmitted and received SDUs
    • the time the test lasted
  – Single flow between two application process instances
Investigating RINA as an Alternative to TCP/IP

97
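A toy rendering of the echo test's stop condition. Everything here is
invented for illustration: the "flow" is a local queue that echoes
instantly, whereas the real client reads SDUs from a librina flow:

    #include <chrono>
    #include <optional>
    #include <queue>

    static std::queue<int> loopback;

    void writeSdu(int seq) { loopback.push(seq); }      // echoed instantly

    std::optional<int> readSduFor(std::chrono::milliseconds /*timeout*/) {
        if (loopback.empty()) return std::nullopt;      // would time out
        int seq = loopback.front(); loopback.pop();
        return seq;
    }

    // Stop condition from the slide: finish when all SDUs are sent and
    // received, or when the idle interval elapses without receiving one.
    bool runEchoTest(int count, std::chrono::milliseconds idleTimeout) {
        for (int seq = 0; seq < count; ++seq) writeSdu(seq);
        for (int received = 0; received < count; ++received)
            if (!readSduFor(idleTimeout)) return false;
        return true;
    }

    int main() {
        return runEchoTest(100, std::chrono::milliseconds(2000)) ? 0 : 1;
    }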
First Phase Prototype capabilities
• Capabilities
  – Decision to focus on the shim-eth-vlan
  – Supports only a single flow between two application process instances

  Ethernet II frame (cf. slide 25):
    Preamble (7 bytes) | MAC dest (6) | MAC src (6) | 802.1Q header
    (optional, 4) | Ethertype (2) | Payload (42-1500) | FCS (4) |
    Interframe gap (12)

• Impact on experiments
  – Could not use RINABand
  – Rely on the Echo server/client application

Investigating RINA as an Alternative to TCP/IP

98
FIRST PHASE EXPERIMENTS

Investigating RINA as an Alternative to TCP/IP

99
First phase use case

Investigating RINA as an Alternative to TCP/IP

100
Single flow echo/bw test

• Validate stack / Prototype 1
• Validate Ethernet transparency
• Measure goodput
Investigating RINA as an Alternative to TCP/IP

101
Multiple flow echo/bw validation

• Validate multiple IPC processes
• Measure goodput

Investigating RINA as an Alternative to TCP/IP

102
Concurrent RINA and IP

• Validate concurrency of the IP and RINA stacks
• Measure goodput

Investigating RINA as an Alternative to TCP/IP

103
Presented by Leonardo Bergesio

FIRST PHASE RESULTS @ I2CAT

Investigating RINA as an Alternative to TCP/IP

104
i2CAT OFELIA Island, EXPERIMENTA
• Experiment == slice
• FlowSpace:
  – Arbitrary topology
  – Partition of the vectorial space of OF header fields
  – Slicing by VLANs

• VMs to be used as end points or controllers
• Perfect match:
  – SLICE ↔ VLAN ↔ shim DIF over Ethernet
Investigating RINA as an Alternative to TCP/IP

105
Workflow I
• Access island using OCF. Create or access your
project/slice

Investigating RINA as an Alternative to TCP/IP

106
Workflow II
• Select FlowSpace Topology and slice VLAN/s (DIFs)

Investigating RINA as an Alternative to TCP/IP

107
Workflow III
• Create VMs  Nodes and OpenFlow Controller

Investigating RINA as an Alternative to TCP/IP

108
Resources Mapping
Slice with two VLAN ids, one per DIF: 300 and 301

Investigating RINA as an Alternative to TCP/IP

109
Single flow

Packets are sent over the Ethernet/VLAN bridge.
Goodput is roughly 60% of link capacity (iperf tested).

Investigating RINA as an Alternative to TCP/IP

Project: IRATIbasicusecase — Slice: multivlanslice

111
Multiple flows

Flows to a shared server (B & C to D) achieved half
the throughput of the single flow (A to B).

Investigating RINA as an Alternative to TCP/IP

Project: IRATIbasicusecase — Slice: multivlanslice

112
Concurrency between IP and RINA
stack

Project: IRATIbasicusecase — Slice: multivlanslice

Concurrent UDP (iperf) traffic during the test:
  Time interval     90 s
  Nº of datagrams   554915
  Data sent         778 MB
  BW                75.5 Mbps

Investigating RINA as an Alternative to TCP/IP

113
FIRST PHASE RESULTS @ IMINDS

Investigating RINA as an Alternative to TCP/IP

114
iLab.t “Virtual Wall”: Concept

115
Virtual Wall: Topology Control

116
Virtual Wall: Topology Control

117
Virtual wall @ iMinds

Investigating RINA as an Alternative to TCP/IP

118
Emulab: architecture
(Figure: the Emulab architecture. Users reach the web/DB/SNMP server over
the Internet; a control switch/router fans out to the PCs, alongside serial
lines, power control and switch management; the experiment network is a
programmable "patch panel".)

Emulab: programmable patch panel

(Figure: the programmable patch panel.)
Workflow

(Figure: the experiment workflow — from experiment idea, via the GUI, to an
ns script plus additional scripting; then hardware mapping and swap-in,
after which Emulab runs the additional scripts from the ns file.)

Investigating RINA as an Alternative to TCP/IP

121
Basic Experiment on iMinds island
• Use a LAN for the VLAN bridge

Investigating RINA as an Alternative to TCP/IP

122
Single flow

Packets are sent over the Ethernet/VLAN bridge.
Goodput is roughly 60% of the iperf bandwidth.

Investigating RINA as an Alternative to TCP/IP

123
Multiple flows

Investigating RINA as an Alternative to TCP/IP

124
Concurrency between IP and RINA stack

(Figure: echo server started while concurrent UDP traffic runs.)

Investigating RINA as an Alternative to TCP/IP

125
CONCLUSIONS

Investigating RINA as an Alternative to TCP/IP

126
Conclusions from phase I
experimentation
• The IRATI stack and shim DIF are running
• ~60% goodput in comparison to iperf
• No major performance problems
• When running concurrently, the IRATI stack takes
  precedence over the IP stack
  – our stack doesn't lose a packet from syscalls to the devices layer

• ARP in the shim DIF should not reuse the 0x0806 ETHERTYPE
  because of incompatibility with existing
  implementations
• Registration to the shim DIF over Ethernet should be
  explicit
Investigating RINA as an Alternative to TCP/IP

127
Thanks for your attention!
Questions?

Investigating RINA as an Alternative to TCP/IP

Weitere ähnliche Inhalte

Was ist angesagt?

RINA motivation, introduction and IRATI goals. IEEE ANTS 2012
RINA motivation, introduction and IRATI goals. IEEE ANTS 2012RINA motivation, introduction and IRATI goals. IEEE ANTS 2012
RINA motivation, introduction and IRATI goals. IEEE ANTS 2012
Eleni Trouva
 
Irati fire-engineering-workshop-nov2012
Irati fire-engineering-workshop-nov2012Irati fire-engineering-workshop-nov2012
Irati fire-engineering-workshop-nov2012
Eleni Trouva
 
IRATI: an open source RINA implementation for Linux/OS
IRATI: an open source RINA implementation for Linux/OSIRATI: an open source RINA implementation for Linux/OS
IRATI: an open source RINA implementation for Linux/OS
ICT PRISTINE
 

Was ist angesagt? (20)

Unreliable inter process communication in Ethernet: Migrating to RINA with th...
Unreliable inter process communication in Ethernet: Migrating to RINA with th...Unreliable inter process communication in Ethernet: Migrating to RINA with th...
Unreliable inter process communication in Ethernet: Migrating to RINA with th...
 
RINA motivation, introduction and IRATI goals. IEEE ANTS 2012
RINA motivation, introduction and IRATI goals. IEEE ANTS 2012RINA motivation, introduction and IRATI goals. IEEE ANTS 2012
RINA motivation, introduction and IRATI goals. IEEE ANTS 2012
 
Irati fire-engineering-workshop-nov2012
Irati fire-engineering-workshop-nov2012Irati fire-engineering-workshop-nov2012
Irati fire-engineering-workshop-nov2012
 
RINA IRATI Korea-EU Workshop 2013
RINA IRATI Korea-EU Workshop 2013RINA IRATI Korea-EU Workshop 2013
RINA IRATI Korea-EU Workshop 2013
 
Irati goals and achievements - 3rd RINA Workshop
Irati goals and achievements - 3rd RINA WorkshopIrati goals and achievements - 3rd RINA Workshop
Irati goals and achievements - 3rd RINA Workshop
 
3. RINA use cases, results, benefits
3. RINA use cases, results, benefits3. RINA use cases, results, benefits
3. RINA use cases, results, benefits
 
1. RINA motivation - TF Workshop
1. RINA motivation - TF Workshop1. RINA motivation - TF Workshop
1. RINA motivation - TF Workshop
 
IRATI: an open source RINA implementation for Linux/OS
IRATI: an open source RINA implementation for Linux/OSIRATI: an open source RINA implementation for Linux/OS
IRATI: an open source RINA implementation for Linux/OS
 
Architectures and buildings
Architectures and buildingsArchitectures and buildings
Architectures and buildings
 
Pristine glif 2015
Pristine glif 2015Pristine glif 2015
Pristine glif 2015
 
RINA Tutorial at ETSI ISG NGP#3
RINA Tutorial at ETSI ISG NGP#3RINA Tutorial at ETSI ISG NGP#3
RINA Tutorial at ETSI ISG NGP#3
 
2. RINA overview - TF workshop
2. RINA overview - TF workshop2. RINA overview - TF workshop
2. RINA overview - TF workshop
 
Pristine rina-sdk-icc-2016
Pristine rina-sdk-icc-2016Pristine rina-sdk-icc-2016
Pristine rina-sdk-icc-2016
 
EU-Taiwan Workshop on 5G Research, PRISTINE introduction
EU-Taiwan Workshop on 5G Research, PRISTINE introductionEU-Taiwan Workshop on 5G Research, PRISTINE introduction
EU-Taiwan Workshop on 5G Research, PRISTINE introduction
 
Intro RINA
Intro RINAIntro RINA
Intro RINA
 
RINA research results - NGP forum - SDN World Congress 2017
RINA research results - NGP forum - SDN World Congress 2017RINA research results - NGP forum - SDN World Congress 2017
RINA research results - NGP forum - SDN World Congress 2017
 
Rlite software-architecture (1)
Rlite software-architecture (1)Rlite software-architecture (1)
Rlite software-architecture (1)
 
Advanced network experiments in FED4FIRE
Advanced network experiments in FED4FIREAdvanced network experiments in FED4FIRE
Advanced network experiments in FED4FIRE
 
Introduction to OpenFlow
Introduction to OpenFlowIntroduction to OpenFlow
Introduction to OpenFlow
 
Arcfire fire forum 2015
Arcfire fire forum 2015Arcfire fire forum 2015
Arcfire fire forum 2015
 

Andere mochten auch

Andere mochten auch (16)

RINA Tutorial @ IEEE Globecom 2014
RINA Tutorial @ IEEE Globecom 2014RINA Tutorial @ IEEE Globecom 2014
RINA Tutorial @ IEEE Globecom 2014
 
A Wake-Up Call for IoT
A Wake-Up Call for IoT A Wake-Up Call for IoT
A Wake-Up Call for IoT
 
3 addressingthe problem130123
3 addressingthe problem1301233 addressingthe problem130123
3 addressingthe problem130123
 
Assuring QoS Guarantees for Heterogeneous Services in RINA Networks with ΔQ
Assuring QoS Guarantees for Heterogeneous Services in RINA Networks with ΔQAssuring QoS Guarantees for Heterogeneous Services in RINA Networks with ΔQ
Assuring QoS Guarantees for Heterogeneous Services in RINA Networks with ΔQ
 
10 myths about cloud computing
10 myths about cloud computing10 myths about cloud computing
10 myths about cloud computing
 
Rina acc-icc16-stein
Rina acc-icc16-steinRina acc-icc16-stein
Rina acc-icc16-stein
 
The hague rina-workshop-nfv-diego
The hague rina-workshop-nfv-diegoThe hague rina-workshop-nfv-diego
The hague rina-workshop-nfv-diego
 
The hague rina-workshop-mobility-eduard
The hague rina-workshop-mobility-eduardThe hague rina-workshop-mobility-eduard
The hague rina-workshop-mobility-eduard
 
The hague rina-workshop-welcome-miguel
The hague rina-workshop-welcome-miguelThe hague rina-workshop-welcome-miguel
The hague rina-workshop-welcome-miguel
 
The hageu rina-workshop-security-peter
The hageu rina-workshop-security-peterThe hageu rina-workshop-security-peter
The hageu rina-workshop-security-peter
 
Congestion Control in Recursive Network Architectures
Congestion Control in Recursive Network ArchitecturesCongestion Control in Recursive Network Architectures
Congestion Control in Recursive Network Architectures
 
Th hauge rina-workshop-sdn-virtualisation_neil
Th hauge rina-workshop-sdn-virtualisation_neilTh hauge rina-workshop-sdn-virtualisation_neil
Th hauge rina-workshop-sdn-virtualisation_neil
 
The hague rina-workshop-interop-deployment_vincenzo
The hague rina-workshop-interop-deployment_vincenzoThe hague rina-workshop-interop-deployment_vincenzo
The hague rina-workshop-interop-deployment_vincenzo
 
The hague rina-workshop-congestioncontrol-peyman
The hague rina-workshop-congestioncontrol-peymanThe hague rina-workshop-congestioncontrol-peyman
The hague rina-workshop-congestioncontrol-peyman
 
Rina sim workshop
Rina sim workshopRina sim workshop
Rina sim workshop
 
Pristine rina-security-icc-2016
Pristine rina-security-icc-2016Pristine rina-security-icc-2016
Pristine rina-security-icc-2016
 

Ähnlich wie IRATI @ RINA Workshop 2014, Dublin

Colt sdn-strategy-telesemana-diciembre-2013-javier-benitez-colt-final
Colt sdn-strategy-telesemana-diciembre-2013-javier-benitez-colt-finalColt sdn-strategy-telesemana-diciembre-2013-javier-benitez-colt-final
Colt sdn-strategy-telesemana-diciembre-2013-javier-benitez-colt-final
Rafael Junquera
 
SDN & NFV Introduction - Open Source Data Center Networking
SDN & NFV Introduction - Open Source Data Center NetworkingSDN & NFV Introduction - Open Source Data Center Networking
SDN & NFV Introduction - Open Source Data Center Networking
Thomas Graf
 

Ähnlich wie IRATI @ RINA Workshop 2014, Dublin (20)

Colt SDN Strategy - Telesemana December 2013
Colt SDN Strategy - Telesemana December 2013Colt SDN Strategy - Telesemana December 2013
Colt SDN Strategy - Telesemana December 2013
 
Colt sdn-strategy-telesemana-diciembre-2013-javier-benitez-colt-final
Colt sdn-strategy-telesemana-diciembre-2013-javier-benitez-colt-finalColt sdn-strategy-telesemana-diciembre-2013-javier-benitez-colt-final
Colt sdn-strategy-telesemana-diciembre-2013-javier-benitez-colt-final
 
SDN & NFV Introduction - Open Source Data Center Networking
SDN & NFV Introduction - Open Source Data Center NetworkingSDN & NFV Introduction - Open Source Data Center Networking
SDN & NFV Introduction - Open Source Data Center Networking
 
Network Virtualization & Software-defined Networking
Network Virtualization & Software-defined NetworkingNetwork Virtualization & Software-defined Networking
Network Virtualization & Software-defined Networking
 
SDN/NFV: Service Chaining
SDN/NFV: Service Chaining SDN/NFV: Service Chaining
SDN/NFV: Service Chaining
 
Colt VCPE and NFV at L123 SDN WC 2015
Colt VCPE and NFV at L123 SDN WC 2015Colt VCPE and NFV at L123 SDN WC 2015
Colt VCPE and NFV at L123 SDN WC 2015
 
Edge Computing: A Unified Infrastructure for all the Different Pieces
Edge Computing: A Unified Infrastructure for all the Different PiecesEdge Computing: A Unified Infrastructure for all the Different Pieces
Edge Computing: A Unified Infrastructure for all the Different Pieces
 
FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)
 
FD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingFD.IO Vector Packet Processing
FD.IO Vector Packet Processing
 
Generic network architecture discussion
Generic network architecture discussionGeneric network architecture discussion
Generic network architecture discussion
 
Colt sdn-and-nfv-experience-lernings-and-future-plans
Colt sdn-and-nfv-experience-lernings-and-future-plansColt sdn-and-nfv-experience-lernings-and-future-plans
Colt sdn-and-nfv-experience-lernings-and-future-plans
 
Edge Device Multi-unicasting for Video Streaming
IRATI @ RINA Workshop 2014, Dublin

  • 7. Project Outcomes
    • Enhanced RINA architecture reference model and specifications, contributed to the Pouzin Society for experimentation. IRATI will focus on advancing the RINA state of the art in the following areas:
      – DIFs over Ethernet
      – DIFs over TCP/UDP
      – DIFs for hypervisors
      – Routing
      – Data transfer
    • Linux OS kernel implementation of the RINA prototype over Ethernet
      – By the end of the project an open source community will be set up to allow the research/industrial networking community to use the prototype and/or contribute to its development
    • Experimental results of the RINA prototype, compared to TCP/IP
    • DIF over TCP/UDP extensions, interoperable with existing RINA prototypes 7
  • 8. Overview of the project structure 8
  • 9. Agenda • Project overview • Use cases – Basic scenarios (Phases 1 and 2) – Advanced scenarios (Phases 2 and 3) • Specifications – Shim DIF over 802.1Q – PDU Forwarding Table Generator – Y2 plans • Software development – High level software architecture – User-space – Kernel-space – Wrap-up • Experimental activities – Intro, goals, Y1 experimentation use case – Testbed and results at i2CAT OFELIA island – Testbed and results at iMinds OFELIA island – Conclusions 9
  • 11. Basic use cases Shim DIF over Ethernet • Goal: to ensure that the shim DIF over Ethernet provides the required functionality. The purpose of a Shim DIF is to provide a RINA interface to the capability of a legacy technology, rather than give the legacy technology the full capability of a RINA DIF. 11
  • 12. Basic use cases Turing machine DIF • Goal: to provide a testing scenario to check that a normal DIF complies with a minimal set of functionality (the “Turing machine” DIF). 12
  • 14. Advanced use cases Introduction • RINA applied to a hybrid cloud/network provider – Mixed offering of connectivity (Ethernet VPN, MPLS IP VPN, Ethernet Private Line, Internet Access) + computing (Virtual Data Center) [Diagram: datacenter design, access network and wide area network] 14
  • 15. Advanced use cases Modeling [Diagram: two customers, each with several sites (CE routers) attached via PE routers to an MPLS backbone; an Internet gateway towards the public Internet and end users; two data centers with ToR switches, hypervisors (HV) and VMs] 15
  • 16. Advanced use cases Enterprise VPN over operator’s network • Wide Area Network – Logical separation of customers through: MPLS encapsulation, BGP-based MPLS VPNs and Virtual Routing and Forwarding (VRF) • Access network – Use of Ethernet switching within metro-area networks – Logical separation of traffic belonging to multiple customers implemented through IEEE 802.1Q 16
  • 17. Advanced use cases Enterprise VPN over operator’s network: Applying RINA
    • Backbone DIF: provides the equivalent of the MPLS network. This DIF must be able to provide flows with “virtual circuit” characteristics, equivalent to MPLS LSPs.
    • Provider top-level DIF: provides IPC services to the different customers by connecting together the CE routers. The DIF may provide different levels of service, depending on the customer’s requirements. There may be one or more of these DIFs (one per customer, one for all the provider’s customers, etc.).
    • Intra customer-site DIFs: DIFs whose scope is a single customer site. Their characteristics will depend on the size and needs of the customer (e.g. a campus network, an enterprise network, etc.).
    • Customer A DIF: can provide connectivity to all the application processes within customer A’s organization. More specialized DIFs targeting concrete application types (e.g. voice, file transfer) could be created on top. 17
  • 18. Advanced use cases Hypervisor integration: With TCP/IP [Diagram: VMs attached through shared-memory vif interfaces to software bridges in the hypervisor machine, then via VLAN sub-interfaces (eth0.2 on VLAN 2, eth1.5 on VLAN 5) and a Top-of-Rack switch out of the data center] 18
  • 19. Advanced use cases Hypervisor integration: With RINA [Diagram: a green customer DIF spanning VMs on two hypervisors, floating on a shim DIF for hypervisors inside each machine and a shim DIF over 802.1Q through the ToR switch, out of the DC towards the customer VPN or Internet gateway] 19
  • 20. Advanced use cases VDC + Enterprise VPNs over the Internet: With TCP/IP [Diagram: green and blue customer premises, each with customer machines, a switch and a NAT/gateway border router, connected across the public Internet to the datacenter border router and the datacenter premises] 20
  • 21. Advanced use cases VDC + Enterprise VPNs over the Internet: With RINA [Diagram: the green customer DIF spans the VMs (over the shim DIF for hypervisors and a shim DIF over 802.1Q on VLAN 2 inside the datacenter), a shim DIF over TCP/UDP across the public Internet between the DC and customer border routers, and a shim DIF over 802.1Q on VLAN 10 in the customer premises] 21
  • 22. Agenda • Project overview • Use cases – Basic scenarios (Phases 1 and 2) – Advanced scenarios (Phases 2 and 3) • Specifications – Shim DIF over 802.1Q – PDU Forwarding Table Generator – Y2 plans • Software development – High level software architecture – User-space – Kernel-space – Wrap-up • Experimental activities – Intro, goals, Y1 experimentation use case – Testbed and results at i2CAT OFELIA island – Testbed and results at iMinds OFELIA island – Y2 plans 22
  • 23. SHIM DIF OVER 802.1Q 23
  • 24. Shim DIF over Ethernet General requirements • The task of a shim DIF is to put as small a veneer as possible over a legacy protocol to allow a RINA DIF to use it unchanged. • The shim DIF should provide no more service or capability than the legacy protocol provides. 24
  • 25. Examining the Ethernet Header • Ethernet II: specification released by DEC, Intel, Xerox (hence also called DIX Ethernet) | Preamble: 7 bytes | MAC dest: 6 bytes | MAC src: 6 bytes | 802.1Q header (optional): 4 bytes | Ethertype: 2 bytes | Payload: 42–1500 bytes | FCS: 4 bytes | Interframe gap: 12 bytes | 25
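For orientation, the software-visible part of the tagged frame can be sketched as a packed C struct (illustrative only; the kernel’s own views are struct ethhdr and struct vlan_ethhdr in <linux/if_ether.h> and <linux/if_vlan.h>):

    #include <stdint.h>

    /* Ethernet II header with an 802.1Q tag, matching the field list above. */
    struct eth_8021q_hdr {
            uint8_t  dst[6];      /* MAC destination                        */
            uint8_t  src[6];      /* MAC source                             */
            uint16_t tpid;        /* 0x8100: an 802.1Q tag follows          */
            uint16_t tci;         /* PCP (3 bits) | DEI (1 bit) | VID (12)  */
            uint16_t ethertype;   /* syntax of the encapsulated payload     */
    } __attribute__((packed));

    /* 18 bytes on the wire; preamble, FCS and interframe gap are handled
     * by the MAC/PHY, not by software. */
    _Static_assert(sizeof(struct eth_8021q_hdr) == 18, "unexpected padding");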
  • 26. Ethertype • Identifies the syntax of the encapsulated protocol • Layers below need to know the syntax of the layer above • Layer violation! 26
  • 27. Consequences of using an Ethertype • Also means only one flow can be distinguished between an address pair • The MAC address doubles as the connection endpoint-id 27
  • 28. Shim DIF over Ethernet Environment Investigating RINA as an Alternative to TCP/IP 28
  • 29. Address Resolution Protocol • Resolves a network address to a hardware address – Most ARP implementations do not conform to the standard – The shim IPC process assumes an RFC 826-compliant implementation 30
  • 30. Usage of ARP • Maps the application process name to a shim IPC Process address (MAC address) – The application process name is transformed into a network protocol address, e.g. process name “My_IPC_Process”, process instance “1”, entity name “Management”, entity instance “2” becomes “My_IPC_Process/1/Management/2” – Application registration adds an entry in the local ARP cache • A flow allocation request results in an ARP request/reply – Instantiates a MAC protocol machine equivalent of DTP (cf. Flow Allocator) IRATI - Investigating RINA as an Alternative to TCP/IP
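Since RFC 826 allows variable-length protocol addresses, the flattened name string itself can serve as the “network protocol address” in the ARP exchange. A minimal user-space sketch of that flattening (the helper names are invented for illustration, this is not the IRATI code):

    #include <stdio.h>
    #include <string.h>

    struct name_info {
            const char *process_name;
            const char *process_instance;
            const char *entity_name;
            const char *entity_instance;
    };

    /* Flatten the naming-information tuple into the string registered
     * as the ARP protocol address. Returns bytes written, -1 on overflow. */
    static int name_to_arp_paddr(const struct name_info *n,
                                 char *buf, size_t len)
    {
            int ret = snprintf(buf, len, "%s/%s/%s/%s",
                               n->process_name, n->process_instance,
                               n->entity_name, n->entity_instance);
            return (ret < 0 || (size_t) ret >= len) ? -1 : ret;
    }

    int main(void)
    {
            struct name_info n = { "My_IPC_Process", "1", "Management", "2" };
            char paddr[128];

            if (name_to_arp_paddr(&n, paddr, sizeof(paddr)) > 0)
                    printf("ARP protocol address: %s\n", paddr);
            return 0;
    }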
  • 32. PDU Forwarding Table Generator Requirements and general choices It’s all policy! • Every DIF can do it its own way • We start with a link-state routing approach 33
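As a concrete illustration of the link-state choice, the toy below (illustrative, not the IRATI policy code) runs Dijkstra over a flow-state database learned from neighbours and records, per destination, the N-1 port-id of the first hop; the real generator recomputes this whenever N-1 flow or neighbour events arrive (next slide):

    #include <stdio.h>

    #define MAX_NODES 8
    #define INF       0x7fffffff

    /* adj[i][j] > 0 means an N-1 flow exists between IPC processes i and j */
    static int adj[MAX_NODES][MAX_NODES];
    /* port[i] = local N-1 port-id towards direct neighbour i (or -1) */
    static int port[MAX_NODES];

    /* next_hop[d] receives the N-1 port to use for destination d */
    static void compute_pdu_fwd_table(int src, int n, int next_hop[])
    {
            int dist[MAX_NODES], first[MAX_NODES], done[MAX_NODES] = {0};
            for (int i = 0; i < n; i++) { dist[i] = INF; first[i] = -1; }
            dist[src] = 0;

            for (int it = 0; it < n; it++) {
                    int u = -1;
                    for (int i = 0; i < n; i++)
                            if (!done[i] && dist[i] < INF &&
                                (u < 0 || dist[i] < dist[u]))
                                    u = i;
                    if (u < 0) break;
                    done[u] = 1;
                    for (int v = 0; v < n; v++) {
                            if (!adj[u][v] || dist[u] + adj[u][v] >= dist[v])
                                    continue;
                            dist[v] = dist[u] + adj[u][v];
                            /* remember the first hop taken out of src */
                            first[v] = (u == src) ? v : first[u];
                    }
            }
            for (int d = 0; d < n; d++)
                    next_hop[d] = (first[d] >= 0) ? port[first[d]] : -1;
    }

    int main(void)
    {
            int next[MAX_NODES];
            /* line topology 0 -- 1 -- 2; node 1 reachable over port-id 10 */
            adj[0][1] = adj[1][0] = 1;
            adj[1][2] = adj[2][1] = 1;
            port[0] = port[2] = -1; port[1] = 10;
            compute_pdu_fwd_table(0, 3, next);
            printf("dest 2 via N-1 port %d\n", next[2]);  /* prints 10 */
            return 0;
    }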
  • 33. PDU Forwarding Table Generator High-level view and relationship to other IPC Process components [Diagram: inside the IPC Process, the PDU Forwarding Table Generator sits between the Resource Allocator, the Enrollment Task and the RIB Daemon. Events from the Resource Allocator (N-1 flow allocated / deallocated / down / up) and from enrollment (“enrollment completed successfully”, “neighbor B invoked write operation on object X”) drive it to update and propagate knowledge on N-1 flow state, recompute the forwarding table, and invoke write operations on objects at neighbors via CDAP over the layer-management N-1 flows; the Relaying and Multiplexing Task looks up the PDU Forwarding Table to select the output N-1 flow for each PDU] 34
  • 34. Plans for Year 2 • Shim DIF for Hypervisors – Enable communications between VMs in the same physical machine without using the networking subsystem • Updated shim DIF over TCP/UDP – Current version requires manual discovery of mappings of app names to IP addresses and TCP/UDP ports; investigate the use of DNS • Updated PDU Forwarding Table Generator – Based on lessons learned from implementation and experimentation • Feedback to EFCP – Based on implementation and experimentation experience • Faux sockets API 35
  • 35. Agenda • Project overview • Use cases – Basic scenarios (Phases 1 and 2) – Advanced scenarios (Phases 2 and 3) • Specifications – Shim DIF over 802.1Q – PDU Forwarding Table Generator – Y2 plans • Software development – High level software architecture – User-space – Kernel-space – Wrap-up • Experimental activities – Intro, goals, Y1 experimentation use case – Testbed and results at i2CAT OFELIA island – Testbed and results at iMinds OFELIA island – Y2 plans 36
  • 37. Project targets and timeline (SW) • IRATI SW goals: – Release 3 SW prototypes in 2 years – Each prototype provides incremental functionalities • 1st prototype: basic functionalities (unreliable flows) – Comparable to UDP/IP • 2nd prototype: “complete” stack (reliable flows + routing) – Comparable to TCP/IP • 3rd prototype: enhancements (hardened proto + RINA over IP + …) – More product-like than prototype-like – Glancing at extendibility, portability, performance & usability • The SW components lie at both kernel & user spaces Investigating RINA as an Alternative to TCP/IP 38
  • 38. Problems … • Problems are mostly SW-engineering related – Time constrained: 1. Ref-specs → HL arch 2. HL arch → detailed design 3. Detailed design → implementation, debug, integration … • Since the IRATI stack spans user and kernel spaces… • User-space problems (as usual): – Memory (e.g. corruptions, leaks) – Bad logic (e.g. faults) – Concurrency (e.g. dead-locks, starvation) – … – Nothing that special (but … time consuming for sure) Investigating RINA as an Alternative to TCP/IP 39
  • 39. … and problems • Kernel-space problems are the user-space ones PLUS: – A harsher environment, e.g. • The develop, install & test cycle is (a lot) slower – Huge code-base (takes a long time to compile) – Faults in the kernel code may bring the whole host down – Reboots are usually required to test a new “version” (at early stages) • C is “the” language → less expressive than others in userland • No “external libraries” … – The kernel is “cooperative”, e.g. • Stack & heap handling must be “careful”, e.g. – Memory corruptions could propagate everywhere – Different mechanics, e.g. • Mutexes, semaphores, spinlocks, RCUs … coupled with un-interruptible sleeps – Syscalls may sleep … but spinlocks can’t be held while “sleeping” • No recursive locking • Memory allocation comes in different flavours: NOWAIT, NOIO, NOFS … – … Investigating RINA as an Alternative to TCP/IP 40
  • 40. Outline • Introduction • High level software architecture • Detailed software architecture – Kernel space – User space • Wrap-up Investigating RINA as an Alternative to TCP/IP 41
  • 41. Splitting the spaces: user vs kernel • Fast/slow paths → user vs kernel • We split the “design” in different “lanes” and placed SW components there, depending on their timing requirements – Fast-path → stringent timings → kernel-space – Slow-path → loose timings → user-space • ... looking for our optimum – fiddling with time/easiness/cost/problems/schedule/final-solution etc. Investigating RINA as an Alternative to TCP/IP 43
  • 42. API & kernel • OS processes request services from the kernel with syscalls – User originated (user → kernel) – Unicast • Modern *NIX systems extend the user/kernel communication mechanisms – Netlink, uevent, devfs, procfs, sysfs etc. • We wanted a “bus-like” mechanism: 1:1/N:1, user/kernel & user/user – User OR kernel originated – Multicast/broadcast • We adopted syscalls and Netlink – Syscalls (fast-path): • Bootstrapping & SDU R/W (*) – Netlink (mostly slow-path): • We introduced a RINA “family” and its related messages [Diagram: M applications, N IPC Process daemons and the IPC Manager daemon in user space above the kernel boundary] (*) Bootstrapping needs: syscalls create kernel components which will be using Netlink functionalities later on Investigating RINA as an Alternative to TCP/IP 44
  • 43. Introducing librina • Syscalls are “wrapped” by libc (kernel abstraction) – i.e. syscall(SYS_write, …) → write(…) – glibc on a Linux OS • Changes to the syscalls → changes to glibc – Breaking glibc could break the whole host • Sandboxed environments are necessary – Dependencies invalidation → time consuming compilations – Those sorts of changes are really hard to get approved upstream – etc. • We introduced librina as the initial way to overcome these problems … – … use IRATI on a host without breaking the whole system Investigating RINA as an Alternative to TCP/IP 45
  • 44. librina • It is more a framework/middleware than a library – It has explicit memory allocation (no garbage collection) – It’s event-based – It’s threaded • Completely abstracts the interactions with the kernel – syscalls and Netlink • Adds functionalities upon them • Provides them to userland (apps & daemons) – Static/dynamic linking (i.e. for C/C++ programs) – Scripting language extensions (e.g. Java) Investigating RINA as an Alternative to TCP/IP 46
  • 45. librina interface • librina contains a set of “components”: – Internal components – External components • And a portable framework to build components on top, e.g.: – Patterns: e.g. singletons, observers, factories, reactors – Concurrency: e.g. threads, mutexes, semaphores, condition variables – High level “objects” in its core • FlowSpecification, QoSCube, RIBObject etc. • Only the “external” components are “exported” as classes Investigating RINA as an Alternative to TCP/IP 47
  • 46. librina core (HL) SW architecture [Diagram: applications use the librina API (common, cdap, faux-sockets, sdu-protection, ipc-process, ipc-manager and application modules over a shared framework) through eventPoll()/eventWait()/eventPost(); the core components comprise an event queue, the RINA Manager, the NetlinkManager with its NetlinkSessions (nl_send()/nl_recv() over libnl/libnl_genl) and the syscall wrappers (syscall(SYS_*)), talking to RINA Netlink and the RINA syscalls in the kernel. Services offered: allocate/deallocate flows, read/write SDUs to flows, register/unregister to one or more DIFs, create/delete EFCP instances, configure the PDU Forwarding Table, allocate kernel resources to support a flow] 50
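librina itself is C++, but the event-driven usage pattern it exposes is easy to show as a small, self-contained C mock-up. Everything here (rina_event_wait(), the event enum, the canned event script) is an illustrative stand-in, not the real librina API:

    #include <stdbool.h>
    #include <stdio.h>

    enum ev_type { EV_FLOW_ALLOCATED, EV_SDU_ARRIVED, EV_FLOW_DEALLOCATED };
    struct ev { enum ev_type type; int port_id; };

    /* Stand-in for the blocking eventWait(): replays a canned sequence
     * instead of popping a real event queue fed by Netlink responses. */
    static struct ev rina_event_wait(void)
    {
            static const struct ev script[] = {
                    { EV_FLOW_ALLOCATED,   1 },
                    { EV_SDU_ARRIVED,      1 },
                    { EV_FLOW_DEALLOCATED, 1 },
            };
            static int i;
            return script[i++];
    }

    int main(void)
    {
            bool running = true;

            while (running) {
                    struct ev e = rina_event_wait();   /* blocks on the queue */
                    switch (e.type) {
                    case EV_FLOW_ALLOCATED:
                            printf("flow ready on port %d\n", e.port_id);
                            break;
                    case EV_SDU_ARRIVED:
                            /* fast path: the SDU itself is fetched via the
                             * sdu_read syscall, not via Netlink */
                            printf("SDU readable on port %d\n", e.port_id);
                            break;
                    case EV_FLOW_DEALLOCATED:
                            running = false;
                            break;
                    }
            }
            return 0;
    }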
  • 47. How to RAD, effectively ? • OO was the “natural” way to represent the RINA entities • We embraced C++ as the “core” language for librina: – Careful usage produces binaries comparable to C – The STL reduces the dependencies • in the plain C vs plain C++ case – Producing C bindings is possible – … … • There was the ALBA prototype already working … • … and ALBA has RINABand … • BUT that prototype is Java based … Investigating RINA as an Alternative to TCP/IP 51
  • 48. Interfacing librina to other languages • We adopted SWIG: the Simplified Wrapper and Interface Generator • SWIG “automatically” generates all the code needed to connect C/C++ programs to scripting languages – such as Python, Java and many, many others … [Example toolchain: example.h declares “int fact(int n);”, example.c implements it, and the interface file example.i reads “%module example %{ #include "example.h" %} int fact(int n);”. SWIG turns example.i into a low-level wrapper (example_wrap.c) and a high-level wrapper (example.py); GCC builds the native library (libexample.so) that Python loads through its native interface] Investigating RINA as an Alternative to TCP/IP 52
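For reference, the typical invocation for the toolchain sketched above is `swig -python example.i`, which emits example_wrap.c and example.py; compiling example.c and example_wrap.c against the Python headers into a shared object (named _example.so by SWIG's Python convention) then lets a plain `import example` reach the C function. Exact compiler flags and include paths vary per platform.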
  • 49. librina wrapping • Wrapping “cost”: – The wrappers (.i files) are small: ~480 LOCs – They produce ~13.5 KLOCs of bindings → ~1/28 ratio … • The wrappers are the only thing needed to obtain the bindings for a scripting language – SWIG support varies with the target language, i.e. • Java: so-so (not all data-types mapped natively) • Python: good • … – Our wrappers contain only the missing data-type mappings for Java • Java interface = C++ interface • Bindings for other languages (i.e. Python) are expected to be straightforward Investigating RINA as an Alternative to TCP/IP 53
  • 50. High level software architecture [Diagram: applications (RINABand HL/LL, ipcpd, ipcmd, rinad in Java, third-party SW packages in language X) sit on the SWIG high-level wrappers (Java via JNI, language X via its native interface), which sit on the SWIG low-level C++ wrappers, which call the librina API (C and C++ flavours, statically/dynamically linked) and its C++ core; the core reaches the kernel through libnl/libnl-gen (Netlink) and syscalls] Investigating RINA as an Alternative to TCP/IP 54
  • 52. The Linux object model • Linux has its “generic” object abstraction: kobject, kref and kset

    struct kref {                          /* reference counting (explicit),  */
            atomic_t refcount;             /* garbage collection & sysfs      */
    };

    struct kobject {                       /* naming & sysfs */
            const char          *name;
            struct list_head     entry;
            struct kobject      *parent;   /* (dynamic) [re-]parenting        */
            struct kset         *kset;     /* objects grouping                */
            struct kobj_type    *ktype;    /* (loosely typed)                 */
            struct sysfs_dirent *sd;       /* sysfs integration               */
            struct kref          kref;
            unsigned int state_initialized:1;
            unsigned int state_in_sysfs:1;
            unsigned int state_add_uevent_sent:1;
            unsigned int state_remove_uevent_sent:1;
            unsigned int uevent_suppress:1;
    };

    struct kset {                          /* objects grouping */
            struct list_head              list;
            spinlock_t                    klist_lock;
            struct kobject                kobj;
            const struct kset_uevent_ops *uevent_ops;
    };

• Generic enough to be applied “everywhere” – e.g. FS, HW subsystems, device drivers Investigating RINA as an Alternative to TCP/IP 56
  • 53. kobjects, ksets and krefs in IRATI • They are the way to go for embracing OOD/OOP kernel-wide • If the design has a “limited scope” the code gets bloated by: – Ancillary functions & data structures – (unnecessary) Resources usage • We don’t need/want all these functionalities (everywhere): – Reduced (finite) number of classes • We don’t have the needs of a “generic kernel” – Reduced concurrency (can be missing, depending on the object) – Object parenting is “fixed” (obj x is always bound to obj y) • E.g. DTP/DTCP are bound to EFCP … – Not all our objects have to be published into sysfs – We have different lookup requirements • No need to “look up by name” every object – Inter-object bindings shouldn’t lose the object’s type – … Investigating RINA as an Alternative to TCP/IP 57
  • 54. Our OOP/OOD approach • We adopted a (slightly) different OOD/OOP approach • (almost) Each “entity” in the stack is an “object” • All our “objects” provide a basic common interface & behavior • They have no implicit embedded locking semantics

    struct object_t { … };                              /* API opaque */

    struct obj_ops_t {                                  /* vtable (if needed) */
            result_x_t (* method_1)(object_t * o, …);
            …
            result_y_t (* method_n)(object_t * o, …);
    };

    int  obj_init(object_t * o, …);                     /* static  */
    void obj_fini(object_t * o);

    object_t * obj_create(…);                           /* interruptible ctxt     */
    object_t * obj_create_ni(…);                        /* non-interruptible ctxt */
    int  obj_destroy(object_t * o);                     /* dynamic */

    int  obj_<method_1>(object_t * o, …);               /* vtable proxy (if needed) */
    ...
    int  obj_<method_n>(object_t * o, …);

Investigating RINA as an Alternative to TCP/IP 58
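To make the pattern concrete, here is a self-contained user-space toy following the same shape (illustrative, not IRATI code; in the kernel, kmalloc() with the appropriate GFP flags would replace malloc()):

    #include <stdio.h>
    #include <stdlib.h>

    typedef int result_t;

    struct object_t;
    struct obj_ops_t {
            result_t (*write)(struct object_t *o, const char *sdu);
    };

    struct object_t {
            const struct obj_ops_t *ops;  /* vtable, fixed at creation */
            int                     port; /* instance data             */
    };

    static result_t obj_write_impl(struct object_t *o, const char *sdu)
    {
            printf("port %d <- %s\n", o->port, sdu);
            return 0;
    }

    static const struct obj_ops_t default_ops = { .write = obj_write_impl };

    struct object_t *obj_create(int port)
    {
            struct object_t *o = malloc(sizeof(*o));
            if (o) { o->ops = &default_ops; o->port = port; }
            return o;
    }

    int obj_destroy(struct object_t *o) { free(o); return 0; }

    /* vtable proxy: type-checked, no casts, no implicit locking */
    result_t obj_write(struct object_t *o, const char *sdu)
    {
            return o->ops->write(o, sdu);
    }

    int main(void)
    {
            struct object_t *o = obj_create(42);
            obj_write(o, "hello");
            return obj_destroy(o);
    }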
  • 55. OOD/OOP & the framework • This approach: – Reduces the stack’s (overall) bloating • no krefs, spinlocks, sysfs etc. where unnecessary • Only objects requiring sysfs, debugfs and/or uevents embed a kobject – (or it is comparable) • E.g. the same bloating related to _init, _fini, _create and _destroy – Speeds up the developments – Helps debugging • (re-)Parenting is constrained to specific objects • No loose typing → type-checking is maintained (no casts) – Decouples (mildly) from the underlying kernel • With these assumptions we built our framework – Basic components: robj, rmem, rqueue, rfifo, rref, rtimer, rwq, rmap, rbmp – OOP facilities/patterns: factories, singletons, facades, observers, flyweights, publisher/subscribers, smart pointers, etc. – Ownership-passing + smart-pointing memory model Investigating RINA as an Alternative to TCP/IP 59
  • 56. The HL software architecture (Y1) [Diagram: in user space, rinad (RINABand HL, ipcpd, ipcmd, third-party SW packages) over the SWIG HL/LL wrappers and librina (framework, C and C++ APIs, C++ core) over libnl/libnl-gen and syscalls; in kernel space, the personality mux/demux, the KIPCM core, RNL, the IPCP factories and framework, the KFA, the normal IPC Process (PFT, RMT, EFCP), shim-eth-vlan, shim-dummy and RINA-ARP] Investigating RINA as an Alternative to TCP/IP 62
  • 57. The API exposed to user-space: KIPCM + RNL • Kernel interface = syscalls + Netlink messages • KIPCM: – Manages the syscalls • Syscalls: a small, well-defined set of calls (8): – IPCs: ipc_create and ipc_destroy – Flows: allocate_port and deallocate_port – SDUs: sdu_read, sdu_write, mgmt_sdu_read and mgmt_sdu_write • RNL: – Manages the Netlink part • Abstracts message reception, sending, parsing & crafting • Netlink: 36 message types (with dynamic attributes): – assign_to_dif_req, assign_to_dif_resp, dif_reg_notif, dif_unreg_notif … • Partitioning: – Syscalls → KIPCM → “fast-path” (read and write SDUs) – Netlink → RNL → “slow-path” (mostly conf and mgmt) Investigating RINA as an Alternative to TCP/IP 63
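A minimal sketch of what the fast-path side of such a wrapper looks like from user space; the syscall number and exact prototype below are invented for illustration (the real ones are assigned by the IRATI kernel patch, and on a stock kernel the call simply fails with ENOSYS):

    #define _GNU_SOURCE
    #include <sys/syscall.h>
    #include <unistd.h>

    #define __NR_sdu_write 440   /* illustrative, kernel-patch dependent */

    /* Same wrapping idea as libc's write(): hide raw syscall() behind
     * a typed function the rest of librina can call. */
    static inline long sdu_write(int port_id, const void *sdu, unsigned size)
    {
            return syscall(__NR_sdu_write, port_id, sdu, size);
    }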
  • 58. KIPCM & KFA • The KIPCM: – Counterpart of the IPC Manager in user-space – Manages the lifecycle of the IPC Processes and the KFA – Abstracts IPC Process instances • Same API for all the IPC Processes regardless of the type • Maps: ipc-process-id → ipc-process-instance • The KFA: – Manages ports and flows • Ports: flow handler and ID, Port ID Manager • Flows: maps port-id → ipc-process-instance – Top: user interface; bottom: IPC processes (maps) • Both “bind” the kernel stack: syscalls and Netlink • When KIPCM calls KFA to inject/get SDUs: – N-IPCP → EFCP → RMT → PDU-FWD → shim IPC Process • They are the initial point where “recursion” is transformed into “iteration” Investigating RINA as an Alternative to TCP/IP 64
  • 59. The RINA Netlink Layer (RNL) • Integrates Netlink in the SW framework – Hides all the configuration, generation and destruction of Netlink sockets and messages from the user – Defines a Generic Netlink family (NETLINK_RINA) and its messages Investigating RINA as an Alternative to TCP/IP 66
  • 60. The IPC Process Factories • They are used by IPC Processes to publish/unpublish their availability – Publish: • x = kipcm_ipcp_factory_register(…, char * name, …) – Unpublish: • kipcm_ipcp_factory_unregister(x) • The factory name is the way KIPCM can look for a specific IPC Process type – It’s published into sysfs too • There are two “major” types of IPC Processes : – Normal – Shims Investigating RINA as an Alternative to TCP/IP 67
  • 61. The IPC Process Factories Interface • Factory operations are the same for both types • Upon registration – A factory publishes its hooks: .init → x_init, .fini → x_fini, .create → x_create, .destroy → x_destroy, .configure → x_configure • Upon user request (ipc_create) – The KIPCM creates a particular IPC Process instance: 1. Looks for the correct factory (by name) 2. Calls the .create “method” 3. The factory returns a “compliant” IPC Process object 4. Binds that object into its data model • Upon un-registration – The factory triggers the “destruction” of all the IPC Processes it “owns” Investigating RINA as an Alternative to TCP/IP 68
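A self-contained toy version of the publish/lookup-by-name mechanism (illustrative: the real kipcm_ipcp_factory_register() lives in the kernel, takes the KIPCM instance plus factory data, and the ops set is richer than this):

    #include <stdio.h>
    #include <string.h>

    struct ipcp_factory_ops {
            int  (*init)(void);
            void (*fini)(void);
    };

    struct ipcp_factory {
            const char                    *name;
            const struct ipcp_factory_ops *ops;
    };

    static struct ipcp_factory registry[8];
    static int nfactories;

    /* Publish: makes the IPC process type discoverable by name */
    int kipcm_ipcp_factory_register(const char *name,
                                    const struct ipcp_factory_ops *ops)
    {
            if (nfactories == 8) return -1;
            registry[nfactories].name = name;
            registry[nfactories].ops  = ops;
            nfactories++;
            return ops->init();
    }

    /* Lookup-by-name, as the KIPCM does on ipc_create */
    const struct ipcp_factory *kipcm_factory_find(const char *name)
    {
            for (int i = 0; i < nfactories; i++)
                    if (!strcmp(registry[i].name, name))
                            return &registry[i];
            return NULL;   /* unknown IPC process type */
    }

    static int  shim_init(void) { puts("shim-eth-vlan ready"); return 0; }
    static void shim_fini(void) { }
    static const struct ipcp_factory_ops shim_ops = { shim_init, shim_fini };

    int main(void)
    {
            kipcm_ipcp_factory_register("shim-eth-vlan", &shim_ops);
            return kipcm_factory_find("shim-eth-vlan") ? 0 : 1;
    }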
  • 62. IPC Process Instances • The .create provided to the factories returns an IPC Process “object” • There are two “major” types of IPC Processes: – Normal – Shims • Regardless of the type – The interface is the same – Each IPC Process implements its “core” code: • Shim IPC Process: each shim IPC Process provides its own implementation • Normal IPC Process: the stack provides an implementation for all of them Investigating RINA as an Alternative to TCP/IP 69
  • 63. IPC Process Instances Interface • The IPC Process “object”: instance_data + instance_ops • The IPC Process interface is the same for all types, but each type decides which ops it will support – Some are specific to normal or shim, a few are common to both • Example shim instance_ops: .application_register = x_application_register, .application_unregister = x_application_unregister, .assign_to_dif = x_assign_to_dif, .sdu_write = x_sdu_write, .flow_allocate_request = shim_allocate_request, .flow_allocate_response = shim_allocate_response, .flow_deallocate = shim_deallocate • Example normal instance_ops: .connection_create = normal_connection_create, .connection_update = normal_connection_update, .connection_destroy = normal_connection_destroy, .connection_create_arrived = normal_connection_arrived, .pft_add = normal_pft_add, .pft_remove = normal_pft_remove, .pft_dump = normal_pft_dump – They support similar functionalities (except the PFT’s) – How they translate into ops depends on the type Investigating RINA as an Alternative to TCP/IP 70
  • 64. Write operation [Call chain, user space to wire: the application invokes sys_sdu_write(sdu, app2); the KIPCM maps the port-id and calls kipcm_sdu_write(sdu, app2) → kfa_flow_sdu_write(sdu, app2); inside IPCP 2: normal_write(sdu, app2) → efcp_container_write(sdu, 2i) → efcp_write(sdu) → dtp_write(sdu) → rmt_send(pdu); the RMT writes to the N-1 port with kfa_flow_sdu_write(sdu*, 21), so the same chain repeats inside IPCP 1 (normal_write(sdu*, 21) → efcp_container_write(sdu*, 1j) → efcp_write(sdu*) → dtp_write(sdu*) → rmt_send(pdu*)); finally kfa_flow_sdu_write(sdu**, 10) reaches IPCP 0, a shim, which calls shim_write(sdu**, 21) towards the device]
  • 65. Read operation [Call chain, wire to user space: the shim (IPCP 0) posts an incoming SDU with kfa_sdu_post(sdu**, 10); IPCP 1 picks it up via rmt_receive(sdu**, 10) → efcp_container_receive(pdu*, 1j) → efcp_receive(pdu*) → dtp_receive(pdu*) → kfa_sdu_post(sdu*, 21); IPCP 2 repeats the chain (rmt_receive(sdu*, 21) → efcp_container_receive(pdu, 2i) → efcp_receive(pdu) → dtp_receive(pdu)) and finally kfa_sdu_post(sdu, app2) satisfies the application’s sys_sdu_read(app2) through kipcm_sdu_read(app2) and kfa_flow_sdu_read(app2)]
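The “recursion transformed into iteration” remark from slide 58 is visible in both chains above; a self-contained toy rendering of it on the write path (illustrative types and names, not the IRATI code):

    #include <stdio.h>

    enum ipcp_type { IPCP_NORMAL, IPCP_SHIM };

    struct ipcp {
            enum ipcp_type type;
            struct ipcp   *below;   /* IPC process serving the N-1 flow */
            const char    *name;
    };

    static void sdu_write(struct ipcp *ipcp, const char *sdu)
    {
            /* Instead of each IPC process recursively calling the next,
             * a loop walks down: EFCP/RMT processing, then one level down. */
            while (ipcp->type == IPCP_NORMAL) {
                    printf("%s: EFCP -> RMT -> N-1 port\n", ipcp->name);
                    ipcp = ipcp->below;
            }
            printf("%s: shim writes \"%s\" to the device\n", ipcp->name, sdu);
    }

    int main(void)
    {
            struct ipcp shim   = { IPCP_SHIM,   NULL,    "IPCP0 (shim-eth-vlan)" };
            struct ipcp middle = { IPCP_NORMAL, &shim,   "IPCP1 (normal)" };
            struct ipcp top    = { IPCP_NORMAL, &middle, "IPCP2 (normal)" };

            sdu_write(&top, "hello");
            return 0;
    }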
  • 66. Shim IPC Processes • The shims are the “lowest” components in the kernel space • They have two interfaces: – NB (northbound): the same for each shim, represented by hooks published into KIPCM factories – SB (southbound): depends on the technology • There are currently 2 shims: – shim-dummy: • Confined to a single host (“loopback”) • Used for debugging & testing the stack – shim-eth-vlan: • As defined in the spec, runs over 802.1Q Investigating RINA as an Alternative to TCP/IP 73
  • 67. Shim-dummy [Diagram: the IPC Process and IPC Manager daemons in user space drive the dummy shim IPC Process in the kernel through the KIPCM/KFA (shim_dummy_create / shim_dummy_destroy) and the RINA IPC API; everything stays inside the host] IRATI - Investigating RINA as an Alternative to TCP/IP
  • 68. Shim-eth-vlan [Diagram: the IPC Process and IPC Manager daemons in user space drive the shim IPC Process over 802.1Q in the kernel through the KIPCM/KFA (shim_eth_create / shim_eth_destroy) and the RINA IPC API; the shim uses RINARP (rinarp_add, rinarp_remove, rinarp_resolve) for name/address resolution and exchanges frames with the devices layer via dev_queue_xmit on transmit and shim_eth_rcv on receive] IRATI - Investigating RINA as an Alternative to TCP/IP
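A condensed sketch of what the transmit side of such a shim can look like as kernel code (illustrative: the real shim-eth-vlan also performs ARP resolution via rinarp_resolve(), flow-state checks and VLAN handling through the 802.1Q net device; the Ethertype constant below is an invented placeholder — cf. the phase-I conclusion about not reusing existing Ethertypes):

    #include <linux/netdevice.h>
    #include <linux/skbuff.h>
    #include <linux/if_ether.h>

    #define ETH_P_RINA 0xD1F0   /* placeholder Ethertype for illustration */

    static int shim_eth_sdu_write(struct net_device *dev,
                                  const unsigned char *dest_mac,
                                  const void *sdu, size_t len)
    {
            struct sk_buff *skb;

            skb = alloc_skb(len + dev->hard_header_len, GFP_ATOMIC);
            if (!skb)
                    return -ENOMEM;

            skb_reserve(skb, dev->hard_header_len);
            memcpy(skb_put(skb, len), sdu, len);     /* payload = the SDU */

            skb->dev      = dev;
            skb->protocol = htons(ETH_P_RINA);
            dev_hard_header(skb, dev, ETH_P_RINA,
                            dest_mac, dev->dev_addr, skb->len);

            return dev_queue_xmit(skb);              /* hand to devices layer */
    }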
  • 71. Introduction to the user space framework [Diagram: the IPC Manager Daemon (main logic, IDD, RIB & RIB Daemon, management agent), the IPC Process Daemons (layer management: enrollment, PDU forwarding table generation, flow allocation, resource allocation, RIB & RIB Daemon) and the applications (application logic), each linked against librina, talk to each other over Netlink sockets and to the kernel via system calls, Netlink sockets and sysfs] • IPC Manager Daemon: broker between apps & IPC Processes, central point of management in the system • IPC Process Daemon: implements the layer management components of an IPC Process • librina: abstracts out the communication details between daemons and the kernel 79
  • 72. Librina software architecture [Diagram: the C++ API exposes proxy, model and event classes plus the EventProducer; the C++ core contains the events queue, concurrency classes, a message-reader thread, the NetlinkManager with its Netlink message parsers/formatters (over libnl/libnl-gen), the syscall wrappers and the logging framework, all sitting on libpthread above the kernel boundary] 80
  • 73. The IPC Process and IPC Manager Daemons • IPC Manager Daemon – Manages the IPC Process lifecycle – Broker between applications and IPC Processes – Local management agent – DIF Allocator client (to search for applications not available through local DIFs) • IPC Process Daemon – Layer management components of the IPC Process: • RIB Daemon and RIB • CDAP parsers/generators • CACEP • Enrollment • Flow Allocation • Resource Allocation • PDU Forwarding Table Generation • Security Management 81
  • 74. IPC Manager Daemon [Diagram: the Java IPC Manager Daemon runs a main event loop on librina’s EventProducer.eventWait(); events are dispatched to the IPC Manager core classes (IPC Process Manager, Flow Manager, Application Registration Manager), which call the librina proxy classes (IPC Process Factory, IPC Process, Application Manager) through the SWIG/JNI wrappers, down to system calls and Netlink messages; a Command Line Interface server thread accepts CLI sessions over a local TCP connection and calls operations on the core classes; a bootstrapper reads the configuration file at startup] 83
  • 75. IPC Process Daemon [Diagram: the Java IPC Process Daemon combines supporting classes (delimiter, CDAP parser, encoder) with the layer management function classes (Enrollment Task, Flow Allocator, Resource Allocator, Registration Manager, Forwarding Table Generator) around the RIB Daemon and the Resource Information Base (RIB); a CDAP message-reader thread and the main event loop sit on librina (RIBDaemon.sendCDAPMessage(), RIBDaemon.cdapMessageReceived(), KernelIPCProcess.writeMgmtSDU()/readMgmtSDU(), EventProducer.eventWait()) through the SWIG/JNI wrappers, down to system calls and Netlink messages towards the kernel IPC Process and the IPC Manager] 85
  • 76. Example workflow: IPC Process creation • The IPC Manager reads a configuration file with instructions on the IPC Processes it has to create at startup – Or the system administrator can request creation through the local console • The configuration file also instructs the IPC Manager to register the IPC Process in one or more N-1 DIFs, and to make it a member of a DIF. The steps: 1. Create IPC Process (syscall) 2. Fork (syscall) 3. Initialize librina 4. When completed, notify the IPC Manager (NL) 5. IPC Process initialized (NL) 6. Register app request (NL) 7. Register app response (NL) 8. Notify IPC Process registered (NL) 9. Assign to DIF request (NL) 10. Update state and forward to kernel (NL) 11. Assign to DIF request (NL) 12. Assign to DIF response (NL) 13. Assign to DIF response (NL) 86
  • 77. Example workflow: Flow allocation • An application requests a flow to another application, without specifying what DIF to use: 1. Allocate Flow Request (NL) 2. Check app permissions 3. Decide what DIF to use 4. Forward request to the adequate IPC Process Daemon 5. Allocate Flow Request (NL) 6. Request port-id (syscall) 7. Create connection request (NL) 8. On create connection response (NL), write CDAP message to N-1 port (syscall) 9. On getting an incoming CDAP message response (syscall), update connection (NL) 10. On getting update connection response (NL), reply to IPC Manager (NL) 11. Allocate Flow Request Result (NL) 12. Forward response to app 13. Allocate Flow Request Result (NL) 14. Read data from the flow (syscall) or write data to the flow (syscall) 87
  • 79. Y1: Where we are / What we have… • 9 months, ~3700 commits and ~214 KLOCs later … – ~27 KLOCs in the kernel – ~87 KLOCs in librina (hand-written) – ~35 KLOCs in librina (automatically generated) – ~65 KLOCs in rinad • … the project released its 1st prototype (internal release): – User and kernel space components providing unreliable flow functionalities – We have the building | configuration | development frameworks – A testing framework • A testing application (RINABand, compilation-time) • A regression framework (ad-hoc, run-time) • We’re actively working on the 2nd prototype Investigating RINA as an Alternative to TCP/IP 89
  • 80. Y2: Plans … • Prototype 2: – Reliable flows support – Shim DIF for HV • Same schema as shim-dummy/shim-eth-vlan in prototype 1 – Complete routing – Public release as FOSS (July 2014) • Prototype 3: – Shim DIF over TCP/UDP • Same schema as prototype 2 – Faux sockets API via: 1. FI: function interposition (dynamic linking) 2. SCI: system call interposition (static linking) Investigating RINA as an Alternative to TCP/IP 90
  • 81. Agenda • Project overview • Use cases – Basic scenarios (Phases 1 and 2) – Advanced scenarios (Phases 2 and 3) • Specifications – Shim DIF over 802.1Q – PDU Forwarding Table Generator – Y2 plans • Software development – High level software architecture – User-space – Kernel-space – Wrap-up • Experimental activities – Intro, goals, Y1 experimentation use case – Testbed and results at i2CAT OFELIA island – Testbed and results at iMinds OFELIA island – Conclusions 92
  • 82. IRATI EXPERIMENTATION GOALS Investigating RINA as an Alternative to TCP/IP 93
  • 84. IRATI experimentation in a nutshell [Diagram: the three experimentation phases (I–III) mapped onto the testbeds used in each: the OFELIA facility (EXPERIMENTA island), the iLab.t Virtual Wall, and PSOC] Investigating RINA as an Alternative to TCP/IP 95
  • 85. PROTOTYPE STATUS AND TOOLS Investigating RINA as an Alternative to TCP/IP 96
  • 86. Available Tools • RINABand – Test application for RINA – Java (user space) – Requires multiple flows between two application processes: 1 control flow + N data flows (control AE and data AE on each side, across the DIF) • Echo server/client – Ping-like operation – Test parameters: number and size of SDUs to be sent – The test completes when either all the SDUs have been sent and received, or when more than a certain interval of time elapses without receiving an SDU – Client and server report statistics: • the number of transmitted and received SDUs • the time the test lasted – Single flow between two application processes Investigating RINA as an Alternative to TCP/IP 97
  • 87. First Phase Prototype capabilities • Capabilities – Decision to focus on the shim-eth-vlan – Supports only a single flow between two application processes [Figure: Ethernet II frame format, as on slide 25] • Impact on experiments – Could not use RINABand – Rely on the echo server/client application Investigating RINA as an Alternative to TCP/IP 98
  • 88. FIRST PHASE EXPERIMENTS Investigating RINA as an Alternative to TCP/IP 99
  • 89. First phase use case Investigating RINA as an Alternative to TCP/IP 100
  • 90. Single flow echo/bw test • Validate stack / Prototype 1 • Validate Ethernet transparency • Measure goodput Investigating RINA as an Alternative to TCP/IP 101
 
  • 91. Multiple flow echo/bw validation • Validate multiple IPC processes • Measure goodput Investigating RINA as an Alternative to TCP/IP 102
 
  • 92. Concurrent RINA and IP • Validate concurrency of the IP and RINA stacks • Measure goodput Investigating RINA as an Alternative to TCP/IP 103
  • 93. FIRST PHASE RESULTS @ i2CAT (presented by Leonardo Bergesio) Investigating RINA as an Alternative to TCP/IP 104
  • 94. i2CAT OFELIA Island, EXPERIMENTA • Experiment == slice • FlowSpace: – Arbitrary topology – Partition of the vectorial space of OF header fields – Slicing by VLANs • VMs to be used as end points or controllers • Perfect match: – SLICE → VLAN → Shim DIF over Ethernet Investigating RINA as an Alternative to TCP/IP 105
  • 95. Workflow I • Access island using OCF. Create or access your project/slice Investigating RINA as an Alternative to TCP/IP 106
  • 96. Workflow II • Select FlowSpace Topology and slice VLAN/s (DIFs) Investigating RINA as an Alternative to TCP/IP 107
  • 97. Workflow III • Create VMs → nodes and OpenFlow controller Investigating RINA as an Alternative to TCP/IP 108
  • 98. Resources Mapping • Slice with two VLAN ids, one per DIF: 300, 301 Investigating RINA as an Alternative to TCP/IP 109
  • 99. Single flow • Packets are sent over the Ethernet/VLAN bridge • Goodput roughly 60% of link capacity (iperf tested) Project: IRATIbasicusecase, Slice: multivlanslice Investigating RINA as an Alternative to TCP/IP 111
  • 100. Multiple flows • Flows to a shared server (B & C to D) achieved half the throughput of the single flow (A to B) Project: IRATIbasicusecase, Slice: multivlanslice Investigating RINA as an Alternative to TCP/IP 112
  • 101. Concurrency between IP and RINA stack (Project: IRATIbasicusecase, Slice: multivlanslice) UDP results — Time interval: 90 s | Nº of datagrams: 554915 | Data sent: 778 MB | BW: 75.5 Mbps Investigating RINA as an Alternative to TCP/IP 113
  • 102. FIRST PHASE RESULTS @ IMINDS Investigating RINA as an Alternative to TCP/IP 114
  • 104. Virtual Wall: Topology Control 116
  • 105. Virtual Wall: Topology Control 117
  • 106. Virtual wall @ iMinds Investigating RINA as an Alternative to TCP/IP 118
  • 107. Emulab: architecture [Diagram: the Emulab control infrastructure — web/DB/SNMP servers, management and power-control lines, control switch/router and serial access — connects users over the Internet to 168 PCs wired to a programmable “patch panel”] p.119
  • 108. Emulab: programmable patch panel p. 120
  • 109. Workflow • Experiment idea → GUI → ns script → hardware mapping and swap-in • Emulab runs the additional scripts from the ns file (additional scripting) Investigating RINA as an Alternative to TCP/IP 121
  • 110. Basic Experiment on iMinds island • Use a LAN for the VLAN bridge Investigating RINA as an Alternative to TCP/IP 122
  • 111. Single flow • Packets are sent over the Ethernet/VLAN bridge • Goodput roughly 60% of the iperf bandwidth Investigating RINA as an Alternative to TCP/IP 123
  • 112. Multiple flows Investigating RINA as an Alternative to TCP/IP 124
  • 113. Concurrency between IP and RINA stack [Screenshot: echo server started alongside a UDP iperf run] Investigating RINA as an Alternative to TCP/IP 125
  • 114. CONCLUSIONS Investigating RINA as an Alternative to TCP/IP 126
  • 115. Conclusions from phase I experimentation • The IRATI stack and shim DIF are running • ~60% goodput in comparison to iperf • No major performance problems • When running concurrently, the IRATI stack takes precedence over the IP stack – our stack doesn't lose a packet from syscalls to the devices layer • ARP in the shim DIF should not reuse the 0x0806 Ethertype because of incompatibility with existing implementations • Registration to the shim DIF over Ethernet should be explicit Investigating RINA as an Alternative to TCP/IP 127
  • 116. Thanks for your attention! Questions? Investigating RINA as an Alternative to TCP/IP

Editor's notes

  1. The shim DIF over Ethernet wraps an Ethernet layer with the RINA API and presents it to the layer above as if it were a regular DIF (usually with restricted capabilities; current technologies very seldom provide a fully-formed layer). The only intended user of an Ethernet shim DIF is a normal IPC Process, as discussed in the shim DIF specification.
  2. Over wide-area networks, where latency characteristics can vary, Interoute makes use of IP/MPLS-based router devices in order to forward customer traffic to common wide-area destinations, while retaining logical separation. Customer networks are separated into groups of sites. All sites that are permitted to communicate with one another (typically all sites within a single customer organisation) are associated with a Virtual Routing/Forwarding (VRF) table. This can be thought of as a portion of a router's routing table, and the IP-layer equivalent of an Ethernet switch's VLAN-aware TCAM forwarding table. The VRF retains routing information, specific to a single customer, and a single service provider PE ("Provider-Edge") router may accommodate a large number of VRFs, each one associated with different customers. By encapsulating customer traffic in MPLS, the service provider is able to take several benefits: i) the removal of the customer's own routing identifiers from influencing forwarding decisions on the backbone and ii) the encapsulation of end-point routing information, after a single routing table lookup. BGP brokers communications between customers' private networks and the shared service provider backbone that connects them. In the face of external (customer) link compromise, BGP propagates the news of the failure to all other routers in the service provider network, which in turn inform other customer routers. This allows customer site routers to make alternate routing decisions to complete their transfer.
  3. A shim DIF over Ethernet maps to a VLAN. The DIF name is the VLAN name. The shim DIF only supports one class of service: unreliable. ARP can be used to map upper layer IPC Process names to shim DIF addresses (MAC addresses). It spans a single Ethernet segment.
  4. The librina package contains all the IRATI stack libraries that have been introduced to abstract from the user all the kernel interactions (such as syscalls and Netlink details). Librina provides its functionalities to user-space RINA programs via scripting language extensions or statically/dynamically linkable libraries (i.e. for C/C++ programs). Librina is more a framework/middleware than a library: it has its own memory model (explicit, no garbage collection), its execution model is event-driven and it uses concurrency mechanics (its own threads) to do part of its work. Rinad, instead, contains the IPC Manager and IPC Process daemons as well as a testing application (RINABand). The IPC Manager is the core of IPC Management in the system, acting both as the manager of IPC Processes and as a broker between applications and IPC Processes (enforcing access rights, mapping flow allocation or application registration requests to the right IPC Processes, etc.). IPC Process Daemons implement the layer management components of an IPC Process (enrollment, flow allocation, PDU Forwarding table generation or distributed resource allocation functions). For more details on the rationale behind this high-level architecture, interested readers may refer to the relevant sections in D2.1 [3]. Rinad also provides a couple of example/utility applications that serve two purposes: i) provide an example of how an application uses librina and ii) allow testing/experimentation with the IRATI stack by measuring some properties of the IPC service as perceived by the application (flow allocation time, goodput in terms of bytes read/written per second, or mean delay).
  5. Model classes: These classes model objects that abstract different concepts related to the services provided by librina, such as application names, flow specifications, RIB objects, neighbours and connections. Model classes contain information on the modelled objects, but do not provide operations to perform actions other than updating or reading the object’s state. Proxy classes: These classes model ‘active entities’ within librina, meaning that they provide operations to perform actions on these entities. These actions result in the invocation of librina internals either to send a Netlink message to another user-space process or the kernel, or to invoke a system call. For instance, librina-application provides an ‘IPCManager’ proxy class that allows an application process to request the allocation or deallocation of flows to the IPC Manager Daemon. Another example can be found in the ‘IPC Process’ class available at librina-ipcmanager: this proxy class allows the IPC Manager daemon to invoke operations on the user-space or kernel components of an IPC Process. Event classes: librina is event-based. Invocations of proxy class operations that cause the emission of a Netlink message return right away, without waiting for the Netlink message response. The response will be later obtained as one of the events received through the EventConsumer class. Event classes are the ones that encapsulate the information of the different events, discriminated by event type. Examples of events include results of flow allocation/deallocation operations or results of application registration/unregistration operations, just to name a few. EventProducer: This class allows librina users to access the events originated from the responses to the operations requested through the Proxy classes. The event producer provides blocking, non-blocking and time-bounded blocking operations to retrieve pending events. The librina core components process two types of inputs: operations invoked via Proxy classes at the API level or Netlink messages received via the Netlink socket bound to librina – created at initialization time. Operations invoked via proxy classes can follow two processing paths that either result in the invocation of a system call or in the generation of a Netlink message. In the former case processing is very simple: invocations of proxy operations are mapped to system call wrappers that make the required system call to the kernel (such as readsdu, writesdu, createipcprocess or allocateportid). The latter case involves more processing, as explained in the following. Concurrency classes: Concurrency classes provide an object-oriented wrapper to the OS threading functionalities. They are internally used by librina, but also exposed to librina users in case they want to use them as a way of avoiding external dependencies or intermixing different threading libraries (as is the case of the IPC Manager and IPC Process daemons). Message classes: These classes provide an object-oriented model of the different Netlink messages that can be sent or received by librina. The basic message class ‘BaseNetlinkMessage’ models all the information required to generate/parse the header of a Netlink message, including the Netlink header (source port-id, destination port-id and sequence number), the Generic Netlink family header (family and operation-code) and the RINA family header (source and destination IPC Process ids). 
The different message classes extend the base class by modelling the information that is sent/received as Netlink message attributes in the different messages. NetlinkManager: This class provides an object-oriented wrapper of the functions available at the libnl/libgnl libraries (these libraries provide functions to generate, parse, send and receive Netlink messages). The wrapping is partial since only the functionality required by librina has been wrapped. In the ‘output path’ the NetlinkManager takes a message class, generates a buffer, adds the NL message header to the buffer, passes the message class and the buffer to the NL formatter classes (which will add NL attributes to the buffer) and finally passes the buffer to libnl to send the message. In the ‘input path’ – upon calling the blocking ‘getMessage’ operation – the IPC Manager blocks until libnl returns a buffer containing a NL message, then it parses the header, requests the NL parser classes to parse the NL attributes and return the appropriate message class, and returns. NetlinkMessage Parsers/Formatters: The goal of these classes is either to generate the attributes of a NL message based on the contents of a message class (formatting role) or to create and initialize a message class based on the attributes of a NL message (parsing role). In order to ensure that all the NL messages are received in a timely fashion, librina-core has an internal thread that is continuously calling the blocking NetlinkManager ‘getMessage’ operation. When the operation returns the thread converts the resulting Message class to an Event class, and puts the Event class to an internal events queue. When a librina user calls the EventConsumer to retrieve an event, the EventConsumer tries to retrieve an element from the events queue by invoking the eventPoll (non-blocking), eventWait (blocking) or eventTimedWait (blocking but time-bounded) operation. All librina components use an internal lightweight logging framework instead of an external one in order to minimize librina dependencies, since the goal is to facilitate deploying it within several OS/Linux systems.
  6. The IPC Manager Daemon is the main component responsible for managing the RINA stack in the system. It manages the IPC Process lifecycle, acts as the local management agent for the system and is the broker between applications and IPC Processes (filtering the IPC resources available to the different applications in the system). As introduced in section 2.2.2, the first phase prototype of the IPC Manager has been developed in Java, leveraging part of the Alba prototype codebase. Moreover, the current IPC Manager Daemon is not a complete implementation, since it does not implement the local management agent yet (therefore the RINA stack cannot be managed through a centralized DIF Management System). The IPC Process Daemon performs the layer management functions of a single IPC Process. It is therefore “half” of the IPC Process application, while the other half – dealing with data-transfer and data-transfer-control related tasks – is located in the kernel. Layer management operations are more complex and do not have such stringent performance requirements as data transfer operations, therefore locating them in user space is a logical choice, as introduced in D2.1.
  7. Figure 20 shows a schema of the detailed IPC Manager Daemon software design. It is a Java OS process that leverages the operations provided by the librina API through the wrappers generated by SWIG and the Java Native Interface (JNI). In particular, librina-ipc-manager provides the following proxy classes to the IPC Manager Daemon:

– IPC Process Factory. Enables the creation, destruction and enumeration of the different types of IPC Processes supported by the system.
– IPC Process. Allows the IPC Manager to request operations from IPC Processes, such as assignment to DIFs, configuration updates, enrolment, registration of applications or allocation/deallocation of flows.
– Application Manager. Provides operations to inform applications about the results of pending requests, such as flow allocations or application registrations.

When the IPC Manager Daemon initializes, it reads a configuration file from a well-known location. This configuration file provides default values for system parameters, describes the configurations of well-known DIFs and controls the behaviour of the IPC Manager bootstrap process. The latter is achieved by specifying (an illustrative sketch of such a configuration file is given at the end of this section):

– The IPC Processes that have to be created at system start-up, including their name and type.
– For each IPC Process to be created, the names of the N-1 DIFs where the IPC Process has to be registered (if any).
– For each IPC Process to be created, the name of the DIF that the IPC Process is a member of (if any). If the IPC Process is assigned to a DIF, it will be initialized with an address and all the other information required to start operating as a member of that DIF (DIF-wide constants, policies, credentials, etc.).

When the bootstrapping phase is over, the IPC Manager main thread starts executing the event loop. The event loop continuously polls librina’s EventProducer (in blocking mode) to get the events resulting from Netlink request messages sent by applications or IPC Processes. When an event arrives, the event loop checks its type and delegates its processing to one of the specialized core classes: the Flow Manager (flow-related events), the Application Registration Manager (application-registration-related events) or the IPC Process Manager (IPC Process lifecycle events). The processing performed by these core classes typically results in the invocation of one of the operations provided by the librina-ipc-manager proxy classes described above.

Local system administrators can interact with the IPC Manager through a Command Line Interface (CLI), accessible via telnet. This console provides a number of commands that allow system administrators to query the status of the RINA stack in the system, as well as to perform actions that modify its configuration (such as creating/destroying IPC Processes, assigning them to DIFs, etc.). The IPC Manager supports the CLI console through a dedicated thread that listens on the console port; only one console session at a time is supported at the moment.

The current IPC Manager has leveraged the following Alba components, adapting them to the environment of the IRATI stack:

– The configuration file format, parsing libraries and model classes (the configuration file uses JSON – the JavaScript Object Notation).
– The Command Line Interface server thread and related parsing classes.
– The bootstrapping process.
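To make the bootstrap configuration concrete, the following is a purely illustrative JSON sketch; the key names and structure are hypothetical (flagged in the "_note" entry) and do not reproduce the actual IRATI configuration schema:

    {
      "_note": "illustrative sketch only - key names are hypothetical",
      "ipcProcessesToCreate": [
        {
          "type": "shim-eth-vlan",
          "name": "eth-shim.IPCProcess",
          "difName": "shim-dif-vlan-110"
        },
        {
          "type": "normal",
          "name": "normal-1.IPCProcess",
          "difName": "normal.DIF",
          "registerAtN1Difs": ["shim-dif-vlan-110"]
        }
      ],
      "difConfigurations": [
        {
          "difName": "normal.DIF",
          "constants": { "maxPduSize": 1400 },
          "policies": { }
        }
      ]
    }

The first entry creates a shim IPC Process over a VLAN; the second creates a normal IPC Process that is a member of normal.DIF and registers with the shim DIF as its N-1 DIF, matching the three bootstrap items listed above.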
  9. Figure 21 depicts the detailed software design of the IPC Process Daemon. The first-phase prototype follows the same approach taken with the IPC Manager Daemon design and implementation: leveraging the Alba stack as much as possible in order to provide a simple but sufficiently complete implementation of the IPC Process Daemon. The IPC Process Daemon is therefore also a Java OS process that builds on the APIs exposed by librina through SWIG and JNI. The librina proxy classes most relevant to the IPC Process Daemon operation are:

– IPC Manager. Allows the IPC Process Daemon to communicate with the IPC Manager Daemon, mainly to inform the latter about the results of requested operations, but also to notify it about incoming flow requests or flows that have been deallocated.
– Kernel IPC Process. Provides operations that enable the IPC Process Daemon to communicate with the data-transfer and data-transfer-control functions of the IPC Process in the kernel. These APIs allow the IPC Process Daemon to modify the kernel IPC Process configuration, to manage the setup and teardown of EFCP connections, and to modify the PDU forwarding table.

IPC Process Daemons are instantiated and destroyed by the IPC Manager Daemon. When the IPC Process Daemon has completed its initialization, the main thread starts executing the event loop. This loop continuously polls the EventProducer for new events (in blocking mode) and processes them as they arrive. Event processing is delegated to the classes implementing the different layer management functions: the Enrollment Task, Resource Allocator, Registration Manager, Flow Allocator and PDU Forwarding Table Generator. The processing performed by these classes typically involves two types of actions:

– Local actions, resulting in communications with the Kernel IPC Process or the IPC Manager, achieved via the librina proxy classes.
– Remote actions, resulting in communications with peer IPC Process Daemons, achieved via the RIB Daemon.

The RIB Daemon is an internal component of the IPC Process that provides an abstract, object-oriented schema of all the IPC Process state information. This schema, known as the Resource Information Base or RIB, allows IPC Processes to modify the state of their peers by performing operations on one or more of the RIB objects. The Common Distributed Application Protocol (CDAP) is the application protocol used to exchange the remote RIB operation requests and responses between peer IPC Processes. This protocol allows six remote operations to be performed on RIB objects: create, delete, read, write, start and stop. The objects targeted by an operation are identified by the following attributes (see the sketch at the end of this section):

– Object class. Uniquely identifies a certain type of object.
– Object name. Uniquely identifies the instance of an object of a certain class. The (object class, object name) tuple uniquely identifies an object within the RIB.
– Object instance. A shorthand for object class + object name, to uniquely identify an object within the RIB.
– Scope. Indicates the number of ‘levels’ of the RIB affected by the operation, starting at the specified object (identified by object class + name, or by instance). This allows a single operation to target multiple objects at once.
– Filter. Provides a predicate that evaluates to ‘true’ or ‘false’ based on the values of the object attributes, further discriminating which objects the operation has to be applied to.

More information about the RIB, RIB Daemon and CDAP can be found in D2.1 [3].
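The following is a minimal Java sketch of how this addressing scheme and the six CDAP operations could be surfaced as an object-manager interface; the type and member names (CDAPTarget, RIBObject, encodedValue) are illustrative placeholders, not the prototype’s actual classes:

    // Illustrative sketch only: identifies the object(s) a CDAP operation targets.
    class CDAPTarget {
        final String objectClass;  // type of object
        final String objectName;   // instance of the class; unique within the RIB
        final long objectInstance; // shorthand for class + name
        final int scope;           // RIB levels below the object affected by the operation
        final String filter;       // predicate over object attributes

        CDAPTarget(String objectClass, String objectName, long objectInstance,
                   int scope, String filter) {
            this.objectClass = objectClass;
            this.objectName = objectName;
            this.objectInstance = objectInstance;
            this.scope = scope;
            this.filter = filter;
        }
    }

    // Every object manager in the RIB exposes the six CDAP operations.
    interface RIBObject {
        void create(CDAPTarget target, byte[] encodedValue);
        void delete(CDAPTarget target);
        byte[] read(CDAPTarget target);
        void write(CDAPTarget target, byte[] encodedValue);
        void start(CDAPTarget target);
        void stop(CDAPTarget target);
    }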
CDAP is implemented as a library that provides a CDAPSessionManager class managing one or more CDAP sessions. The CDAPSession class implements the logic of the CDAP protocol state machine as defined in the CDAP specification [33]. CDAP can be encoded in multiple ways, but the IRATI stack follows the approach adopted by the other current RINA prototypes and uses Google Protocol Buffers (GPB) [31]. This decision makes interoperability possible and also brings the benefits of GPB: efficient encoding, and a proven, mature and scalable technology with good-quality open source parsers/generators available.

In addition to the information about the operation and the identity of the targeted objects, CDAP messages can also transport the actual values of those objects; therefore the object values also need to be encoded in binary format. Again, GPB is the initial encoding format chosen, although others are also possible (ASN.1, XML, JSON, etc.). Object encoding is implemented by the Encoding support library, which provides an encoding-format-neutral interface; this allows several encoding implementations to be plugged in or out, with the one to use specified at configuration time.

The RIB is implemented as a map of object managers indexed by object names – current RINA implementations have adopted the convention of making object names unique within the RIB as a simplifying assumption. Each object manager wraps a piece of state information (for example Flows, Application Registrations, QoS Cubes, the PDU Forwarding Table, etc.) behind the RIBObject interface. This interface abstracts the six operations provided by CDAP: create, delete, read, write, start and stop. When a remote CDAP message reaches the IPC Process Daemon, the message is handed to the RIB Daemon component. The RIB Daemon retrieves the object manager associated with the targeted object name from the RIB map and invokes the requested CDAP operation; the object manager’s job is to translate each CDAP operation into the appropriate actions on the layer management function classes. This dispatch step is sketched below.

The layer management function classes use the RIB Daemon when they have to invoke a remote operation on a peer IPC Process. The RIB Daemon provides operations to send CDAP messages to a neighbour IPC Process identified by its application process name. When such an operation is called, the RIB Daemon internally fetches the port-id of the underlying N-1 flow that allows the IPC Process to communicate with the given neighbour, encodes the CDAP message and requests the kernel to write the encoded CDAP message as an SDU to that N-1 flow.
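A minimal sketch of the incoming-message dispatch just described, reusing the hypothetical CDAPTarget and RIBObject types from the previous sketch (the opcodes M_CREATE through M_STOP follow the CDAP specification; everything else is an illustrative placeholder):

    import java.util.HashMap;
    import java.util.Map;

    // Illustrative sketch only: dispatches a decoded CDAP request to the
    // object manager registered under the target object name.
    class RIBDaemonDispatch {
        // Object names are unique within the RIB, so a single map suffices.
        private final Map<String, RIBObject> rib = new HashMap<>();

        void registerObject(String objectName, RIBObject manager) {
            rib.put(objectName, manager);
        }

        // Invoked once a CDAP request has been received and decoded.
        void cdapRequestReceived(String opCode, CDAPTarget target, byte[] value) {
            RIBObject manager = rib.get(target.objectName);
            if (manager == null) {
                return; // the real implementation would send a CDAP error response
            }
            switch (opCode) {
                case "M_CREATE": manager.create(target, value); break;
                case "M_DELETE": manager.delete(target);        break;
                case "M_READ":   manager.read(target);          break;
                case "M_WRITE":  manager.write(target, value);  break;
                case "M_START":  manager.start(target);         break;
                case "M_STOP":   manager.stop(target);          break;
            }
        }
    }

Because object names are unique within the RIB, a flat map lookup is sufficient here; handling of scope and filter (applying one operation to a whole RIB subtree) would extend this lookup and is omitted from the sketch.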