SlideShare ist ein Scribd-Unternehmen logo
1 von 25
Ferrara, Tuesday, January 27, 2015
Corso di Dottorato in Matematica e Informatica
Università degli studi di Ferrara
2nd Year PhD Activities Report (2009)
Alfredo Pagano
Advisor: Prof. Eleonora Luppi
Coadvisor: Dr. Mario Reale
1
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
2
Present employment
Sysadmin in charge of installing, supporting, and maintaining servers
and core services at GARR
The aim of Consortium GARR is to plan, manage, and operate the
Italian National Research and Education Network,
implementing the most advanced technical solutions and services.
Networking Support Activity (EGEE-III SA2)
Enabling Grids for E-sciencE (EGEE) is Europe's leading grid
computing project, providing a computing support infrastructure for over
10,000 researchers world-wide, from fields as diverse as high energy
physics, earth and life sciences.
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
3
GARR-G: Backbone Physical Infrastructure
GARR-G: PoP-level topology
 About 45 GARR PoP
 90% in Univ. research institutions
premises
 other in telco operators and Internet
eXchange premises
 About 60 backbone links
 Mainly leased lines from 8 telco
operators
 Core links
 10 Gbps (STM-64) and 2.5 Gbps (STM-
16)
 Edge links
 34 Mbps, 155 Mbps and 622 Mbps
 Peering links
 10 Gbps (STM-64 and
10GigabitEthernet), 2.5 Gbps (STM-16)
and 1 Gbps (1 GigabitEthernet)
 Backbone capacity 120 Gbps
 Peering capacity 40 Gbps
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
4
GARR-G: Backbone Logical Infrastructure
 Basically each IP backbone link
corresponds to a leased lines from
different operators
 GARR-G is an IP multivendor
network
 Juniper M320, M20 and M10i
 Cisco 12000, 7500 and 7200
20
RT.FI1
RT.TO1
RT.MI2
GEANT
TELIA
20
5
9
10
11
20
10
20
MIX
GlobalX
RC.GE1
RC.TS1
14
30
50
1
1
1
RT1.BO1
RT1.MI1
RT.MI3
18
1
RC.TN1
19
RT.BA1
10
RC.MTRC.SA RC.CB
111
1 1
1
RC.LE
RC.FG
RC.PZ
Juniper
M320
Juniper
M20
Cisco
7500
Telecom Infracom Interoute Fastweb Wind D.F/ Peering
9
RT.CT1
RC.ME
RC.CS
1
1
20
RC.PA1
20
RT.NA1
19
RC.PG
RC.AQ
RC.AN
111
RT.RM1
RC.CA
RC.SS
1
1
RT.RM2
Level 3
20
1
RC.RM2
1
NaMeX
RC.FRA
20
1
RT.PI1
RC.FI
19
4
RC.VE
RT.PD1
1
25
20
Cisco
7200
1
1
50
RC.AQ1
100 1
2.5GB
2.5GB
2.5GB
1 + 2 GB
10 GB
2.5GB
10GB
2.5GB
155MB
2X
155
2.5GB
155MB
2.5GB
10GB
2.5GB
10 GB 2 X 10 GB
155 MB
155 MB
2.5GB
2.5GB
34MB
2.5GB
2.5GB
10 GB
155 MB
155 MB
155MB
155 M
B
2X155
1 + 2.5 GB
1+2.5GB
10 + 1 GB
2.5GB
2X
155
2.5GB
1GB
2.5GB
155 MB
2X155MB
155MB
155 MB
2 X 155 MB
2.5GB
155MB
155MB
2.5GB
2.5GB
2 X 155
2 X
155
155MB
155MB
RC.CA1
Juniper
M10i
155 MB
1
RC.PV
2X155MB
1
Juniper
M7i
1
155 MB
RC.UR
20
RT.FI1
RT.TO1
RT.MI2
GEANT
TELIA
20
5
9
10
11
20
10
20
MIX
GlobalX
RC.GE1
RC.TS1
14
30
50
1
1
1
RT1.BO1
RT1.MI1
RT.MI3
18
1
RC.TN1
19
RT.BA1
10
RC.MTRC.SA RC.CB
111
1 1
1
RC.LE
RC.FG
RC.PZ
Juniper
M320
Juniper
M20
Cisco
7500
Telecom Infracom Interoute Fastweb Wind D.F/ Peering
9
RT.CT1
RC.ME
RC.CS
1
1
20
RC.PA1
20
RT.NA1
19
RC.PG
RC.AQ
RC.AN
111
RT.RM1
RC.CA
RC.SS
1
1
RT.RM2
Level 3
20
1
RC.RM2
1
NaMeX
RC.FRA
20
1
RT.PI1
RC.FI
19
4
RC.VE
RT.PD1
1
25
20
Cisco
7200
1
1
50
RC.AQ1
100 1
2.5GB
2.5GB
2.5GB
1 + 2 GB
10 GB
2.5GB
10GB
2.5GB
155MB
2X
155
2.5GB
155MB
2.5GB
10GB
2.5GB
10 GB 2 X 10 GB
155 MB
155 MB
2.5GB
2.5GB
34MB
2.5GB
2.5GB
10 GB
155 MB
155 MB
155MB
155 M
B
2X155
1 + 2.5 GB
1+2.5GB
10 + 1 GB
2.5GB
2X
155
2.5GB
1GB
2.5GB
155 MB
2X155MB
155MB
155 MB
2 X 155 MB
2.5GB
155MB
155MB
2.5GB
2.5GB
2 X 155
2 X
155
155MB
155MB
RC.CA1
Juniper
M10i
155 MB
1
RC.PV
2X155MB
1
Juniper
M7i
1
155 MB
RC.UR
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
5
 PhD subject
 Motivation, Mission and Scope
 The idea: pros and cons
 Metrics collected
 First results
 Next steps
5
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
6
PhD subject
 1st
year activity - The research activity was
focused on exploiting and testing network
monitoring tools to gather relevant metrics
related to end-to-end performances in Grid
infrastructures.
 2nd
year activity – Design and create a first
working version of a Grid Network monitoring
tool based on Grid jobs
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
7
Motivation (1/2)
 Debugging networks for efficiency
 an essential step for those wishing to run data intensive
applications
 Optimizing performances of Grid middleware and
applications to make intelligent use of the network
 adapting to changing network conditions
 Supporting the Grid “utility computing” model
 ensuring that the differing network performances required by
particular Grid applications are provided, via measurable SLAs
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
8
Motivation (2/2)
 An help for Site and Grid operations
 Help diagnose performance problems among sites
 This transfer is slow, what’s broken? – the network, the server, the middleware…
 I can’t see site X, has the network gone down or is it just a particular service or
machine?
 My application’s performance varies with time of day – is there a network bottleneck?
 Help diagnose problems within sites
 Most network problems, especially performance issues, are not backbone related, they
are in the “last mile”
 Help with planning and provisioning decisions
 Is an SLA I’ve arranged being adhered to by my providers?
 For Grid services and middleware
 I want to increase the performance of file transfers between sites
 I want to know which compute site is “closest” to my data to submit a job to it
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
9
The idea: pros and cons
Instead of installing a probe at each site,
run a Grid Job!
 Added value:
 No installation needed at the sites
 Monitoring system running on a proven system (the grid) & possibility to
use grid services
 Direct use of grid AuthN and AuthZ
 Limits:
 The job is not running with root privileges on the Worker Node (WN)
 Some low level operations are not permitted
 Heterogeneity of the WN environments (OS, 32/64 bits, etc.)
 Ex: making the job download-and-run an external binary may be tricky
(except if they are written in an OS independent programming language)
 The system has to deal with the grid mechanism overhead (delays…)
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
10
The system in action
Site paris-urec-ipv6
UI
Central monitoring
server program (CMSP)
Site A
WN
Job
Site X
WMS
CE
Site B
WN
Job
CE
Site C
WN
Job
CE
Job submission
Socket connection
Ready!
Probe Request
Request:
RTT test to site A
Request:
RTT test to site A
Request:
BW test to site B
Request:
BW test to site B
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
11
Some remarks
 Chosen design is more efficient than starting a job for each
probe (considering delays)
 TCP connection is initiated by the job
 No open port needed on the WN -> better for sites security
 An authentication mechanism is implemented between the job
and the server
 High scalability (Bend and Fend can be easily decoupled)
 A job cannot run forever (GlueCEPolicyMaxWallClockTime)
 there are two jobs running at each site
 A ‘main’ one
 A ‘redundant’ one which is waiting and will become ‘main’ when the
other one ends
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
12
Round-Trip time, MTU and hop count tests
Site paris-urec-ipv6
UI
Central monitoring
server program (CMSP)
Site B
WN
Job
Site C
CE
Socket connection Probe Request
Request:
RTT test to site C
Request:
RTT test to site C
Probe Result
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
13
Round-Trip time, MTU and hop count tests
 The ‘RTT’ measure is the time a TCP ‘connect()’ function call takes:
 Because a connect() call involves a round-trip of packets:
 SYN ->
 SYN-ACK <-
 ACK ->
 Results similar to the ones of ‘ping’
 The MTU is given by the IP_MTU socket option
 The number of hops is calculated in an iterative way
 All these measures require:
 To connect to an accessible port (1) on a machine of the remote site
 To close the connection (no data is sent)
 Note: This (connect/disconnect) is detected in the application log
 (1): We use the port of the gatekeeper of the CE since it is known to be
accessible (it is used by gLite)
Round trip
Just sending => no network delay
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
14
WN-to-WN BW test (may be obsoleted)
Site paris-urec-ipv6
UI
Central monitoring
server program (CMSP)
Site A
WN
Job
Site C
WN
Job
Probe Request
Request:
BW test to wn-site-C:<p>
Request:
BW test to wn-site-C:<p>
Request:
Open a TCP port <p>
Request:
Open a TCP port <p>
Socket connection
Sending a big
amount of data
Probe Result
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
15
GridFTP BW test
Site paris-urec-ipv6
UI
Central monitoring
server program (CMSP)
Site A
WN
Job
Site C
Probe Request
Request:
GridFTP BW test to site
C
Request:
GridFTP BW test to site
C
Socket connection
SE SE
Replication of
a big grid file
Read the
gridFTP log file
Probe Result
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
16
WN-to-WN BW test (under discussion)
 It requires the remote site to allow incoming TCP
connections to the WN
 Not a best practice security policy
 Not always possible (WNs behind a NAT)
 Workaround are sometimes possible
 WN WN transfers doesn’t reflect real use case
 The WN network connectivity may not be adapted
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
17
Metrics collected & scheduling
 Latency test
 Ping
 Every 5 minutes
 Hop list
 Traceroute
 Every 5 minutes
 MTU size
 Socket (IP_MTU socket option)
 Every 5 minutes
 Achievable Bandwidth
 TCP throughput transfer via GridFTP transfer between 2 Storage Element
 Every 8h
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
18
8 Sites involved
A. Paris Urec CNRS
B. IN2P3 Lyon
C. INFN-CNAF
D. INFN-ROMA1
E. INFN-ROMA-CMS
F. GRISU-ENEA-GRID
G. INFN-BARI
H. INFN-CATANIA
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
19
Traceroute Paris-Catania
 [pagano@ui-ipv6-testbed ~]$ traceroute grid005.ct.infn.it
traceroute to grid005.ct.infn.it (193.206.208.18), 30 hops max, 38 byte packets
1 194.57.137.190 (194.57.137.190) 1.589 ms 1.479 ms 2.696 ms
2 r-interco-urec.reseau.jussieu.fr (134.157.247.38) 0.331 ms 0.273ms 0.348 ms
3 r-jusrap-reel.reseau.jussieu.fr (134.157.254.124) 0.368 ms 0.320ms 0.318 ms
4 interco-6.01-jussieu.rap.prd.fr (195.221.127.181) 0.258 ms 0.330ms 0.255 ms
5 * * *
6 te1-2-paris1-rtr-021.noc.renater.fr (193.51.189.230) 1.224 ms1.127 ms 1.121 ms
MPLS Label=489 CoS=6 TTL=1 S=0
7 te0-0-0-3-paris1-rtr-001.noc.renater.fr (193.51.189.37) 1.182 ms1.298 ms 1.150 ms
12 rt1-mi1-rt-mi2.mi2.garr.net (193.206.134.190) 17.555 ms 17.533ms 17.646 ms
13 rt-mi2-rt-rm2.rm2.garr.net (193.206.134.230) 26.996 ms 27.071 ms 27.183 ms
14 rt-rm2-rt-rm1-l1.rm1.garr.net (193.206.134.117) 27.050 ms 27.160ms 27.062 ms
15 rt-rm1-rt-ct1.ct1.garr.net (193.206.134.6) 44.854 ms 44.882 ms 44.820 ms
16 rt-ct1-ru-infngrid.ct1.garr.net (193.206.137.186) 45.115 ms 45.013 ms 45.009 ms
17 grid005.ct.infn.it (193.206.208.18) 45.014 ms 44.967 ms 44.913 m
8 renater.rt1.par.fr.geant2.net (62.40.124.69) 1.154 ms 1.153 ms 1.128 ms
9 so-7-3-0.rt1.gen.ch.geant2.net (62.40.112.29) 9.950 ms 9.958 ms 10.018 ms
10 so-3-3-0.rt1.mil.it.geant2.net (62.40.112.210) 17.264 ms 17.314ms 17.372 ms
11 garr-gw.rt1.mil.it.geant2.net (62.40.124.130) 17.369 ms 21.514ms 17.368 ms
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
20
Frontend view:
Ldap Authentication, based on Google Web Toolkit (GWT) framework
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
21
Next steps:
1. Triggering system to alert site and network admins
2. Frontend improvements (plotting graphs)
3. Not only scheduled, but also on-demand measurements
…
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
22
Thank you for your attention!
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
23
Backup
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
24
GridFTP BW test
 This test shows good results
 If the GridFTP log file is not accessible (cf. dCache?)
 We just do the transfer via globus-url-copy and measure the
time it takes
 This is slightly less precise
 How many streams should we request in the command line?
globus-url-copy –p <num_streams> […]
Ferrara,
Tuesday, January 27, 2015
Alfredo Pagano
25
Network Performance Factors
 End System Issues
 Network Interface Card and Driver
and their configuration
 TCP and its configuration
 Operating System and its
configuration
 Disk System
 Processor speed
 Bus speed and capability
 Application eg old versions of scp
 Network Infrastructure Issues
 Obsolete network equipment
 Configured bandwidth restrictions
 Topology
 Security restrictions (e.g., firewalls)
 Sub-optimal routing
 Transport Protocols
 Network Capacity and the
influence of Others!
 Many, many TCP connections
 Congestion

Weitere ähnliche Inhalte

Was ist angesagt?

Jorgenson Loki
Jorgenson LokiJorgenson Loki
Jorgenson Loki
Carl Ford
 
Basic ntp configuration
Basic ntp configurationBasic ntp configuration
Basic ntp configuration
Raghu nath
 

Was ist angesagt? (7)

Jorgenson Loki
Jorgenson LokiJorgenson Loki
Jorgenson Loki
 
Sdn command line controller lab
Sdn command line controller labSdn command line controller lab
Sdn command line controller lab
 
Basic ntp configuration
Basic ntp configurationBasic ntp configuration
Basic ntp configuration
 
OSMC 2009 | Monitoring and IPv6 by Benedikt Stockebrandt
OSMC 2009 |  Monitoring and IPv6 by Benedikt StockebrandtOSMC 2009 |  Monitoring and IPv6 by Benedikt Stockebrandt
OSMC 2009 | Monitoring and IPv6 by Benedikt Stockebrandt
 
Using GTP on Linux with libgtpnl
Using GTP on Linux with libgtpnlUsing GTP on Linux with libgtpnl
Using GTP on Linux with libgtpnl
 
IPv4 over IPv6 in the Venue, APRICOT-APAN 2015 Fukuoka
IPv4 over IPv6 in the Venue, APRICOT-APAN 2015 FukuokaIPv4 over IPv6 in the Venue, APRICOT-APAN 2015 Fukuoka
IPv4 over IPv6 in the Venue, APRICOT-APAN 2015 Fukuoka
 
01/24/17 Raytheon IIS CO Open House
01/24/17 Raytheon IIS CO Open House01/24/17 Raytheon IIS CO Open House
01/24/17 Raytheon IIS CO Open House
 

Andere mochten auch (6)

2010 11-22 strategi for tilstedeværelse i sosiale medier v1-0 - slidesharever...
2010 11-22 strategi for tilstedeværelse i sosiale medier v1-0 - slidesharever...2010 11-22 strategi for tilstedeværelse i sosiale medier v1-0 - slidesharever...
2010 11-22 strategi for tilstedeværelse i sosiale medier v1-0 - slidesharever...
 
3.1 the use of fish as ecological indicators en
3.1 the use of fish as ecological indicators en3.1 the use of fish as ecological indicators en
3.1 the use of fish as ecological indicators en
 
Social issues dp project 11 18-10 photos by Josie
Social issues dp project 11 18-10 photos by JosieSocial issues dp project 11 18-10 photos by Josie
Social issues dp project 11 18-10 photos by Josie
 
Ciclismo
CiclismoCiclismo
Ciclismo
 
Hot technologies
Hot technologiesHot technologies
Hot technologies
 
Indicator role and monitoring of microorganisms in life [autosaved]
Indicator role and monitoring of microorganisms in life [autosaved]Indicator role and monitoring of microorganisms in life [autosaved]
Indicator role and monitoring of microorganisms in life [autosaved]
 

Ähnlich wie Alfredo paganophd 3y

Performance Aware SDN, LSPE talk
Performance Aware SDN, LSPE talkPerformance Aware SDN, LSPE talk
Performance Aware SDN, LSPE talk
netvis
 
RAO ABDUL KHALIQ-Probation Presentations-V1.0.pptx
RAO ABDUL KHALIQ-Probation Presentations-V1.0.pptxRAO ABDUL KHALIQ-Probation Presentations-V1.0.pptx
RAO ABDUL KHALIQ-Probation Presentations-V1.0.pptx
MuhammadShahFaisal1
 

Ähnlich wie Alfredo paganophd 3y (20)

draft-georgescu-bmwg-ipv6-tran-tech-benchmarking-00
draft-georgescu-bmwg-ipv6-tran-tech-benchmarking-00draft-georgescu-bmwg-ipv6-tran-tech-benchmarking-00
draft-georgescu-bmwg-ipv6-tran-tech-benchmarking-00
 
Performance Aware SDN, LSPE talk
Performance Aware SDN, LSPE talkPerformance Aware SDN, LSPE talk
Performance Aware SDN, LSPE talk
 
Enabling Active Flow Manipulation In Silicon-based Network Forwarding Engines
Enabling Active Flow Manipulation In Silicon-based Network Forwarding EnginesEnabling Active Flow Manipulation In Silicon-based Network Forwarding Engines
Enabling Active Flow Manipulation In Silicon-based Network Forwarding Engines
 
Final project report
Final project reportFinal project report
Final project report
 
ASCC Network Experience in IPv6
ASCC Network Experience in IPv6ASCC Network Experience in IPv6
ASCC Network Experience in IPv6
 
Next Steps in the SDN/OpenFlow Network Innovation
Next Steps in the SDN/OpenFlow Network InnovationNext Steps in the SDN/OpenFlow Network Innovation
Next Steps in the SDN/OpenFlow Network Innovation
 
C08 – Updated planning and commissioning guidelines for Profinet - Xaver Sch...
C08 – Updated planning and commissioning guidelines for Profinet -  Xaver Sch...C08 – Updated planning and commissioning guidelines for Profinet -  Xaver Sch...
C08 – Updated planning and commissioning guidelines for Profinet - Xaver Sch...
 
CCNA 200-120 Exam Questions
CCNA 200-120 Exam QuestionsCCNA 200-120 Exam Questions
CCNA 200-120 Exam Questions
 
Link Planner Proposal Report, Estudio de Enlace Inalambrico 5.8 GHz
Link Planner Proposal Report, Estudio de Enlace Inalambrico 5.8 GHzLink Planner Proposal Report, Estudio de Enlace Inalambrico 5.8 GHz
Link Planner Proposal Report, Estudio de Enlace Inalambrico 5.8 GHz
 
Kuo wei's thesis
Kuo wei's thesisKuo wei's thesis
Kuo wei's thesis
 
IPv6/IPv4 Transition: The experience sharing of Tunnel Broker deployment
IPv6/IPv4 Transition: The experience sharing of Tunnel Broker deployment IPv6/IPv4 Transition: The experience sharing of Tunnel Broker deployment
IPv6/IPv4 Transition: The experience sharing of Tunnel Broker deployment
 
VYATTAによるマルチパスVPN接続手法
VYATTAによるマルチパスVPN接続手法VYATTAによるマルチパスVPN接続手法
VYATTAによるマルチパスVPN接続手法
 
opnet lab report
opnet lab reportopnet lab report
opnet lab report
 
Transport SDN & OpenDaylight Use Cases in Korea
Transport SDN & OpenDaylight Use Cases in KoreaTransport SDN & OpenDaylight Use Cases in Korea
Transport SDN & OpenDaylight Use Cases in Korea
 
Introduction to PROFINET - Derek Lane of Wago
Introduction to PROFINET -  Derek Lane of WagoIntroduction to PROFINET -  Derek Lane of Wago
Introduction to PROFINET - Derek Lane of Wago
 
RAO ABDUL KHALIQ-Probation Presentations-V1.0.pptx
RAO ABDUL KHALIQ-Probation Presentations-V1.0.pptxRAO ABDUL KHALIQ-Probation Presentations-V1.0.pptx
RAO ABDUL KHALIQ-Probation Presentations-V1.0.pptx
 
Performance Tuning Oracle Weblogic Server 12c
Performance Tuning Oracle Weblogic Server 12cPerformance Tuning Oracle Weblogic Server 12c
Performance Tuning Oracle Weblogic Server 12c
 
Edge Device Multi-unicasting for Video Streaming
Edge Device Multi-unicasting for Video StreamingEdge Device Multi-unicasting for Video Streaming
Edge Device Multi-unicasting for Video Streaming
 
Analytical Modeling of End-to-End Delay in OpenFlow Based Networks
Analytical Modeling of End-to-End Delay in OpenFlow Based NetworksAnalytical Modeling of End-to-End Delay in OpenFlow Based Networks
Analytical Modeling of End-to-End Delay in OpenFlow Based Networks
 
WebRTC: A front-end perspective
WebRTC: A front-end perspectiveWebRTC: A front-end perspective
WebRTC: A front-end perspective
 

Kürzlich hochgeladen

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 

Kürzlich hochgeladen (20)

Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdf
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 

Alfredo paganophd 3y

  • 1. Ferrara, Tuesday, January 27, 2015 Corso di Dottorato in Matematica e Informatica Università degli studi di Ferrara 2nd Year PhD Activities Report (2009) Alfredo Pagano Advisor: Prof. Eleonora Luppi Coadvisor: Dr. Mario Reale 1
  • 2. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 2 Present employment Sysadmin in charge of installing, supporting, and maintaining servers and core services at GARR The aim of Consortium GARR is to plan, manage, and operate the Italian National Research and Education Network, implementing the most advanced technical solutions and services. Networking Support Activity (EGEE-III SA2) Enabling Grids for E-sciencE (EGEE) is Europe's leading grid computing project, providing a computing support infrastructure for over 10,000 researchers world-wide, from fields as diverse as high energy physics, earth and life sciences.
  • 3. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 3 GARR-G: Backbone Physical Infrastructure GARR-G: PoP-level topology  About 45 GARR PoP  90% in Univ. research institutions premises  other in telco operators and Internet eXchange premises  About 60 backbone links  Mainly leased lines from 8 telco operators  Core links  10 Gbps (STM-64) and 2.5 Gbps (STM- 16)  Edge links  34 Mbps, 155 Mbps and 622 Mbps  Peering links  10 Gbps (STM-64 and 10GigabitEthernet), 2.5 Gbps (STM-16) and 1 Gbps (1 GigabitEthernet)  Backbone capacity 120 Gbps  Peering capacity 40 Gbps
  • 4. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 4 GARR-G: Backbone Logical Infrastructure  Basically each IP backbone link corresponds to a leased lines from different operators  GARR-G is an IP multivendor network  Juniper M320, M20 and M10i  Cisco 12000, 7500 and 7200 20 RT.FI1 RT.TO1 RT.MI2 GEANT TELIA 20 5 9 10 11 20 10 20 MIX GlobalX RC.GE1 RC.TS1 14 30 50 1 1 1 RT1.BO1 RT1.MI1 RT.MI3 18 1 RC.TN1 19 RT.BA1 10 RC.MTRC.SA RC.CB 111 1 1 1 RC.LE RC.FG RC.PZ Juniper M320 Juniper M20 Cisco 7500 Telecom Infracom Interoute Fastweb Wind D.F/ Peering 9 RT.CT1 RC.ME RC.CS 1 1 20 RC.PA1 20 RT.NA1 19 RC.PG RC.AQ RC.AN 111 RT.RM1 RC.CA RC.SS 1 1 RT.RM2 Level 3 20 1 RC.RM2 1 NaMeX RC.FRA 20 1 RT.PI1 RC.FI 19 4 RC.VE RT.PD1 1 25 20 Cisco 7200 1 1 50 RC.AQ1 100 1 2.5GB 2.5GB 2.5GB 1 + 2 GB 10 GB 2.5GB 10GB 2.5GB 155MB 2X 155 2.5GB 155MB 2.5GB 10GB 2.5GB 10 GB 2 X 10 GB 155 MB 155 MB 2.5GB 2.5GB 34MB 2.5GB 2.5GB 10 GB 155 MB 155 MB 155MB 155 M B 2X155 1 + 2.5 GB 1+2.5GB 10 + 1 GB 2.5GB 2X 155 2.5GB 1GB 2.5GB 155 MB 2X155MB 155MB 155 MB 2 X 155 MB 2.5GB 155MB 155MB 2.5GB 2.5GB 2 X 155 2 X 155 155MB 155MB RC.CA1 Juniper M10i 155 MB 1 RC.PV 2X155MB 1 Juniper M7i 1 155 MB RC.UR 20 RT.FI1 RT.TO1 RT.MI2 GEANT TELIA 20 5 9 10 11 20 10 20 MIX GlobalX RC.GE1 RC.TS1 14 30 50 1 1 1 RT1.BO1 RT1.MI1 RT.MI3 18 1 RC.TN1 19 RT.BA1 10 RC.MTRC.SA RC.CB 111 1 1 1 RC.LE RC.FG RC.PZ Juniper M320 Juniper M20 Cisco 7500 Telecom Infracom Interoute Fastweb Wind D.F/ Peering 9 RT.CT1 RC.ME RC.CS 1 1 20 RC.PA1 20 RT.NA1 19 RC.PG RC.AQ RC.AN 111 RT.RM1 RC.CA RC.SS 1 1 RT.RM2 Level 3 20 1 RC.RM2 1 NaMeX RC.FRA 20 1 RT.PI1 RC.FI 19 4 RC.VE RT.PD1 1 25 20 Cisco 7200 1 1 50 RC.AQ1 100 1 2.5GB 2.5GB 2.5GB 1 + 2 GB 10 GB 2.5GB 10GB 2.5GB 155MB 2X 155 2.5GB 155MB 2.5GB 10GB 2.5GB 10 GB 2 X 10 GB 155 MB 155 MB 2.5GB 2.5GB 34MB 2.5GB 2.5GB 10 GB 155 MB 155 MB 155MB 155 M B 2X155 1 + 2.5 GB 1+2.5GB 10 + 1 GB 2.5GB 2X 155 2.5GB 1GB 2.5GB 155 MB 2X155MB 155MB 155 MB 2 X 155 MB 2.5GB 155MB 155MB 2.5GB 2.5GB 2 X 155 2 X 155 155MB 155MB RC.CA1 Juniper M10i 155 MB 1 RC.PV 2X155MB 1 Juniper M7i 1 155 MB RC.UR
  • 5. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 5  PhD subject  Motivation, Mission and Scope  The idea: pros and cons  Metrics collected  First results  Next steps 5
  • 6. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 6 PhD subject  1st year activity - The research activity was focused on exploiting and testing network monitoring tools to gather relevant metrics related to end-to-end performances in Grid infrastructures.  2nd year activity – Design and create a first working version of a Grid Network monitoring tool based on Grid jobs
  • 7. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 7 Motivation (1/2)  Debugging networks for efficiency  an essential step for those wishing to run data intensive applications  Optimizing performances of Grid middleware and applications to make intelligent use of the network  adapting to changing network conditions  Supporting the Grid “utility computing” model  ensuring that the differing network performances required by particular Grid applications are provided, via measurable SLAs
  • 8. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 8 Motivation (2/2)  An help for Site and Grid operations  Help diagnose performance problems among sites  This transfer is slow, what’s broken? – the network, the server, the middleware…  I can’t see site X, has the network gone down or is it just a particular service or machine?  My application’s performance varies with time of day – is there a network bottleneck?  Help diagnose problems within sites  Most network problems, especially performance issues, are not backbone related, they are in the “last mile”  Help with planning and provisioning decisions  Is an SLA I’ve arranged being adhered to by my providers?  For Grid services and middleware  I want to increase the performance of file transfers between sites  I want to know which compute site is “closest” to my data to submit a job to it
  • 9. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 9 The idea: pros and cons Instead of installing a probe at each site, run a Grid Job!  Added value:  No installation needed at the sites  Monitoring system running on a proven system (the grid) & possibility to use grid services  Direct use of grid AuthN and AuthZ  Limits:  The job is not running with root privileges on the Worker Node (WN)  Some low level operations are not permitted  Heterogeneity of the WN environments (OS, 32/64 bits, etc.)  Ex: making the job download-and-run an external binary may be tricky (except if they are written in an OS independent programming language)  The system has to deal with the grid mechanism overhead (delays…)
  • 10. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 10 The system in action Site paris-urec-ipv6 UI Central monitoring server program (CMSP) Site A WN Job Site X WMS CE Site B WN Job CE Site C WN Job CE Job submission Socket connection Ready! Probe Request Request: RTT test to site A Request: RTT test to site A Request: BW test to site B Request: BW test to site B
  • 11. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 11 Some remarks  Chosen design is more efficient than starting a job for each probe (considering delays)  TCP connection is initiated by the job  No open port needed on the WN -> better for sites security  An authentication mechanism is implemented between the job and the server  High scalability (Bend and Fend can be easily decoupled)  A job cannot run forever (GlueCEPolicyMaxWallClockTime)  there are two jobs running at each site  A ‘main’ one  A ‘redundant’ one which is waiting and will become ‘main’ when the other one ends
  • 12. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 12 Round-Trip time, MTU and hop count tests Site paris-urec-ipv6 UI Central monitoring server program (CMSP) Site B WN Job Site C CE Socket connection Probe Request Request: RTT test to site C Request: RTT test to site C Probe Result
  • 13. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 13 Round-Trip time, MTU and hop count tests  The ‘RTT’ measure is the time a TCP ‘connect()’ function call takes:  Because a connect() call involves a round-trip of packets:  SYN ->  SYN-ACK <-  ACK ->  Results similar to the ones of ‘ping’  The MTU is given by the IP_MTU socket option  The number of hops is calculated in an iterative way  All these measures require:  To connect to an accessible port (1) on a machine of the remote site  To close the connection (no data is sent)  Note: This (connect/disconnect) is detected in the application log  (1): We use the port of the gatekeeper of the CE since it is known to be accessible (it is used by gLite) Round trip Just sending => no network delay
  • 14. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 14 WN-to-WN BW test (may be obsoleted) Site paris-urec-ipv6 UI Central monitoring server program (CMSP) Site A WN Job Site C WN Job Probe Request Request: BW test to wn-site-C:<p> Request: BW test to wn-site-C:<p> Request: Open a TCP port <p> Request: Open a TCP port <p> Socket connection Sending a big amount of data Probe Result
  • 15. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 15 GridFTP BW test Site paris-urec-ipv6 UI Central monitoring server program (CMSP) Site A WN Job Site C Probe Request Request: GridFTP BW test to site C Request: GridFTP BW test to site C Socket connection SE SE Replication of a big grid file Read the gridFTP log file Probe Result
  • 16. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 16 WN-to-WN BW test (under discussion)  It requires the remote site to allow incoming TCP connections to the WN  Not a best practice security policy  Not always possible (WNs behind a NAT)  Workaround are sometimes possible  WN WN transfers doesn’t reflect real use case  The WN network connectivity may not be adapted
  • 17. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 17 Metrics collected & scheduling  Latency test  Ping  Every 5 minutes  Hop list  Traceroute  Every 5 minutes  MTU size  Socket (IP_MTU socket option)  Every 5 minutes  Achievable Bandwidth  TCP throughput transfer via GridFTP transfer between 2 Storage Element  Every 8h
  • 18. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 18 8 Sites involved A. Paris Urec CNRS B. IN2P3 Lyon C. INFN-CNAF D. INFN-ROMA1 E. INFN-ROMA-CMS F. GRISU-ENEA-GRID G. INFN-BARI H. INFN-CATANIA
  • 19. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 19 Traceroute Paris-Catania  [pagano@ui-ipv6-testbed ~]$ traceroute grid005.ct.infn.it traceroute to grid005.ct.infn.it (193.206.208.18), 30 hops max, 38 byte packets 1 194.57.137.190 (194.57.137.190) 1.589 ms 1.479 ms 2.696 ms 2 r-interco-urec.reseau.jussieu.fr (134.157.247.38) 0.331 ms 0.273ms 0.348 ms 3 r-jusrap-reel.reseau.jussieu.fr (134.157.254.124) 0.368 ms 0.320ms 0.318 ms 4 interco-6.01-jussieu.rap.prd.fr (195.221.127.181) 0.258 ms 0.330ms 0.255 ms 5 * * * 6 te1-2-paris1-rtr-021.noc.renater.fr (193.51.189.230) 1.224 ms1.127 ms 1.121 ms MPLS Label=489 CoS=6 TTL=1 S=0 7 te0-0-0-3-paris1-rtr-001.noc.renater.fr (193.51.189.37) 1.182 ms1.298 ms 1.150 ms 12 rt1-mi1-rt-mi2.mi2.garr.net (193.206.134.190) 17.555 ms 17.533ms 17.646 ms 13 rt-mi2-rt-rm2.rm2.garr.net (193.206.134.230) 26.996 ms 27.071 ms 27.183 ms 14 rt-rm2-rt-rm1-l1.rm1.garr.net (193.206.134.117) 27.050 ms 27.160ms 27.062 ms 15 rt-rm1-rt-ct1.ct1.garr.net (193.206.134.6) 44.854 ms 44.882 ms 44.820 ms 16 rt-ct1-ru-infngrid.ct1.garr.net (193.206.137.186) 45.115 ms 45.013 ms 45.009 ms 17 grid005.ct.infn.it (193.206.208.18) 45.014 ms 44.967 ms 44.913 m 8 renater.rt1.par.fr.geant2.net (62.40.124.69) 1.154 ms 1.153 ms 1.128 ms 9 so-7-3-0.rt1.gen.ch.geant2.net (62.40.112.29) 9.950 ms 9.958 ms 10.018 ms 10 so-3-3-0.rt1.mil.it.geant2.net (62.40.112.210) 17.264 ms 17.314ms 17.372 ms 11 garr-gw.rt1.mil.it.geant2.net (62.40.124.130) 17.369 ms 21.514ms 17.368 ms
  • 20. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 20 Frontend view: Ldap Authentication, based on Google Web Toolkit (GWT) framework
  • 21. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 21 Next steps: 1. Triggering system to alert site and network admins 2. Frontend improvements (plotting graphs) 3. Not only scheduled, but also on-demand measurements …
  • 22. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 22 Thank you for your attention!
  • 23. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 23 Backup
  • 24. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 24 GridFTP BW test  This test shows good results  If the GridFTP log file is not accessible (cf. dCache?)  We just do the transfer via globus-url-copy and measure the time it takes  This is slightly less precise  How many streams should we request in the command line? globus-url-copy –p <num_streams> […]
  • 25. Ferrara, Tuesday, January 27, 2015 Alfredo Pagano 25 Network Performance Factors  End System Issues  Network Interface Card and Driver and their configuration  TCP and its configuration  Operating System and its configuration  Disk System  Processor speed  Bus speed and capability  Application eg old versions of scp  Network Infrastructure Issues  Obsolete network equipment  Configured bandwidth restrictions  Topology  Security restrictions (e.g., firewalls)  Sub-optimal routing  Transport Protocols  Network Capacity and the influence of Others!  Many, many TCP connections  Congestion