This event took place on 27 October 2021.
In this Tech 2 Tech session, we considered questions such as:
- Which types of applications need low latency, and what are their specific requirements for both latency and jitter?
- What levels of latency might you expect across Janet?
- What can you do to optimise latency for your networked applications?
- How can we measure latency and jitter?
3. Overview
Today's session
• Network performance is typically focused on achieving good throughput for large scale data transfers
  • See our May T2T event - https://www.jisc.ac.uk/events/tech-2-tech-network-performance-24-may-2021
• We've noticed we're getting an increasing number of queries about low latency networking
• In this session we aim to identify use cases that matter to our members, how these might be delivered, and what measurement tools we'd like to have available
4. Use cases for low latency networking
Application areas?
• Distributed performing arts
• Haptic / remote control applications via IP
• Distributed storage / databases
• Conferencing tools, voice over IP (VoIP)
• Virtual reality (VR) headsets
• Transnational education (TNE)
• Gaming
• Q: Do we know specific latency requirements?
5. Distributed performing arts
Multi-site performances over the Internet
• Orchestras
  • Musicians at different locations
  • Remote conductor
• Theatre, plays
  • Actors in multiple locations
• Example application - LoLa
  • See https://lola.conts.it/ - a GARR project
  • Example https://www.youtube.com/watch?v=LK2WNyfLGlc
  • One-way delay (OWD) needs to be below the "threshold of perception for temporal segregation" - 30 ms
6. Haptic / remote control applications
Using the network for remote control
• Controlling devices from afar
  • Might be gloves locally, robot arm remotely
• Haptics implies (force) feedback
• Various application areas, including
  • Medical
  • Teaching / learning
  • Joystick control of remote device
  • Shaving?
    • EE TV ad: https://www.youtube.com/watch?v=gWiV3DF5JkU
7. Distributed storage / databases
Latency requirements may be strict
• Use case might be a resilient database configuration or some form of distributed file system or cluster
• May become an issue when multi-site
  • e.g., local campus and remote data centre
• Seeing more questions from members in this area
• A recent example:
  • Dell EMC VxRail 7.0 vSAN stretched cluster
  • Requires RTT between sites hosting VM objects < 5 ms
  • Is that achievable reliably between X and Y?
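One way to approach the "is that achievable reliably?" question is to collect RTT samples over time and compare them with the requirement. The sketch below is illustrative only: the function name `rtt_compliance`, the nearest-rank 99th percentile, and the sample values are assumptions, not part of any vendor tool, and the 5 ms default simply mirrors the figure quoted above.

```python
def rtt_compliance(samples_ms, threshold_ms=5.0):
    """Summarise RTT samples (in ms) against a required upper bound.

    Hypothetical helper: returns the minimum, a nearest-rank 99th
    percentile, the maximum, and the fraction of samples strictly
    below threshold_ms.
    """
    if not samples_ms:
        raise ValueError("need at least one RTT sample")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Nearest-rank percentile: rank = ceil(0.99 * n), in integer arithmetic.
    rank = max(1, -(-99 * n // 100))
    below = sum(1 for s in ordered if s < threshold_ms)
    return {
        "min": ordered[0],
        "p99": ordered[rank - 1],
        "max": ordered[-1],
        "fraction_within": below / n,
    }


# Made-up samples: mostly 2 ms, with one 7.5 ms outlier.
example = rtt_compliance([2.0] * 99 + [7.5])
```

A low minimum RTT alone does not show the requirement is met reliably; the tail (p99, maximum) and the fraction of samples within the bound are what matter over days or weeks of measurement.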
8. Other use cases
Include…
• Conference tools - Zoom, Teams, … 3rd party servers
• VoIP - probably widely deployed on campuses by now
• TNE - improving the experience for remote learners
• Virtual reality (VR) - between device and compute
• Gaming - campuses have students!
• These have a wide range of requirements
9. Any use cases we missed of interest to you?
Or any questions?
10. Latency expectations
What can you expect?
• Latency is largely determined by distance and the speed of light in fibre
  • But the fibre path won't be as the crow flies
• Latency is the sum of all elements on the path, including end systems and devices, access network, network elements, and the distance involved
• Ballpark OWD between site border routers?
  • Between nearby sites on Janet: ~1 ms
  • Between distant sites on Janet: 6-8 ms
  • Between Janet and the US east coast: ~35 ms
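The distance component can be estimated with simple arithmetic. The sketch below assumes light travels in fibre at roughly two thirds of its speed in a vacuum (about 5 µs per km); the velocity factor, the `route_factor` allowance for fibre not following the great-circle path, and the `fixed_overhead_ms` term for equipment and end systems are all illustrative assumptions rather than Janet measurements.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458
FIBRE_VELOCITY_FACTOR = 0.67  # assumption: refractive index of about 1.5


def propagation_owd_ms(fibre_km):
    """One-way propagation delay in ms over fibre_km of fibre."""
    speed_km_per_ms = SPEED_OF_LIGHT_KM_PER_S * FIBRE_VELOCITY_FACTOR / 1000.0
    return fibre_km / speed_km_per_ms


def estimated_owd_ms(great_circle_km, route_factor=1.5, fixed_overhead_ms=0.0):
    """Rough OWD estimate from straight-line distance.

    route_factor (assumed) scales the straight-line distance up to a
    plausible fibre route length; fixed_overhead_ms (assumed) stands in
    for end systems, access network and network elements on the path.
    """
    fibre_km = great_circle_km * route_factor
    return propagation_owd_ms(fibre_km) + fixed_overhead_ms


# Roughly 5 ms of one-way propagation delay per 1000 km of fibre.
per_1000_km = propagation_owd_ms(1000)
```

Comparing such an estimate with a measured OWD gives a feel for how much of the latency is unavoidable distance and how much comes from the other elements on the path.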
11. Latency on / across Janet
What should I expect?
• The Janet network is being refreshed
  • Backbone network with core PoPs and IX presence
  • Much of the focus is on capacity, use of 400G, n x 100G
  • Regional networks are being updated through the access programme - join one of our T2T access programme update sessions to learn more
• There is no latency SLA on Janet
  • Though there is also no throughput SLA - but we give advice and guidance
• Janet Netpath+ circuits provisioned directly on the transmission layer should have fixed latency
12. Access network technology
How does this affect latency?
• Janet member sites are generally connected to their access router via local Ethernet networks
  • Minimal latency
• Other access network technologies will have higher latency
  • Residential broadband
  • 4G/5G mobile networks
  • Satellite / LEO (e.g., Starlink)
• Users used to typical home network latency can be pleasantly surprised by what is possible across Janet
14. Minimising network latency
Approaches
• Optimising equipment, end to end
  • e.g. dedicated LoLa hardware - PC, camera, codec, displays
• Using Science DMZ principles
  • The friction-free networking principle
• Ensuring optimal routing
  • Not all paths are optimal for latency
15. Example: LoLa
Every millisecond counts
• Good discussion in the LoLa 2.0 manual - see https://lola.conts.it/
• Hardware
  • Very specific requirements on the PC hardware
  • Especially video input/output, audio input/output, capture & display
• Network
  • 1 Gbps+ Ethernet (compression saves bandwidth, adds latency)
  • Switch hardware must handle 1K packets at high pps rate
  • Avoid using campus firewall, avoid NAT
16. Using Science DMZ principles
General principles
• Treat science/research and business traffic differently
  • But here it's latency-sensitive applications that need to be treated differently
• Elements:
  • Friction-free network path
  • Optimise your local network architecture
  • Efficient application of security policy (avoid main campus firewall)
  • But instead of well-tuned data transfer nodes (DTNs), low latency applications need optimised hardware, as per the LoLa example
  • Persistent performance monitoring is still important, e.g., perfSONAR
  • With strong user engagement - know who your low latency users are
17. Example classic Science DMZ architecture
[Figure: classic Science DMZ architecture. Labelled elements: WAN; border router; Science DMZ switch/router; enterprise border router/firewall; site / campus LAN; high performance Data Transfer Node with high-speed storage; per-service security policy control points; clean, high-bandwidth WAN path; site / campus access to Science DMZ resources; three perfSONAR nodes; 10GE links.]
Source: https://fasterdata.es.net
18. Optimising routing
Taking the fastest path, not necessarily the fattest
• Routing metrics may tend to favour higher capacity paths
• Latency depends on the path between endpoints, and thus between the networks that serve them
  • Interconnects likely to be at major IXs
  • R&E networks have their own interconnects, e.g., for us via GÉANT
  • Many large content / cloud providers have their own global networks
• CDNs may provide a "nearer" instance of a service
• Services may be pushed to the edge - a feature of 5G
  • This aims to minimise latency from source to compute
20. Measuring latency (and jitter)
A wide range of options
• Jisc tools available to members
  • Netsight3
• User tools, for example:
  • Command line tools
  • RIPE Atlas - community measurements
  • Looking glasses - views to you from remote networks
  • perfSONAR - persistent measurements over time
• In-application tools
  • LoLa has a standalone test tool
  • Some applications using RTP may report via RTCP (see RFC 6843)
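For RTP-based applications, jitter is commonly reported as the interarrival jitter estimator defined in RFC 3550: a running average of how much the spacing between received packets differs from the spacing at which they were sent. The sketch below applies that formula to lists of send and receive times; the function name and the use of milliseconds (rather than RTP timestamp units) are assumptions made for illustration.

```python
def interarrival_jitter(send_times, recv_times):
    """RFC 3550 style interarrival jitter over a packet sequence.

    send_times and recv_times are per-packet timestamps in the same
    unit (assumed ms here). The clocks need not be synchronised,
    because only differences in transit time are used.
    """
    if len(send_times) != len(recv_times):
        raise ValueError("need one receive time per send time")
    jitter = 0.0
    for i in range(1, len(send_times)):
        # Change in transit time between consecutive packets.
        d = (recv_times[i] - recv_times[i - 1]) - (send_times[i] - send_times[i - 1])
        # Exponential smoothing with gain 1/16, as in RFC 3550.
        jitter += (abs(d) - jitter) / 16.0
    return jitter


# Packets sent every 20 ms; the third one arrives 2 ms late.
example = interarrival_jitter([0.0, 20.0, 40.0], [5.0, 25.0, 47.0])
```

Because the estimator uses only differences between consecutive packets, a constant clock offset between sender and receiver cancels out, which is why jitter is easier to measure accurately than one-way delay.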
22. Command line tools
Simpler tools
• ping
• traceroute
• mtr
• …
• Quick way to get a feel, but typically limited: they give only a small snapshot, and use protocols that the network might treat differently from your application traffic
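Where ICMP may be handled differently from application traffic, one alternative is to time a TCP handshake to the port the application actually uses. The sketch below is a rough illustration using only the Python standard library; the function names are hypothetical, and the connect time includes local socket setup and any name resolution, so it is an upper bound on the RTT rather than a precise measurement.

```python
import socket
import statistics
import time


def tcp_connect_rtt_ms(host, port, timeout=2.0):
    """Time a TCP connection setup to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0


def summarise_rtts(samples_ms):
    """Return min, mean and max of a list of RTT samples (ms)."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    return {
        "min": min(samples_ms),
        "mean": statistics.mean(samples_ms),
        "max": max(samples_ms),
    }


# Example use (the host and port are placeholders, not a real service):
#   samples = [tcp_connect_rtt_ms("server.example.org", 443) for _ in range(10)]
#   print(summarise_rtts(samples))
```

As with ping, a handful of samples is still only a snapshot; repeated runs over time are needed to see variation.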
23. RIPE Atlas anchor
Worldwide network measurement system
• See https://atlas.ripe.net/
• Supports measurements from RIPE Atlas nodes
  • Hardware (available from RIPE) or software probes
• The RIPE Atlas ecosystem is mature
  • Over 11,000 probes around the world
• Jisc has an anchor node deployed at Slough
  • See https://atlas.ripe.net/probes/6695/
• Useful for loss and latency, but can also do more bespoke tests
24. RIPE Atlas latency world map
A recently published tool using RIPE Atlas data
• Shows minimum latency seen into a given Autonomous System Number (network) for a given day
  • Janet is ASN786
• Useful for expectations
• Note it shows RTT values
26. Janet looking glass
Provides views to your site
• Under redevelopment, but accessible
  • See https://alice.ja.net/
• Various functions provided:
  • ping (RTT)
  • traceroute
  • BGP route/community/path
• Can be run from a range of Janet devices
• Feedback welcomed
27. Persistent measurement over time: perfSONAR
• Free, open source - https://www.perfsonar.net
• Easy to download and install on CentOS 7 (and Debian)
• Very useful to have persistent testing: collect history of network characteristics - throughput, loss, latency, path
• Test against our perfSONAR node in the Jisc Slough data centre
  • Throughput (up to 10G) - use ps-slough-10g.ja.net
  • Latency - use ps-slough-1g.ja.net
• We are testing 1 Gbps small nodes (including RPi) and Docker versions
  • Happy to work with sites to test these
28. perfSONAR example - UK GridPP mesh
https://psmad.opensciencegrid.org/maddash-webui/index.cgi?dashboard=UK%20Mesh%20Config
Durham - Oxford, last 12 months
29. TimeMap
Per-segment latency and jitter measurements
• Developed in the GÉANT GN4-3 project
• Uses TWAMP / RPM measurements
• Running on GÉANT backbone (Juniper)
• Moving towards production
  • https://timemap.geant.org/
• Segment by segment
  • Not an end to end view
30. Improved OWD measurement accuracy?
Achieving more accurate OWD measurements
• When running OWD measurements, accurate time can be important
• Is NTP enough?
  • Typically see 1-2 ms variance - see the perfSONAR example
  • May be partly time synchronisation, partly measurement handling
• Is there interest in a more accurate time service?
  • There is the Precision Time Protocol (PTP) - IEEE 1588
  • One advantage is that PTP is hardware-based
  • See the perfSONAR team's statement - cost is an issue
  • Might be something to discuss with NPL
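To see why clock accuracy matters for OWD, it helps to write out the arithmetic. The sketch below shows the standard NTP offset and round-trip delay calculation from four timestamps, and how any residual clock error at either end adds directly to a measured OWD. The function names and example values are illustrative assumptions.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP estimate from one request/response exchange.

    t1: client send time (client clock)
    t2: server receive time (server clock)
    t3: server send time (server clock)
    t4: client receive time (client clock)
    Returns (offset of server clock relative to client, round-trip delay).
    The offset estimate assumes the path is symmetric; path asymmetry
    biases it by half the difference between the two directions.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def measured_owd_ms(true_owd_ms, sender_clock_error_ms, receiver_clock_error_ms):
    """OWD as observed when each end's clock has a residual error.

    The receiver stamps arrival with its clock and the sender stamps
    departure with its own, so the difference of the two errors is
    added to the true one-way delay.
    """
    return true_owd_ms + receiver_clock_error_ms - sender_clock_error_ms


# Server clock 10 ms ahead, 5 ms each way, 1 ms server processing.
offset, delay = ntp_offset_and_delay(0.0, 15.0, 16.0, 11.0)
```

With clock errors of around a millisecond at each end, a true OWD of 1 ms between nearby sites can be measured as anything from about zero to a few milliseconds, which is the motivation for considering more accurate time sources.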
32. Open questions / discussion
Some closing questions…
• Have we covered low latency networking use cases of interest?
• What would you like from Jisc to help you with these?
• Do you have the information needed and capability to optimise latency within your site where needed?
• Do you have the tools to measure latency and jitter?
• Anything else we missed?