Workshop on Network Management and Monitoring Summary

  1. Workshop on Network Management and Monitoring - Summary
     Maria Isabel Gandia, CSUC/RedIRIS
     GN4-3 WP6 T3 / CNaaS
     10th SIG-NOC Meeting, Prague, 14 November 2019
  2. The Workshop on Network Management and Monitoring
     • NORDUnet, Copenhagen, 21-22 October (before the STF meeting)
     • 51 participants (38 in person + 13 remote)
     • 28 NRENs/countries
     • Topics explored:
       • Organising network management for end institutions
       • Tools for end institution management
       • Monitoring end institution networks
       • Automating management functions
     • Four sessions:
       1. End institution management: an introduction (intro + 8 lightning talks)
       2. End institution network management outsourcing (4 talks)
       3. Technical solutions for monitoring of the outsourced networks (3 talks)
       4. Technical solutions for network management (2 talks + conclusions)
  3. End institution management
     • 10 years ago: “absolutely no way we are going to do this”
     • CNaaS initiative from SUNET and UNINETT: NRENs started to plan and offer a service to manage university campus networks (Campus Network management as a Service)
     • CNaaS is a subtask of GN4-3 WP6 T3 (Monitoring and management)
     • But also…
       • SURFnet is a pioneer among NRENs in automated management infrastructure
       • ARNES, CARNET, AMRES and KIFU/Hungarnet are managing parts of the school infrastructures and/or WiFi infrastructures in the end institutions
       • FUNET manages the CPEs at the institutions
     • And we heard other NRENs are investigating whether they should go in that direction…
     • Why did NRENs start to think about and do this?
  4. Tech talent shortage
     • 63% of senior executives indicated that a talent shortage was a key concern for their organisation.
  5. A retiring baby boomer generation, a deficiency in STEM graduates, and millennials’ growing lack of interest in technical careers or a career path
     New (cool) skills needed
  6. So, what could be the solution?
  7. And the cloud is not perfect
  8. So...
     • End institutions are losing tech people
     • NRENs are here, we know them, they have a good reputation, let’s ask them… (same regulation and data privacy rules, none of the issues of cloud services)
     • Pressure from other NREN stakeholders (government)
     • New services are being added during a tech talent shortage
     • So the NRENs are pushed to do more, while suffering from the same problems as end institutions
     • Automation is one part of the solution
  9. Session 1: Lightning Talks (I)
     • UNINETT (Vidar Faltinsen), Norway
       • Dedicated department for services in the campus network.
       • CNaaS services development started this year (2019): improved security and better quality for campus networks. Running one pilot with a university.
       • Digitalisation strategy 2017-2021 from the Ministry of Education → use common services.
       • UNINETT buys the equipment for the customer.
       • Local hands and heads still needed (rack mounting, patching…). UNINETT NOC involved.
       • CNaaS package: management and monitoring, but also DHCP, NAT, RADIUS, VPN.
       • Planning FW, DNS, IDS.
       • Monitoring/automation with NAV (developed by UNINETT).
     • SUNET (Dennis Wallberg), Sweden
       • 2 full-time developers hired for CNaaS.
       • Initial production planned for early 2020, with one customer. Equipment already procured.
       • Only greenfield installations, no brownfield.
       • Helpdesk, hands and feet at the university; SUNET NOC as second level.
       • Building the NMS/automation architecture.
       • Planning Zero Touch Provisioning in the near future and monitoring with NAV.
  10. Session 1: Lightning Talks (II)
      • FUNET (Asko Hakala), Finland
        • Started in 2012 with CPE management: 17 customers, 33 routers.
        • 3-person team from January.
        • FUNET Kampus service started in 2019.
        • 2 big and 7 small deployments.
        • FUNET buys the equipment and leases it to the customers.
        • Installation done by the customer (if not, it has a cost).
        • Everything automated using Ansible, configuration stored in YAML files.
        • Same alert and monitoring tools as for the Funet network.
      • SURFnet (Peter Boers), Netherlands
        • 53 FTE for the network (7 full-time developers), 25% externalised.
        • First campus service was Surfwireless.
        • Strategy stays with SURFnet. Day-to-day management is outsourced to Quanza.
        • Everything automated, connecting blocks through standardised interfaces.
  11. Session 1: Lightning Talks (III)
      • ARNES (Matej Vadnjal), Slovenia
        • Operations → existing NOC team of 5 members. New project planning → external contractors (2 people reviewing the documentation). Software development → dedicated team of 4 members (+1 student).
        • Already managing the last mile circuit (650 routers, 1,300 switches).
        • WiFi project WLAN2020 to provide a centrally managed eduroam/WiFi service in the country for every primary and secondary school. Offering RADIUS as a Service.
        • ARNES runs the procurement for the equipment, which is owned by the institution.
        • Expect to manage ~20,000 APs, 2,000 switches, 450 routers and 955 campus networks by 2020.
        • ARNES network service orchestration stack, automation based on Ansible.
        • Running brownfield networks is challenging.
      • CARNET (Darko Parić/Bojan Schmidt), Croatia
        • E-Schools project started in 2015.
        • 35,000 APs, 80,000 switches, 70,000 devices (laptops, tablets…), LAN, interactive equipment for classrooms…
        • Upgrade in the backbone needed.
        • Everything that can be migrated to the cloud is moved out of the schools.
        • 1st level support at school, 2nd level at CARNET, 3rd level at CARNET/vendor.
        • GDPR 360: system for user data management.
        • Working on automation; looking for solutions for school LAN management.
  12. Session 1: Lightning Talks (IV)
      • KIFU/HUNGARNET (Attila Gyürke), Hungary
        • Responsible for networking in all schools. StudentNet programme.
        • 7,000 monitored CPE devices.
        • The plan is to insource outsourced services like the call centre.
      • AMRES (Bojan Jakovljević), Serbia
        • Three flavours:
          • CPE management since 2013 (equipment bought and owned by AMRES): 250 CPE routers.
          • AMRES managed wireless infrastructure since 2014 (donated equipment, owned by AMRES): 6,000 APs installed, 6 controllers (through SP managed services).
          • LAN infrastructure in schools (2019-2021): the Ministry of Telecommunication runs the tender and buys the equipment. 15,000 APs in 2020, 24,000 APs in 2021 (1,500 institutions).
        • AMRES services are free of charge for the institutions. Best effort.
        • Fewer engineers, while the service has grown from 290 institutions in 2016 to 1,930 now.
        • They see the benefits of automation, but are too busy operating the network.
        • Need to hire and outsource some operational tasks.
  13. Session 2 (I): CNaaS Service Definition/Checklist (MI Gandia, CSUC)
      • What do you need to think about, beyond the technical stuff?
      • A Service Definition template/checklist, including:
        • Terminology
        • Contacts/Roles for the provider and the customer
        • Service Delivery Model (service packages, service elements…)
        • Service Policy (KPI, SLT, responsibilities…)
        • Duration, Changes and Termination
        • Prices and Billing
        • GDPR Privacy Note
        • References
  14. Session 2 (II): Software Architectures
      • SURFnet network management architecture and orchestration (Peter Boers, SURFnet):
        • It’s not just automation or CNaaS. Orchestration is the heart of SURFnet8. No CLI provisioning allowed.
        • Doing orchestration for 2+ years, 100+ products, 2,000+ changes.
        • Running 3,500 background jobs every day to check the network.
        • Defining processes and workflows correctly is the key.
        • The orchestrator is a home-grown application using Python and PostgreSQL.
        • 10 FTE directly involved.
      • Outsourcing automation software architecture in SUNET (Johan Marcusson, SUNET):
        • Goal of the CNaaS NMS: ZTP (Zero Touch Provisioning).
        • Design principles defined.
        • Design decisions made: Nornir/NAPALM instead of Ansible, to make the process easier.
        • Config replace instead of config merge makes the management easier too.
        • All configuration changes are made via git. First a dry run, then live.
        • They have run tests on 1,000 mock devices (no customers in production).
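The difference between config replace and config merge, and the "dry run first, then live" step, can be illustrated with a small sketch. This is not SUNET's code and does not use Nornir or NAPALM; the configuration is modelled as a plain dictionary and every name in it (`merge_config`, `replace_config`, `dry_run_diff`, the sample VLANs) is a hypothetical stand-in.

```python
import difflib


def merge_config(running, candidate):
    """Merge: candidate entries are added or overwritten, everything else stays."""
    merged = dict(running)
    merged.update(candidate)
    return merged


def replace_config(running, candidate):
    """Replace: the candidate becomes the complete configuration."""
    return dict(candidate)


def render(config):
    """Turn the dictionary into sorted text lines so it can be diffed."""
    return [f"{key} {value}" for key, value in sorted(config.items())]


def dry_run_diff(running, result):
    """Show what would change without touching the device."""
    return list(
        difflib.unified_diff(
            render(running), render(result), "running", "candidate", lineterm=""
        )
    )


# Hypothetical device state: "vlan 99" was added by hand and is not in git.
running = {
    "hostname": "sw1",
    "vlan 10": "name staff",
    "vlan 99": "name legacy-manual-change",
}
# Hypothetical intended state, as it would be stored in the repository.
candidate = {
    "hostname": "sw1",
    "vlan 10": "name staff",
    "vlan 20": "name students",
}

merged = merge_config(running, candidate)
replaced = replace_config(running, candidate)
diff = dry_run_diff(running, replaced)

if __name__ == "__main__":
    print("\n".join(diff))
```

With merge, the manual `vlan 99` entry survives and the device drifts away from the repository. With replace, the device always ends up equal to what is stored in git, which is one plausible reading of why replace "makes the management easier".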
  15. Session 2 (III): Software Architectures
      • Outsourcing service management architecture in FUNET (Asko Hakala, FUNET):
        • Everything is done using Ansible and Jinja2. Configuration stored in YAML files.
        • They can configure the routers before sending them. Everything quite standard.
        • Fully automated.
        • Important to test before running and to keep git up to date.
        • Separate customer management server.
        • The initial configuration is done via a 4G OOB connection.
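The template-plus-data approach described for FUNET can be sketched as follows. FUNET uses Jinja2 templates and YAML files; so that the example runs without third-party packages, it uses `string.Template` from the Python standard library and an inline list of dictionaries as stand-ins. The template text, variable names and addresses are all hypothetical.

```python
from string import Template

# Stand-in for a Jinja2 template kept under version control.
ROUTER_TEMPLATE = Template(
    "hostname $hostname\n"
    "interface $uplink\n"
    " description uplink to $customer\n"
    " ip address $address\n"
)

# Stand-in for per-customer YAML files (hypothetical values).
customers = [
    {
        "customer": "example-university",
        "hostname": "cpe-example-1",
        "uplink": "ge-0/0/0",
        "address": "192.0.2.1/30",
    },
    {
        "customer": "example-college",
        "hostname": "cpe-example-2",
        "uplink": "ge-0/0/0",
        "address": "192.0.2.5/30",
    },
]


def render_config(variables):
    """Render one router configuration from its variables.

    substitute() raises KeyError when a variable is missing, so an
    incomplete data file fails before anything reaches a device.
    """
    return ROUTER_TEMPLATE.substitute(variables)


configs = {c["hostname"]: render_config(c) for c in customers}

if __name__ == "__main__":
    for hostname, text in configs.items():
        print(f"--- {hostname} ---")
        print(text)
```

Because the rendered text depends only on the template and the data file, a router can be configured before it is shipped, which matches the pre-staging mentioned in the talk.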
  16. Session 3: Monitoring of the Outsourced Networks (I)
      • CNaaS dashboard with HTTP and DNS measurements with Linux namespaces (Tsotne Gozalishvili, GRENA)
        • Monitoring probe in the fixed network: a box that connects to the network.
        • They visualise results from perfSONAR measurements with the ELK stack.
        • Several dashboards, for example for DNS test results and VLAN status.
      • WiFiMon: Overview & Summary of Y1 Activities (Nikos Kostopoulos, GRNET/NTUA)
        • Monitoring probe in the WiFi network: Raspberry Pi devices.
        • It monitors the performance from the perspective of end users.
        • Correlating accounting data from RADIUS and performance data from users.
        • WiFiMon service planned to be released in 2020.
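Correlating RADIUS accounting data with performance data can be pictured as a join on client address and time. The slides do not describe how WiFiMon does this, so the sketch below is only an assumption: each accounting session records which access point a client IP was attached to and when, and each measurement is attributed to the session that covers its timestamp. All field names and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical RADIUS accounting sessions (times in seconds).
sessions = [
    {"ip": "10.0.0.5", "ap": "ap-library", "start": 100, "stop": 500},
    {"ip": "10.0.0.5", "ap": "ap-cafeteria", "start": 600, "stop": 900},
    {"ip": "10.0.0.9", "ap": "ap-library", "start": 150, "stop": 400},
]

# Hypothetical throughput measurements reported by end-user devices.
measurements = [
    {"ip": "10.0.0.5", "time": 200, "mbps": 40.0},
    {"ip": "10.0.0.9", "time": 300, "mbps": 20.0},
    {"ip": "10.0.0.5", "time": 700, "mbps": 5.0},
    {"ip": "10.0.0.7", "time": 250, "mbps": 99.0},  # no matching session
]


def find_ap(measurement, sessions):
    """Return the AP whose session covers this measurement, or None."""
    for session in sessions:
        same_client = session["ip"] == measurement["ip"]
        in_window = session["start"] <= measurement["time"] <= session["stop"]
        if same_client and in_window:
            return session["ap"]
    return None


def throughput_per_ap(measurements, sessions):
    """Average the measured throughput per access point."""
    per_ap = defaultdict(list)
    for measurement in measurements:
        ap = find_ap(measurement, sessions)
        if ap is not None:  # drop measurements that match no session
            per_ap[ap].append(measurement["mbps"])
    return {ap: mean(values) for ap, values in per_ap.items()}


result = throughput_per_ap(measurements, sessions)

if __name__ == "__main__":
    for ap, mbps in sorted(result.items()):
        print(f"{ap}: {mbps:.1f} Mbit/s")
```

The linear scan in `find_ap` is fine for a sketch; a real deployment with thousands of sessions would more likely index sessions by client address.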
  17. Session 3: Monitoring of the Outsourced Networks (II)
      • Monitoring and alert aggregation (Morten Brekkevold, UNINETT):
        • Network monitoring toolkit for campuses since 2006 → NAV for CNaaS monitoring.
        • NAV is not multi-tenant → one instance per customer.
        • Need for SSO support.
        • They built an aggregator, developed by students.
        • Requirements defined by UNINETT.
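When each customer has its own monitoring instance, an aggregator has to pull the alerts together into one view for the NOC. The following is a minimal sketch of that idea, not the aggregator built at UNINETT and not the NAV API: alerts are plain dictionaries, and the tenant names, fields and severity scale (higher is worse) are hypothetical.

```python
def aggregate_alerts(alerts_by_tenant):
    """Merge per-tenant alert lists into one feed.

    Each alert is tagged with its tenant, then the feed is ordered by
    severity (highest first) and, within a severity, by time (oldest first).
    """
    combined = []
    for tenant, alerts in alerts_by_tenant.items():
        for alert in alerts:
            combined.append({**alert, "tenant": tenant})
    combined.sort(key=lambda alert: (-alert["severity"], alert["time"]))
    return combined


# Hypothetical alerts, as if fetched from two separate instances.
alerts_by_tenant = {
    "university-a": [
        {"time": 10, "severity": 2, "subject": "sw1 down"},
        {"time": 30, "severity": 5, "subject": "core-rtr unreachable"},
    ],
    "college-b": [
        {"time": 20, "severity": 5, "subject": "uplink down"},
    ],
}

feed = aggregate_alerts(alerts_by_tenant)

if __name__ == "__main__":
    for alert in feed:
        print(alert["severity"], alert["time"], alert["tenant"], alert["subject"])
```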
  18. Session 4: Technical Solutions for Network Management (I)
      ● NMaaS as a platform for management service outsourcing (Lukasz Lopatowski, PSNC)
        ● Kubernetes/Docker platform for providing per-tenant management apps.
        ● Suitable for small NRENs and other teams in the GÉANT project.
        ● Options: supported by GÉANT, or the NREN deploys its own instance.
        ● Portfolio: Oxidized, LibreNMS, NAV, Prometheus, Grafana. perfSONAR soon.
  19. Session 4: Technical Solutions for Network Management (II)
      ● RENATER's White Box CPE in Normandy Regional network (Xavier Jeanin, RENATER)
        ● RARE: Router for Academia, Research and Education
          ● Features developed: IPv4, IPv6, MPLS, SR-MPLS, L3VPN, XConnect, VPLS, EVPN, 6VPE
          ● No SNMP, but streaming telemetry
        ● White boxes:
          ● Switch/router manufactured from commodity components that allows different Network Operating Systems (NOS) to be run on the same piece of hardware (Dell VEP 4600 servers, FRRouting)
          ● Initially designed for data centre use.
        ● Use case in the Normandy Regional network, for school CPE routers.
          ● Features: BGP peering, IGP, VLAN, logical interface, VRF lite, management (SSH, Syslog, SNMPv2) and security (line-rate IPv4/IPv6 L3 ACLs, broadcast storm protection)
          ● Ansible-based automation
  20. Some Conclusions (I)
      ● NRENs are pushed to offer CNaaS services without increasing the number of employees:
        ● The use of automation is key to allow these services to grow.
        ● Some NRENs are also outsourcing some functions to offer CNaaS services.
      ● Services can differ from NREN to NREN; there’s no single approach: CNaaS, e-Schools, WLAN2020, management of CPEs…
      ● User groups define the functionalities of a service, so a service can differ per user group inside the same NREN.
  21. Some Conclusions (II)
      ● What can the GÉANT project do?
        ● Sharing information is important: organise more meetings to share stories and how-to guides.
        ● Having a set of recommendations to create Service Definition documents is useful.
        ● Contributions from multiple people (including students' work) are managed through fully integrated CI/CD (Continuous Integration/Continuous Delivery), code audits, and well-defined, regularly executed tests.
        ● Kubernetes/Docker-based multi-tenant app provisioning seems to be the way forward (NMaaS).
        ● A very lightweight perfSONAR (on a Raspberry Pi) for monitoring boxes could be useful, perhaps integrated with WiFiMon on the same device.
  22. Thank you. Any questions?
      © GÉANT Association on behalf of the GN4 Phase 3 project (GN4-3). The research leading to these results has received funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No. 856726 (GN4-3).