On November 17th, 2016, Forward Networks conducted its first public unveiling of its Network Assurance platform at Networking Field Day 13. Visit https://www.forwardnetworks.com/ for more details.
2. AGENDA
+ An Introduction to Forward Networks
+ Platform Demo
+ Use Case: Outage Diagnosis & Resolution
+ Use Case: Network Auditing
+ Closed Session
3. Today’s Networks – Large, Complex, & Heterogeneous
Thousands of devices
+ Switches
+ Routers
+ Load balancers
+ Firewalls
Millions of rules
+ IPv4 routes
+ ACLs
+ MAC tables
+ Spanning tree
+ NAT
+ VLAN
+ Multicast
+ PBR
Dozens of vendors
+ Cisco
+ Arista
+ HPE
+ Fortinet
+ Juniper
+ F5
+ Palo Alto
+ Checkpoint
4. Network Operations – Manual & Error Prone
Manual Operations
+ Device-by-device management
+ Limited end-to-end visibility
+ Hard to debug & test
Inadequate Tooling
+ Lack of innovation in tooling
+ Solutions are 20+ years old
+ Ping, traceroute, SNMP, etc.
High Rate of Error
+ Networks rife with misconfiguration
+ 80% of outages caused by error¹
+ 50% due to change config issues²
6. NETWORK ASSURANCE
Reducing the complexity of networks while eliminating the human
error, misconfiguration, and policy violations that lead to outages.
7. A NEW APPROACH TO NETWORK OPERATIONS
+ Unorganized real world data
+ Own data model of real world
+ Apps on top using data model
+ Revolutionary algorithm
SEARCH VERIFY API PREDICT
8. A NEW APPROACH TO NETWORK OPERATIONS
THE FORWARD PLATFORM
+ Unorganized real world data
+ Own data model of real world
+ Apps on top using data model
+ Revolutionary algorithm
SEARCH VERIFY API PREDICT
10. THE FORWARD PLATFORM: CAPABILITIES OVERVIEW
SEARCH: What is my network’s behavior?
Index your network and search your devices and behavior on top of an interactive topology
VERIFY: Is it doing what it should?
Validate network correctness and audit your network for compliance & security
PREDICT: Will this change work?
Simulate configuration changes to ensure they are correct and secure before rolling into production
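The PREDICT idea, simulating a change before it reaches production, can be sketched in a few lines. This is only an illustration under stated assumptions, not the Forward Platform's implementation: the dictionary-based model, the `simulate_change` helper, the interface names, and the two checks are all hypothetical.

```python
import copy

def simulate_change(model, change, checks):
    """Apply `change` to a copy of `model` and run every check on the copy."""
    candidate = copy.deepcopy(model)  # the production model is never touched
    change(candidate)
    return {name: check(candidate) for name, check in checks.items()}

# Hypothetical model of one device: destination prefix -> egress interface.
production = {"routes": {"18.10.11.0/24": "eth1", "0.0.0.0/0": "eth0"}}

def remove_server_route(model):
    """Hypothetical candidate change: withdraw the route to the server subnet."""
    del model["routes"]["18.10.11.0/24"]

# Hypothetical checks, each a predicate over the model.
checks = {
    "server_prefix_routed": lambda m: "18.10.11.0/24" in m["routes"],
    "default_route_present": lambda m: "0.0.0.0/0" in m["routes"],
}

results = simulate_change(production, remove_server_route, checks)
print(results)
```

Because the change is applied to a deep copy, a failing check costs nothing: the production model still contains the server route afterwards.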
13. What we don’t do: Observed Traffic
- Interface Counters
- Flow Counters (NetFlow)
- Sampled Counters (sFlow)
- Probes (Ping, Traceroute)
What we do: All Potential Traffic
+ Packet In -> Packet Out (and all details) (for any packet, seen or not)
18. REQUIREMENTS
1. Traffic should flow from CLIENT to SERVER
2. Traffic should take multiple paths from CLIENT to SERVER
3. Traffic should flow on all interfaces in a port channel
[Topology diagram: CLIENT, SJCCE, SEA, LAX, MIA, LGA, IAD, SERVER (18.10.11.2)]
24. TRADITIONAL APPROACH
[Topology diagram: CLIENT, SJCCE, SEA, LAX, MIA, LGA, IAD, SERVER (18.10.11.2)]
1. Traffic should flow from CLIENT to SERVER
2. Traffic should take multiple paths from CLIENT to SERVER
3. Traffic should flow on all interfaces in a port channel
26. FORWARD VERIFY™
[Topology diagram: CLIENT, SJCCE, SEA, LAX, MIA, LGA, IAD, SERVER (18.10.11.2)]
1. Traffic should flow from CLIENT to SERVER
2. Traffic should take multiple paths from CLIENT to SERVER
3. Traffic should flow on all interfaces in a port channel
35. VERIFICATION COMPARISON
Traditional Approach vs. FORWARD VERIFY™
[Two topology diagrams, one per approach, each showing CLIENT, SJCCE, SEA, LAX, MIA, LGA, IAD, SERVER (18.10.11.2); callout: "Latent misconfiguration"]
38. FORWARD VERIFY™
PREVENTS OUTAGES
Instantly see failing checks during service window
Fix network issues as soon as they appear
SIMPLIFIES DIAGNOSIS
Using historical snapshots, we could reconstruct
where traffic was going, what had changed, and why
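As a rough illustration of the snapshot idea, the sketch below compares two snapshots and reports which flows changed path. It assumes a snapshot is simply a mapping from a (source, destination) pair to the path traffic took; that representation, the `diff_snapshots` helper, and the example paths are hypothetical and not taken from the product.

```python
def diff_snapshots(before, after):
    """Return {flow: (old_path, new_path)} for every flow whose path differs."""
    changes = {}
    for flow in sorted(before.keys() | after.keys()):
        old, new = before.get(flow), after.get(flow)
        if old != new:
            changes[flow] = (old, new)
    return changes

# Two hypothetical snapshots taken before and after a change.
monday = {
    ("CLIENT", "SERVER"): ["CLIENT", "SEA", "IAD", "SERVER"],
    ("CLIENT", "LAX"): ["CLIENT", "SEA", "LAX"],
}
tuesday = {
    ("CLIENT", "SERVER"): ["CLIENT", "SEA", "LGA", "IAD", "SERVER"],
    ("CLIENT", "LAX"): ["CLIENT", "SEA", "LAX"],
}

changed = diff_snapshots(monday, tuesday)
for flow, (old, new) in changed.items():
    print(flow, "was", old, "now", new)
```

Only the CLIENT-to-SERVER flow is reported, because its path is the only one that differs between the two snapshots.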
40. FORWARD’S MISSION
We want to help you build networks that work and
that you can trust because you’ve verified them
FORWARD VERIFY™
PREDEFINED CHECKS
The main reason nobody has gone down this path is that it’s incredibly difficult.
The first problem is building models, and this is an enormous grind, because there’s a long legacy tail of devices and versions. Then you have to support all the new versions that come out, quickly. We’ve addressed this by investing heavily in automating our testing pipeline and growing the team by outsourcing.
The second problem is figuring out how to scale network analysis so that it’s fast for any kind of network, on modest hardware. This is yet another grind, and we have put about 12 person-years of PhD-level work into it.
And once you grind through both of those problems, you’ve still got to figure out an interface that makes complex data understandable. You can only solve this by iterating with the users, and we’re on our fourth iteration.
So how is Forward Networks addressing these challenges? To begin with, we learned from Google, the leader in a different space that faced similar challenges. So what did they do?
-First, they built a crawler to collect all the web data that they possibly could.
-Second, they parsed all of this data and created their own internal copy of the web.
-Third, they applied their revolutionary algorithm, PageRank, which enabled amazing user experiences with a variety of applications on top, such as
-Search, Maps, Contacts, etc.
Summary: Google revolutionized the user experience of search by gathering *all* the data, applying smart algorithms to it, then putting a slick UI in front. We have taken a similar path to revolutionize network operations.
-First, we collect both configuration and dynamic runtime state from all devices in the network (switches, routers, load balancers, firewalls, etc.).
-We bring that data together centrally, and for every device we use our revolutionary algorithms (originally called Header Space Analysis) to precisely model how it will behave for any packet it receives, given the current configuration and state. So effectively we have an entire copy of your network, in software.
-On top of that copy of your network, we use our algorithms to trace where every possible packet could go, then we put the results in a database, which is the core of our data. This is unique to Forward Networks; nobody else has this level of data about the network.
-Collectively, we call this the Forward Platform.
-On top, we add applications that use this data to present experiences that solve the problems mentioned earlier.
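The collect, model, and trace steps can be pictured with a toy example. This is a minimal sketch under heavy assumptions, not Header Space Analysis itself: real analysis reasons about whole sets of packet headers, while this toy traces individual destination addresses through longest-prefix-match tables. The `Rule` type, the helper names, and the three-device forwarding tables are hypothetical; only the server address 18.10.11.2 comes from the slides.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

@dataclass(frozen=True)
class Rule:
    prefix: str    # destination prefix this rule matches
    next_hop: str  # neighbor device name, or "DELIVER" / "DROP"

def lookup(rules, dst):
    """Longest-prefix match; returns the next hop, or "DROP" if nothing matches."""
    addr = ip_address(dst)
    matches = [r for r in rules if addr in ip_network(r.prefix)]
    if not matches:
        return "DROP"
    return max(matches, key=lambda r: ip_network(r.prefix).prefixlen).next_hop

def trace(model, ingress, dst, max_hops=16):
    """Follow a packet for `dst` from `ingress`; returns (path, verdict)."""
    path, device = [ingress], ingress
    for _ in range(max_hops):
        hop = lookup(model.get(device, []), dst)
        if hop in ("DELIVER", "DROP"):
            return path, hop
        path.append(hop)
        device = hop
    return path, "LOOP"  # hop budget exhausted: treat as a forwarding loop

def build_behavior_index(model, destinations):
    """Precompute the trace for every (ingress device, destination) pair."""
    return {(dev, dst): trace(model, dev, dst)
            for dev in model for dst in destinations}

# Hypothetical software copy of a three-device network.
model = {
    "SEA": [Rule("18.10.11.0/24", "LAX")],
    "LAX": [Rule("18.10.11.0/24", "IAD")],
    "IAD": [Rule("18.10.11.2/32", "DELIVER")],
}
index = build_behavior_index(model, ["18.10.11.2", "18.10.11.9"])
print(index[("SEA", "18.10.11.2")])
print(index[("SEA", "18.10.11.9")])
```

The index plays the role of the database in the notes above: once it is built, questions about where traffic can go become lookups, including for the address 18.10.11.9 that no host has ever sent to.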
It’s all potential traffic. We’ve traced through every possible path that traffic could take. So for any packet of your choice, you can see what will happen to it. You can see if it’ll get dropped, or if it passes through, and how it’s changed. You can see everything relevant to the story of any potential packet. What makes this model powerful is that we can reason about packets the network has not even seen.
Note what is not covered in the model.
We don’t look at interface counters.
We don’t look at flow counters.
We don’t look at sampled counters.
We don’t look at probes.
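To make "packet in, packet out" concrete, here is a small sketch of a single device model that reports whether a packet is dropped or forwarded and how it is changed on the way. Everything in it is an assumption made for illustration: the `Packet` fields, the ACL expressed as a set of denied ports, the destination NAT table, the interface name, and the example addresses other than 18.10.11.2.

```python
from dataclasses import dataclass, replace
from ipaddress import ip_address, ip_network

@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    dport: int

def packet_in_packet_out(packet, denied_ports, nat_table, routes):
    """Return (verdict, output packet, egress interface) for one device.

    `routes` is an ordered list of (prefix, interface); the first match wins.
    """
    if packet.dport in denied_ports:                     # ACL stage
        return "dropped by ACL", None, None
    out = replace(packet, dst=nat_table.get(packet.dst, packet.dst))  # NAT stage
    for prefix, egress in routes:                        # forwarding stage
        if ip_address(out.dst) in ip_network(prefix):
            return "forwarded", out, egress
    return "dropped: no route", None, None

# Hypothetical device state.
denied_ports = {23}
nat_table = {"203.0.113.10": "18.10.11.2"}
routes = [("18.10.11.0/24", "eth1")]

web = Packet(src="10.0.0.5", dst="203.0.113.10", dport=443)
telnet = Packet(src="10.0.0.5", dst="203.0.113.10", dport=23)
print(packet_in_packet_out(web, denied_ports, nat_table, routes))
print(packet_in_packet_out(telnet, denied_ports, nat_table, routes))
```

Neither packet has to exist on the wire: the answer comes from configuration and state alone, which is the difference from counters and probes.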
Hi, I’m Behram Mistree. And I’m going to be doing a couple of live demos of Forward Verify, and hopefully showing you that it’s going to make your lives easier and solve a bunch of your problems.
So, before all of you got here, we talked to customers.
We talked to network engineers.
We did a lot of reading.
We did a lot of testing.
And what we were looking for were good examples of outages.
We wanted those outages to be:
* Nasty
* Potentially catastrophic
* Real
And that’s what this is.
A network and a set of steps that caused an outage in the network.
Now let’s not focus on the details of the outage here in the bottom left corner. We’ll get into that in a bit. Let’s just take a look at the network itself.
To answer this fundamentally important question, "Is my network working?", I had a network engineer:
* Log into a bunch of boxes
* Run a bunch of commands
* Parse their output and
* Get back to me
Now, let’s look at another way.
Before, I had to log into all these boxes and execute a bunch of commands. Now, I just press a single button and get the answer whenever I want.
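One way to picture the single button is as a set of checks evaluated against a model of the network. The sketch below encodes the three requirements from the slides (reachability, multiple paths, all port-channel members in use) as predicates. It is an assumption-laden illustration: the forwarding graph, the port-channel record, and the function names are hypothetical and do not reflect the demo topology or the Forward Verify API.

```python
def all_paths(graph, src, dst, path=None):
    """Enumerate loop-free paths from src to dst in a directed forwarding graph."""
    path = (path or []) + [src]
    if src == dst:
        return [path]
    paths = []
    for nxt in graph.get(src, []):
        if nxt not in path:  # never revisit a node on the current path
            paths.extend(all_paths(graph, nxt, dst, path))
    return paths

def run_checks(graph, port_channels, src="CLIENT", dst="SERVER"):
    """Evaluate the three requirements and return a name -> pass/fail map."""
    paths = all_paths(graph, src, dst)
    return {
        "traffic_flows": len(paths) >= 1,
        "multiple_paths": len(paths) >= 2,
        "all_port_channel_members_used": all(
            set(pc["members"]) == set(pc["forwarding"])
            for pc in port_channels.values()
        ),
    }

# Hypothetical forwarding graph: which neighbors each node sends traffic to.
graph = {
    "CLIENT": ["SEA"],
    "SEA": ["LAX", "LGA"],
    "LAX": ["IAD"],
    "LGA": ["IAD"],
    "IAD": ["SERVER"],
}
# Hypothetical port channel with one configured member not carrying traffic.
port_channels = {"po1": {"members": ["eth1", "eth2"], "forwarding": ["eth1"]}}

report = run_checks(graph, port_channels)
print(report)
```

In this made-up network the first two requirements pass while the port-channel check fails, the kind of problem that stays hidden as long as traffic still gets through.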
Now let’s step back for a second and think about what that means.
Every day, you’re betting your business on this.
You’re betting your business on having skilled network engineers that know:
* What it means for your network to be working
* How to log into these boxes, run their commands, and verify that they’re working
* And that those engineers have the time and capacity to do that frequently enough that you’re going to catch important issues early and before they cascade.
And for the rest of this talk, I’m going to show you how that bet can go wrong, and what happens when it does.
In the previous demo, we saw how Forward Verify could have prevented an outage. We’re going to continue with that theme in this demo by focusing on one component of the entire Forward Verify experience, called Predefined Checks.