2. http://intrbiz.com chris@intrbiz.com
Hello!
● I’m Chris
○ IT jack of all trades
● Mostly a PostgreSQL Consultant
○ Full stack:
■ from electronic design to web dev
● Very much into Open Source
○ Started a monitoring system project a few years ago
○ Big openSUSE and PostgreSQL fan
● Been using and playing with Ceph for a couple of years
○ Built a small VM farm with Ceph for shared storage
Routed Fabrics, Huh?
● Essentially we make servers participate in routing
○ Every network link the server has is in use, active/active
○ Every server takes part in the routing protocol
○ Routing protocol deals with device and link failures
■ Data just takes another path in the event of a fault
● Equal Cost Multi Path (ECMP) is used to efficiently move traffic
○ IP packets are routed over all available links
○ TCP streams don’t get split across more than one path
■ Single stream is still limited to the bandwidth of your links
○ E.g. with 4x 10GbE NICs we can push 40Gb/s of traffic in aggregate
■ An individual TCP stream maxes at 10Gb/s
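To make the per-flow behaviour concrete, here is a minimal sketch of hash-based path selection. It is not the Linux kernel's actual ECMP hash: the SHA-256 hash, the function names, the source address 172.26.28.1 and the port numbers are all assumptions for illustration. The next hops are borrowed from the routing table shown later in the talk. The point is only that one 5-tuple always lands on the same link, while many flows spread over all four.

```python
import hashlib
from collections import Counter

# Next hops borrowed from the talk's `ip route` output for 172.26.28.2.
NEXT_HOPS = [
    ("172.31.1.10", "eth7"),
    ("172.31.1.14", "eth6"),
    ("172.31.2.10", "eth4"),
    ("172.31.2.14", "eth5"),
]


def pick_next_hop(src_ip, src_port, dst_ip, dst_port, proto="tcp",
                  next_hops=NEXT_HOPS):
    """Map a flow's 5-tuple onto one next hop.

    Illustrative hash only (an assumption), not what the kernel uses.
    """
    key = f"{proto}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


def spread(flows):
    """Count how many flows end up on each outgoing device."""
    return Counter(pick_next_hop(*flow)[1] for flow in flows)


if __name__ == "__main__":
    # Hypothetical flows: same hosts, 1000 different source ports.
    flows = [("172.26.28.1", 40000 + i, "172.26.28.2", 6800)
             for i in range(1000)]
    print(spread(flows))
```

Because a flow never changes link, a single stream tops out at one NIC's bandwidth, and the aggregate only approaches 4x when there are enough distinct flows to fill every link.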
The Build
● My setup is about as small as you can go
● It's my R&D setup
● It's only two switches
● But it's about showing that these approaches work even at small scale
○ All traffic is still routed
○ We still get all the benefits of a Routed Fabric
○ We can use cheap commodity switching
○ You don't need super high end kit to get efficiency and speed
● Yes, it's not a real Clos topology; you need a bigger problem domain for that
● This is about thinking about different ways of doing things
Et Voilà
$> ip route
172.26.28.2 proto zebra metric 20
    nexthop via 172.31.1.10 dev eth7 weight 1
    nexthop via 172.31.1.14 dev eth6 weight 1
    nexthop via 172.31.2.10 dev eth4 weight 1
    nexthop via 172.31.2.14 dev eth5 weight 1
172.26.28.3 proto zebra metric 20
    nexthop via 172.31.1.18 dev eth7 weight 1
    nexthop via 172.31.1.22 dev eth6 weight 1
    nexthop via 172.31.2.18 dev eth4 weight 1
    nexthop via 172.31.2.22 dev eth5 weight 1
...
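If you want to check that every destination really has all four paths installed, a small parser over this output is enough. The sketch below is an assumption-laden helper, not part of the talk: the function name is made up, and it only understands the exact shape shown on this slide (a destination line followed by `nexthop via ... dev ... weight ...` lines).

```python
SAMPLE = """\
172.26.28.2 proto zebra metric 20
nexthop via 172.31.1.10 dev eth7 weight 1
nexthop via 172.31.1.14 dev eth6 weight 1
nexthop via 172.31.2.10 dev eth4 weight 1
nexthop via 172.31.2.14 dev eth5 weight 1
"""


def parse_multipath_routes(text):
    """Return {destination: [(via, dev, weight), ...]} for multipath routes.

    Only handles the simple key/value layout shown on the slide.
    """
    routes = {}
    current = None
    for raw in text.splitlines():
        fields = raw.split()
        if not fields or fields[0] == "...":
            continue
        if fields[0] == "nexthop":
            if current is None:
                continue  # nexthop without a destination line: ignore
            # Pair up "via X dev Y weight Z" into a dict.
            opts = dict(zip(fields[1::2], fields[2::2]))
            routes[current].append(
                (opts.get("via"), opts.get("dev"), int(opts.get("weight", 1)))
            )
        else:
            current = fields[0]
            routes[current] = []
    return routes


if __name__ == "__main__":
    for dest, hops in parse_multipath_routes(SAMPLE).items():
        print(dest, len(hops), "paths via", [dev for _, dev, _ in hops])
```

On newer iproute2 releases, `ip -j route` emits JSON, which is a sturdier input than scraping text.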
Caveats
● Make sure that MTUs are configured correctly and match
○ OSPF runs directly on IP as its own protocol (number 89); if MTUs are mismatched, neighbours fail to reach full adjacency
● Label your cables
○ Swapping cables around will break things
● Quagga will only set a default route if no default route is already defined
○ OSPFd needs: `default-information originate metric-type 1`
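For the MTU caveat, a quick local sanity check can be sketched as below. It assumes a Linux host, where each interface's MTU is readable from `/sys/class/net/<iface>/mtu`; the function names are hypothetical. It only compares the interfaces on one machine, so the far end of each link (switch or peer server) still has to be checked separately.

```python
from pathlib import Path


def read_mtus(ifaces, sysfs_root="/sys/class/net"):
    """Read the configured MTU of each named interface from sysfs (Linux)."""
    root = Path(sysfs_root)
    return {name: int((root / name / "mtu").read_text().strip())
            for name in ifaces}


def odd_ones_out(mtus):
    """Return the interfaces whose MTU differs from the most common value."""
    if not mtus:
        return {}
    values = list(mtus.values())
    common = max(set(values), key=values.count)
    return {name: mtu for name, mtu in mtus.items() if mtu != common}
```

On one of the fabric hosts, `odd_ones_out(read_mtus(["eth4", "eth5", "eth6", "eth7"]))` returning an empty dict means the four fabric-facing NICs agree with each other.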
Further Reading
● Intro to Clos networks
○ https://en.wikipedia.org/wiki/Clos_network
● Google white paper on their Clos topologies
○ https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43837.pdf
● Cumulus on Clos and ECMP:
○ https://cumulusnetworks.com/blog/celebrating-ecmp-part-one/
● Benefits of ditching layer 2
○ https://thenewstack.io/ditch-pitfalls-layer-2-networks-modern-data-center-design/