When you work in managed services, you often find yourself looking at networks that have some unusual traits. You effectively adopt these networks, and sometimes those traits sneak up on you. This is the tale of one such network and what I can only describe as a BGP BlackHole.
The network in question can be simplified for the purposes of this blog; in addition, we will look at routing in one direction only. We have…
Two campus sites “North” and “South”,
Two Data Centre sites “East” and “West”.
Both campus sites connect to the Data Centre sites via separate MPLS networks. And between them all… a niffler's pocket full of BGP configuration. Again, to keep this simple, there are no issues with the northern networks and sites.
Traffic between North and South should always prefer the route via the West; this is controlled through a combination of local preference and AS paths. Interestingly, however, this is controlled at the MPLS CPEs, not within the core East/West Data Centre devices.
In the North, there is the Network 10.10.10.0/24. The path to this network propagates down through various BGP peers, through the North MPLS cloud, through both East and West DCs, through the South MPLS cloud and finally into South.
All traffic routes via the West (as controlled by the southern MPLS). If the West side of the network goes down, all traffic fails over to the East (as it should). Everyone is happy. No one really notices.
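The preference for the West can be pictured with a small Python sketch. It only models the two attributes mentioned above (local preference first, then AS path length), and the local preference values and AS numbers are made up for illustration; they are not taken from the real network.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    prefix: str
    via: str          # hypothetical label for the path, e.g. "West"
    local_pref: int   # higher value wins
    as_path: tuple    # shorter path wins when local_pref ties

def best_path(routes):
    """Pick the preferred route: highest local preference, then shortest AS path."""
    return max(routes, key=lambda r: (r.local_pref, -len(r.as_path)))

# Illustrative values only (assumed, not from the real configuration).
candidates = [
    Route("10.10.10.0/24", "West", local_pref=200, as_path=(65001, 65010)),
    Route("10.10.10.0/24", "East", local_pref=100, as_path=(65002, 65010)),
]

print(best_path(candidates).via)      # West
print(best_path(candidates[1:]).via)  # East (the only path left if West is gone)
```

The second call shows the failover case: with the West path withdrawn, the East path wins by default.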
But once the West side returns to service, all traffic to the destination network starts to FAIL!
Why?
But why? All traffic should return to the West; after all, this is the preferred route via local preference. If it doesn't fail back, at the very least all traffic should keep travelling via the East. Surely there shouldn't be a problem.
To understand why this happens, let’s focus on the southern part of the network and the route propagation during a West side failure.
"Normal" BGP Operation (Business as usual)
When the West Core learns of the route via the North, it pushes this route out to the South MPLS, where a local preference is added (and an AS overwrite).
The route is then pushed to all other southern MPLS devices and, IMPORTANTLY, gets advertised up to the East Core. Due to the AS overwrite that occurred earlier, the East Core now believes the best route to the North is via the South!
(The diagram below shows the propagation of the route 10.10.10.0/24. To understand how traffic would flow towards this network, simply invert the arrows' direction.)
Because the East core is now receiving a route for 10.10.10.0/24 from both the North and the South (be prepared to read this sentence multiple times), BGP’s natural loop prevention kicks in: the East core WILL NOT advertise a network back out of the interface from which it learned that same network. This should hopefully bring back memories of switches and their broadcast/flood behaviour, where a frame is "forwarded out of every interface except the one it was received on".
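As a rough illustration of that rule, here is a minimal Python sketch. The peer names are hypothetical labels for this simplified topology, and the function models only the "not back to where the best route came from" behaviour described above, nothing else about BGP.

```python
def advertise_to(peers, best_learned_from):
    """Peers that receive the best route: everyone except its source."""
    return [p for p in peers if p != best_learned_from]

# Hypothetical peer labels for a core in this simplified network.
core_peers = ["North MPLS", "South MPLS"]

# Best route learned from the South: nothing is sent back to the South.
print(advertise_to(core_peers, best_learned_from="South MPLS"))  # ['North MPLS']

# Best route learned from the North: the South does receive it.
print(advertise_to(core_peers, best_learned_from="North MPLS"))  # ['South MPLS']
```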
The West fails…
Suddenly, the valid route to the northern network is lost. In that moment the Eastern core starts pushing the route it knows into the southern MPLS. This propagates as expected, and traffic all fails over with a minor blip at most.
So far, no one has noticed a problem and the network has stayed up.
The West Returns...
The West returns and picks up this new route from the East. Again, loop prevention kicks in. This means the Western core is not able to advertise the northern route into the southern MPLS. In turn, the Western path does not exist in the South and therefore cannot have the local preference applied.
The fundamental failure here now becomes asymmetric routing.
Traffic from the North traverses the West, and its return traffic comes back via the East. Traffic from the South traverses the East, and its return traffic comes back via the West.
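The mismatch between forward and return paths can be written down as a tiny Python check. The hop lists are simplified labels for the sites in this post, not real device names.

```python
def is_asymmetric(forward_path, return_path):
    """True when the return path is not simply the forward path reversed."""
    return list(forward_path) != list(reversed(return_path))

# Simplified hop labels, as used throughout this post.
north_to_south = ["North", "West", "South"]
south_to_north = ["South", "East", "North"]

print(is_asymmetric(north_to_south, south_to_north))  # True

# Business as usual: both directions use the West.
print(is_asymmetric(["North", "West", "South"], ["South", "West", "North"]))  # False
```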
This is the blackhole. Not just traffic getting lost, but valid routes too. Changes in the North have large impacts on southern routing. Traffic gets lost and routes change unexpectedly. This is a situation you don't want to be in, where deterministic routing is lost.
The Fix?
A hard reset of the BGP sessions from the East cores to the southern MPLS peers forces the routes from the East to be dropped, allowing the West to inject its valid routes back into the southern network.
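The whole sequence, including the reset, can be replayed with a toy Python model. This is a sketch under a big simplifying assumption: a core that hears the prefix back from the South prefers that copy and stays silent towards the South, while a core that hears nothing from the South advertises. The `settle` function and its rule are illustrative, not a real BGP implementation.

```python
CORES = ("West", "East")  # West is checked first, so it advertises first when nobody else is

def settle(core_up, advertising):
    """Return the set of cores advertising 10.10.10.0/24 into the South MPLS."""
    advertising = {c for c in advertising if core_up[c]}  # cores that are down withdraw
    for core in CORES:
        if not core_up[core] or core in advertising:
            continue
        heard_from_south = bool(advertising - {core})
        if not heard_from_south:
            advertising.add(core)  # nothing heard from the South, so advertise
    return advertising

both_up = {"West": True, "East": True}

normal = settle(both_up, set())
west_down = settle({"West": False, "East": True}, normal)
west_back = settle(both_up, west_down)
# Hard reset of the East sessions to the South: East's advertisement is dropped.
after_reset = settle(both_up, west_back - {"East"})

print(normal, west_down, west_back, after_reset)
# {'West'} {'East'} {'East'} {'West'}
```

The third state is the interesting one: the West is back, yet only the East is advertising into the South, so the preferred path never reappears until the East is reset.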
The resolution?
Multiple options exist: rules to prevent certain networks being advertised back into the East/West cores, bringing routing decisions into the cores, permitting asymmetric routing, and more… Take your pick.
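To give a feel for the first option, here is a minimal Python sketch of the filtering logic, using the standard `ipaddress` module. The function name, the list of protected prefixes and the second test prefix are assumptions for illustration; on real devices this would be expressed in the vendor's own filter or policy syntax.

```python
import ipaddress

# Prefixes that originate in the North and should never be advertised
# from the South MPLS back up to the East/West cores (illustrative list).
NORTHERN_PREFIXES = [ipaddress.ip_network("10.10.10.0/24")]

def permit_towards_core(prefix):
    """Return False for northern prefixes (or subnets of them), True otherwise."""
    net = ipaddress.ip_network(prefix)
    return not any(net.subnet_of(north) for north in NORTHERN_PREFIXES)

print(permit_towards_core("10.10.10.0/24"))    # False: northern route, filtered
print(permit_towards_core("10.10.10.128/25"))  # False: subnet of a northern route
print(permit_towards_core("192.168.50.0/24"))  # True: hypothetical southern prefix
```

With a rule like this in place, a core would never hear the northern prefix from the South, so it would have no reason to stay silent towards the South.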
This has honestly been one of the most entertaining scenarios to troubleshoot. What is the strangest network you have adopted? Leave your answer in the comments.