As I discussed in the last tip, MPLS has become a trusted transport option for convergent networks that require guaranteed levels of service. The major carriers supporting MPLS services have built out robust transport services using this technology. Unfortunately, convergent technologies require a much higher level of availability than the typical data protocols. In most cases, this means increased levels of redundancy in an organization to support failover.
Most large organizations already have redundant links into their data centers and some form of backup link (ISDN or DSL) at the remote sites, so the concept of high availability is not foreign to them. What may be foreign is how to enable high availability in the new MPLS world.
Let's refresh on some MPLS terms before beginning a discussion about how to enable high availability and load sharing utilizing MPLS transport:
- Customer edge (CE) router: This is the router that sits in all of your sites and provides connectivity to the WAN.
- Provider edge (PE) router: This is the router that sits in the provider's central office and that the CE connects to.
- Local loop: This is the circuit that connects the CE router(s) to the PE router(s). It can be Frame Relay, ATM, Ethernet, private line or another access type.
The CE router and the PE router communicate via IP over the last-mile local loop. High availability starts at the physical layer, with redundant local loops.
To facilitate high availability, a dynamic routing protocol must run between the CE and PE routers, and the CE and PE routers themselves must be redundant. The recommended protocol to run between the CE and PE routers is the Border Gateway Protocol (BGP). OSPF can provide high-availability routing, but BGP is more effective when it comes to load sharing. The CE and the PE run eBGP, and the two CEs can run iBGP between them. The CE routers are the most important in this discussion because the CE router configurations are done mostly by the customers of MPLS services.
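As a small illustration of the session layout just described, the sketch below classifies each BGP session by comparing AS numbers: a session between routers in different autonomous systems is eBGP, and one between routers in the same AS is iBGP. The router names and the private AS numbers are assumptions made for the example, not values from any carrier.

```python
# Hypothetical AS numbers from the private range, chosen for illustration only.
CUSTOMER_AS = 65001
PROVIDER_AS = 65000


def session_type(local_as: int, peer_as: int) -> str:
    """Return 'iBGP' when both ends share an AS number, otherwise 'eBGP'."""
    return "iBGP" if local_as == peer_as else "eBGP"


# Dual-homed site: each CE peers with its own PE, and the two CEs peer with each other.
sessions = {
    ("CE1", "PE1"): session_type(CUSTOMER_AS, PROVIDER_AS),
    ("CE2", "PE2"): session_type(CUSTOMER_AS, PROVIDER_AS),
    ("CE1", "CE2"): session_type(CUSTOMER_AS, CUSTOMER_AS),
}

for (a, b), kind in sessions.items():
    print(f"{a} <-> {b}: {kind}")
```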
To enable high availability, both links to the PE must be up and forwarding traffic, with each acting as a backup for the other. Standard dual-homed BGP routing best practices can facilitate the high availability. The mechanisms for configuring dual-homed, redundant BGP neighbors are outlined very well in this article, so I don't want to spend time discussing ISP dual homing.
The principles are the same, in that BGP is used. The key is that most of the carriers will allow customers to do both egress and ingress load sharing.
- Egress load sharing is defined as traffic directed toward the carrier: CE to PE
- Ingress load sharing is defined as traffic directed toward the enterprise: PE to CE
Ingress load sharing requires manipulation of the provider's routing decisions across the multiple links from the PE to the CE. This is done via manipulation of the BGP attributes. Ingress load sharing can be enabled in a variety of ways, but in all cases, the provider must recognize and support the BGP attributes required to support load sharing. Below are the BGP attributes for manipulating ingress traffic flows in a dual-homed, high-availability environment:
- AS-PATH Prepend: The CE routers prepend their own AS number to the AS-path BGP attribute of the routes advertised to the PEs. Because a longer AS path is less preferred, this can cause the PE routers to choose a different path from the one that would have been chosen using the unmodified AS-PATH. The carrier must support and recognize the prepended path.
- Multi-Exit Discriminator (MED): The CE can set the MED BGP attribute on each advertised route. The PE uses the MED to determine the path back to the CE, preferring the lower value.
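- Local Preference: Local preference itself is not carried across an eBGP session, so the CE cannot send it to the PE directly. Instead, the CE typically tags its routes with a BGP community that the carrier maps to a local preference on the PE, and the PE acts on that value for traffic going back to the CE. Most carriers will publish the local preferences that they will acknowledge.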
It is important to understand that the CE owner must set the parameters that the carriers will act upon.
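To make the effect of these attributes concrete, here is a minimal sketch of how a PE might choose between two paths back to a dual-homed site. It models only three steps of BGP best-path selection (highest local preference, then shortest AS path, then lowest MED) and leaves out the rest of the real decision process. The router names, AS number and attribute values are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    via: str                # hypothetical name of the CE that advertised the route
    as_path: tuple          # AS numbers as seen by the PE
    med: int = 0            # lower is preferred
    local_pref: int = 100   # higher is preferred; assigned on the provider side


def prepend(as_path, asn, count):
    """Return a new AS path with `asn` added `count` extra times at the front."""
    return (asn,) * count + tuple(as_path)


def best_path(paths):
    """Simplified selection: highest local preference, shortest AS path, lowest MED.
    MED is only comparable here because every path comes from the same customer AS."""
    return min(paths, key=lambda p: (-p.local_pref, len(p.as_path), p.med))


CUSTOMER_AS = 65001  # assumed private AS number

# CE2 prepends twice, so the PE prefers the path through CE1 for ingress traffic.
primary = Path(via="CE1", as_path=(CUSTOMER_AS,))
backup = Path(via="CE2", as_path=prepend((CUSTOMER_AS,), CUSTOMER_AS, 2))
chosen = best_path([primary, backup])

# If the CE1 link fails, its route is withdrawn and the prepended path takes over.
failover = best_path([backup])

# With equal AS-path lengths, the lower MED wins instead.
med_choice = best_path([
    Path(via="CE1", as_path=(CUSTOMER_AS,), med=100),
    Path(via="CE2", as_path=(CUSTOMER_AS,), med=50),
])

print(chosen.via, failover.via, med_choice.via)
```

Advertising different prefixes with different prepend counts or MED values on each CE spreads ingress traffic across both links while keeping each link available as a backup for the other.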
Egress load sharing is most commonly done with the local preference command, with future support for BGP multipath. The best bet is to begin by providing round-robin local preference on each egress CE for the remote site prefixes.
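The round-robin idea can be sketched as follows: each remote-site prefix is assigned a preferred egress CE in turn, and that CE gets a higher local preference for the prefix than its partner does. Because the CEs exchange routes over iBGP, the higher value wins on both routers, and the other CE still holds a usable path if the preferred link fails. The prefixes, router names and local preference values below are hypothetical.

```python
def round_robin_local_pref(prefixes, ces, preferred=200, fallback=100):
    """Build a per-CE table of local preference values, rotating the
    preferred egress CE across the list of remote-site prefixes."""
    policy = {ce: {} for ce in ces}
    for index, prefix in enumerate(prefixes):
        winner = ces[index % len(ces)]
        for ce in ces:
            policy[ce][prefix] = preferred if ce == winner else fallback
    return policy


# Hypothetical remote-site prefixes and CE names.
remote_prefixes = ["10.10.0.0/16", "10.20.0.0/16", "10.30.0.0/16", "10.40.0.0/16"]
policy = round_robin_local_pref(remote_prefixes, ["CE1", "CE2"])

for ce, table in policy.items():
    for prefix, local_pref in table.items():
        print(f"{ce}: {prefix} local-preference {local_pref}")
```

The resulting table is only a plan; it would still need to be translated into the route policy syntax of whatever platform the CE routers run.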
The keys to high availability and load sharing are not only the physical connections between the CE and PE but the routing protocols and the routing policies that are used between them. In fact, this represents a great deal of the complexity of deploying a high-availability solution capable of load sharing.
About the author:
Robbie Harrell (CCIE#3873) is the National Practice Lead for Advanced Infrastructure Solutions for AT&T. He has more than 10 years of experience providing strategic, business and technical consulting services. Robbie lives in Atlanta and is a graduate of Clemson University. His background includes positions as a principal architect at International Network Services, Lucent, Frontway and Callisma.