Anyone who knows Border Gateway Protocol (BGP) already knows it's hard, but the secure and stable IP network results make it worth it. You may have worked your way from simple through advanced troubleshooting, but there's more. This essential BGP guide dives deep into BGP IP network design and challenges that present themselves the more you work with it. Check out using BGP in a large network design, getting better scale with MPLS in the core and BGP on the edge, and upping network security so customers can't accidentally harm your BGP routing data (as well as the Internet at large).
Five essential reasons for BGP in your network
Yes, Border Gateway Protocol (BGP) has the reputation of being the hardest routing protocol to design, configure and maintain. But while this notion has some validity, there are situations where BGP is the only tool available to get the job done, or where deploying BGP throughout your network can increase its security or stability.
BGP's complexity is primarily due to the large number of attributes it can attach to a route, its complex route selection rules, and the manual configuration of neighboring routers (which are discovered automatically in most other routing protocols) needed to ensure the security of the routing information exchange. Having a large number of configuration options and BGP-specific filtering mechanisms available on routers from different major vendors doesn't help either.
In this guide, I'll give you five scenarios where BGP is the best match for your network requirements.
- Internet service advantages
If you're an Internet service provider (ISP), running BGP in your network is almost a must. I've seen consumer-focused ISPs that tried to get around this recommendation and used BGP solely to peer with their upstream ISPs, but they eventually had to bite the bullet and deploy BGP to increase the stability of their network, provide end-to-end quality of service or penetrate enterprise markets. Enterprise-focused ISPs have to run BGP from the start to support their multi-homed customers.
- Layer 3 VPN services
We've seen a variety of technologies used to implement Layer 3 VPN services in recent years, and MPLS-based VPNs have undoubtedly proven to be the most scalable solution, partly due to using BGP as the underlying routing protocol. Fortunately, you don't have to deploy BGP everywhere in your network if you want to deploy MPLS/VPN solutions. It's enough to deploy BGP on the Provider Edge (PE) routers that connect your VPN customers and on a few core devices that act as route servers (these devices should not be expected to forward heavy traffic loads).
- Increasing network stability
Although I've met networking engineers trying to use BGP as the sole routing protocol in their networks, that's not how you should use it. Any decent BGP design should rely on another faster routing protocol (for example, OSPF, EIGRP or IS-IS) to provide core routing in the network, with BGP responsible for the edge/customer routing.
With the separation of core and edge routing into two routing protocols, your network core becomes more stable, as edge problems cannot disrupt the core. This design has been used very successfully in large enterprise networks with haphazard addressing schemes that defied attempts at route summarization. It should also be used in almost all service provider environments. You should never carry your customers' routes in your core routing protocol, as a customer's internal problems could quickly affect the stability of your own network.
I must note that it's amazing what you can see in the field. I saw an ISP running OSPF with its customers a few years ago. In that setup, a rogue or ignorant customer could have easily disrupted the whole service provider network.
- Automatic response to denial-of-service attacks
Among other peculiarities, BGP allows you to specify any IP address as the next hop for an IP prefix. This property is most often used to ensure optimum routing across a BGP autonomous system. You can also use it to implement network-wide sinkholes and remote blackholes to quickly stop worms and denial-of-service attacks on your network.
Please note that you don't have to migrate your routing to BGP if you want to use these mechanisms. To implement remote blackholes, it's enough that you deploy BGP on strategic points in your network and link them via BGP sessions with a central router through which you'll insert the IP addresses to block.
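Here's a minimal Python sketch of the remote blackhole idea, not a router implementation. It assumes every edge router has a static route that sends one reserved next-hop address to a discard interface; the address 192.0.2.1, the `EdgeRouter` class and its method names are all hypothetical.

```python
import ipaddress

# Hypothetical address reserved for blackholing; every edge router is
# assumed to have a static route pointing it at a discard interface.
BLACKHOLE_NEXT_HOP = "192.0.2.1"


class EdgeRouter:
    def __init__(self):
        self.discard_next_hops = {BLACKHOLE_NEXT_HOP}
        self.bgp_routes = {}  # ip_network -> next-hop address (str)

    def receive_update(self, prefix, next_hop):
        """Install a route learned over the BGP session with the central router."""
        self.bgp_routes[ipaddress.ip_network(prefix)] = next_hop

    def forward(self, destination):
        """Return 'drop' or 'forward' for a packet, using longest-prefix match."""
        addr = ipaddress.ip_address(destination)
        matches = [net for net in self.bgp_routes if addr in net]
        if not matches:
            return "forward"  # no BGP route: regular routing applies
        best = max(matches, key=lambda net: net.prefixlen)
        if self.bgp_routes[best] in self.discard_next_hops:
            return "drop"
        return "forward"


router = EdgeRouter()
# The central router inserts the attacked host with the blackhole next hop.
router.receive_update("203.0.113.7/32", BLACKHOLE_NEXT_HOP)
print(router.forward("203.0.113.7"))  # drop
print(router.forward("203.0.113.8"))  # forward
```

The only thing the central router has to do is advertise the address to block; the edge routers drop the traffic locally because the next hop resolves to the discard route.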
- Large-scale QOS or web caching deployment
Not only does BGP carry a number of attributes describing the IP routes, it allows you to add extra baggage to every IP route it advertises in the form of BGP communities that are totally transparent to BGP (unless you're manually configuring route selection rules to use them) but propagated throughout the network.
A few technologies completely unrelated to BGP allow you to use these attributes to implement large-scale designs. For example, Quality-of-Service Policy Propagation with BGP (QPPB) allows you to set QoS bits for specific BGP destinations based on BGP communities and other BGP attributes. Similarly, you can control the Web Cache Communication Protocol (WCCP)-based web caching policy with BGP.
Even though BGP is categorized as a complex and hard-to-configure routing protocol, its deployment in large enterprise networks can bring significant benefits, and it is almost mandatory in a service provider environment.
Designing large-scale BGP networks
Considering the relative complexity of Border Gateway Protocol (BGP), it's not surprising that you would consider various design aspects before rushing head-on into implementing it in your network. If nothing else, a good design and careful planning could save you a few tense troubleshooting sessions.
In this article, I'll try to give you a few generic guidelines that you should follow when designing your BGP network. Don't forget that experience comes only with practice, however. When designing your first few BGP networks, you should get expert help, either in-house, from your vendor or from a qualified professional services organization.
Use a public autonomous system number
BGP uses autonomous system (AS) numbers to track the networks through which traffic has to pass to reach its final destination. AS numbers visible in the public Internet have to be globally unique and are allocated by various Internet registries. If you want to offer public Internet services, having a public AS number is mandatory. If you are in a hurry and just need BGP to offer other IP-based services (for example, Layer 3 VPN services based on MPLS VPN), you could use a private AS number as specified in RFC 1930 (AS 64512 through AS 65535), but then you might face a challenging migration if you ever want to offer public Internet services.
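As a small illustration, here's a Python sketch that checks whether an AS number falls in the private range quoted above and whether an AS path would leak a private AS number to the public Internet. The function names are hypothetical, and the range boundaries are taken from the text (later RFCs refine the exact upper bound).

```python
# Private AS range as quoted in the text (RFC 1930).
PRIVATE_AS_FIRST = 64512
PRIVATE_AS_LAST = 65535


def is_private_as(asn):
    """True if the AS number lies in the private range quoted in the text."""
    return PRIVATE_AS_FIRST <= asn <= PRIVATE_AS_LAST


def leaks_private_as(as_path):
    """True if any AS number in the path is private.

    Such a path should not be advertised to the public Internet.
    """
    return any(is_private_as(asn) for asn in as_path)


print(is_private_as(65001))              # True
print(leaks_private_as([64500, 65001]))  # True
print(leaks_private_as([64500, 64501]))  # False
```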
Use BGP only in combination with another routing protocol
BGP was designed to be a robust, conservative routing protocol able to carry hundreds of thousands of IP prefixes. It was never meant to be the fast-converging protocol needed to implement modern IP-based services (for example, Voice-over-IP or Triple Play services). You should always use BGP on top of a modern, fast-converging Interior Gateway Protocol (IGP), for example OSPF or IS-IS. In such a design, the IGP provides optimum paths through the network core and BGP provides edge-to-edge routing across these paths.
Run internal BGP between loopback interfaces
BGP uses TCP as a reliable transport to exchange routing information between manually configured BGP peers (there is no neighbor discovery in BGP). TCP is always tied to a pair of local and remote IP addresses. Should any one of these become unreachable, the TCP session and consequently BGP routing would become disrupted even though the routers are still operational.
Internal BGP sessions (BGP sessions between routers in your own network) should thus always be run between loopback interfaces, ensuring that the TCP session stays operational as long as there is at least one path between the BGP neighbors (even though the physical interfaces through which the neighbors are reached might change).
External BGP neighbors are usually directly connected (your BGP router is directly attached to your customer's or peering partner's BGP router). The external BGP sessions are thus commonly run between adjacent IP addresses assigned to physical interfaces.
Run BGP in the whole network
Historically, some service providers tried to avoid running BGP in the whole network to reduce the memory requirements and CPU utilization of their routers, relying on ingenious designs that inevitably became too complex once their networks started to grow. It's best to accept the fact that BGP is inevitable in a serious service provider network and design the whole network for it from the very start.
Obviously, you don't need to run BGP on every router in your network. For example, dial-up servers or DSL concentrators can rely on default routing supplied by the network core, but the edge routers connecting enterprise customers could already need BGP to cater to the needs of the multihomed customers.
Statically configure advertised prefixes
If you're offering public Internet services, you have to advertise public IP address space assigned to you via various Internet registries into BGP. Sometimes the engineers try to reach this goal through a complex process of route redistribution from IGP into BGP and subsequent route aggregation within BGP. It's much simpler to advertise the exact prefixes you've been allocated on a few key BGP routers.
When you decide to split the routing of your Internet customers from your core routing (highly recommended) and carry customer IP prefixes in BGP, they could be redistributed from IGP (or from static routes on the edge routers), but tagged with the well-known NO_EXPORT community to prevent their propagation into adjacent autonomous systems.
NOTE: Different rules apply when you run BGP in MPLS VPN environments, where two-way redistribution between BGP and customer's IGP is very common.
Do not change BGP attributes within your network
Any routing protocol (BGP included) works best if all routers in the network have a consistent view of the network. To ensure consistent routing in your network, do not change any BGP attributes on updates sent to IBGP neighbors (even though most router vendors allow you to do that). On the other hand, it's OK to change BGP attributes on:
- Routes received from external BGP neighbors. Most commonly, the local preference attribute is set to indicate preferred/backup exit points.
- Routes redistributed into BGP from other sources. Some BGP attributes (for example, Multi-Exit Discriminator) are set automatically, others can be set on the redistributing router.
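To illustrate the first case, here's a Python sketch that sets local preference once, on the router that receives the route from an external neighbor, and then lets every router apply the same selection. The selection rule is deliberately simplified to two steps (highest local preference, then shortest AS path); real BGP has many more. The `Route` type, the exit-point names and the default value of 100 are assumptions for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    as_path: tuple
    local_pref: int = 100  # assumed default


def set_local_pref(route, exit_point, policy):
    """Applied only on the edge router that receives the route over EBGP."""
    return replace(route, local_pref=policy.get(exit_point, route.local_pref))


def best_route(routes):
    """Simplified selection: highest local preference, then shortest AS path."""
    return max(routes, key=lambda r: (r.local_pref, -len(r.as_path)))


policy = {"primary-exit": 200, "backup-exit": 50}  # hypothetical exit points
via_primary = Route("10.1.2.0/24", "X1", (64500, 64501, 64502))
via_backup = Route("10.1.2.0/24", "X2", (64503,))

tagged = [
    set_local_pref(via_primary, "primary-exit", policy),
    set_local_pref(via_backup, "backup-exit", policy),
]
print(best_route(tagged).next_hop)  # X1
```

Because the attribute is set once at the edge and left untouched afterwards, every router that runs `best_route` on the same set of routes reaches the same answer, which is the consistency the guideline is after.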
Redistribute external subnets into your IGP
Each IP prefix carried by BGP has a next hop attribute, specifying the IP address of the next-hop BGP router. It's the job of the IGP to figure out the optimum path toward the next hop.
By default, BGP advertises IP prefixes received from an external neighbor (from your peering partner, for example) with the next hop attribute pointing to the IP address of the external peer. This property allows you to implement perfect load sharing toward those Internet Exchange Points (IXPs) where you have deployed multiple routers for redundancy purposes. However, the external IP addresses advertised as the next hop by BGP have to be reachable; you should redistribute them into your IGP. Failure to do so might result in interesting troubleshooting exercises.
Note: If you haven't deployed multiple routers connected to the same IXP, you could also use an alternate design, where your edge BGP router resets the next hop attribute to point to its own loopback address.
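The dependency between BGP next hops and the IGP can be sketched in a few lines of Python. This is a simplified model, assuming the IGP table is a plain mapping from networks to outgoing interfaces; the function names, interface names and addresses are hypothetical.

```python
import ipaddress


def resolve_next_hop(next_hop, igp_table):
    """Longest-prefix match of a BGP next hop against the IGP table.

    Returns (network, interface), or None if the next hop is unreachable.
    """
    addr = ipaddress.ip_address(next_hop)
    matches = [(net, ifc) for net, ifc in igp_table.items() if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)


def usable_bgp_routes(bgp_routes, igp_table):
    """Keep only BGP routes whose next hop the IGP can reach."""
    return {
        prefix: nh
        for prefix, nh in bgp_routes.items()
        if resolve_next_hop(nh, igp_table) is not None
    }


# The subnet toward the external peer has been redistributed into the IGP.
igp_table = {ipaddress.ip_network("192.0.2.0/30"): "ixp-link"}
bgp_routes = {
    "10.1.2.0/24": "192.0.2.2",     # next hop covered by the IGP
    "10.9.0.0/16": "198.51.100.9",  # external subnet not redistributed
}
print(usable_bgp_routes(bgp_routes, igp_table))
# {'10.1.2.0/24': '192.0.2.2'}
```

The second route is silently unusable, which is exactly the kind of failure that leads to the troubleshooting exercises mentioned above.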
Use BGP route reflectors
Due to BGP loop avoidance rules, an IP prefix received from an internal BGP peer should not be advertised to another internal peer. Consequently, every BGP-speaking router in your autonomous system should have a BGP session with every other BGP-speaking router in your network. Obviously, the overhead of such a scheme in large service provider networks is enormous, and tools were developed years ago to make internal BGP scalable.
There are two approaches to scalable internal BGP: BGP route reflectors and BGP confederations. Confederations are rarely used; most designs use BGP route reflectors.
A BGP route reflector (RR) is a BGP router that is allowed to propagate IP prefixes between internal BGP neighbors (additional BGP attributes are used to detect loops). The route reflectors could be connected in a hierarchy; for example, a regional route reflector might be a client of a core route reflector. The hierarchy should not have too many levels, as each level introduces additional delay in the BGP convergence process.
You could use regular routers as BGP route reflectors with a low number of clients. In large networks, the core route reflectors should be dedicated devices that are not forwarding a significant amount of traffic.
For example, the distribution-layer routers connecting your Points-of-Presence to the network core could act as BGP RR for the BGP routers in the POP. The core route reflectors would then be dedicated boxes distributing BGP routes to all core- and distribution-layer routers.
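A quick back-of-the-envelope calculation shows why the full mesh doesn't scale. The Python sketch below assumes a single level of route reflectors, with every client peering with every reflector and the reflectors fully meshed among themselves; the function names are hypothetical.

```python
def full_mesh_sessions(routers):
    """IBGP sessions needed when every router peers with every other router."""
    return routers * (routers - 1) // 2


def route_reflector_sessions(clients, reflectors):
    """Sessions with one RR level: each client peers with every reflector,
    and the reflectors are fully meshed among themselves."""
    return clients * reflectors + full_mesh_sessions(reflectors)


print(full_mesh_sessions(100))          # 4950
print(route_reflector_sessions(98, 2))  # 197
```

With 100 BGP routers, two redundant route reflectors cut the number of sessions from 4,950 to 197, and adding a router means configuring two new sessions instead of 100.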
Use peer templates
Most router vendors allow you to configure a large number of options controlling BGP behavior toward individual BGP neighbors or per-neighbor inbound/outbound filtering policies. Keeping these settings consistent in an environment with a large number of BGP neighbors is a management nightmare. You can easily avoid it if you use configuration scalability tools (commonly called peer groups and peer templates).
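The idea behind peer templates is vendor-independent and can be sketched in Python: shared settings live in one place, and each neighbor adds only what is specific to it. The setting names below are hypothetical and don't correspond to any vendor's configuration syntax.

```python
# Shared policy for all customer-facing sessions (hypothetical setting names).
CUSTOMER_TEMPLATE = {
    "max_prefixes": 100,
    "inbound_filter": "customer-in",
    "send_communities": True,
}


def neighbor_config(address, remote_as, template, **overrides):
    """Build one neighbor's settings from a template plus per-neighbor overrides."""
    config = dict(template)     # copy, so the template itself is never modified
    config.update(overrides)    # per-neighbor exceptions win over the template
    config.update({"address": address, "remote_as": remote_as})
    return config


small = neighbor_config("192.0.2.10", 64500, CUSTOMER_TEMPLATE)
large = neighbor_config("192.0.2.14", 64501, CUSTOMER_TEMPLATE, max_prefixes=500)
print(small["max_prefixes"], large["max_prefixes"])  # 100 500
```

Changing the inbound filter for all customers then becomes a single edit to the template instead of one edit per neighbor.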
While BGP is undoubtedly a complex routing protocol, you can design reliable large-scale BGP networks based on well-known best practices and design guidelines including these:
- If at all possible, get a public AS number and use it.
- Run BGP throughout your network, at least on all of your core routers (unless you've deployed MPLS, in which case this is no longer a requirement).
- Scale your network with BGP route reflectors and peer templates.
- Always run BGP in combination with a fast IGP. Establish IBGP sessions between routers' loopback interfaces.
- Do not redistribute/aggregate routes into the public Internet. Use static IP prefix origination.
Scale your backbone with core MPLS, BGP on the edge
If you want to deploy Border Gateway Protocol (BGP) throughout your network, you have to run it on all core routers (and there are a number of reasons why you should)… or at least that was the traditional wisdom.
With the introduction of MPLS, you can run BGP only on the network's edges, reducing the memory requirements and CPU load on your core routers, while at the same time making them more stable.
To understand why MPLS technology has such an impact on your network, let's review the basic facts of BGP routing. When BGP advertises a route between routers in the same network (the same Autonomous System, or AS), the next hop of the route remains an IP address outside of the AS, as shown in the diagram below.
Note: Most other routing protocols make the next hop of the route the IP address of the adjacent router.
Consequently, when the routing tables are built on the routers in your autonomous system, all entries for IP prefix 10.1.2.0 point to the same next hop: the IP address of the X1 router (see diagram below).
If a Label Switched Path (LSP) were established between the POP router and the Internet Exchange Point (IXP) router for the IP destination X1, the packets toward the network 10.1.2.0 would travel across the network encapsulated in MPLS headers, and the core router would not need to have the BGP route toward the destination network (see diagram below).
The LSPs for all non-BGP destinations are built automatically once you enable MPLS with Label Distribution Protocol (LDP) in your network (unless you've configured LDP filters). The LSP between the POP and the IXP router is thus created automatically, and the POP router starts using it to send packets toward the IP network 10.1.2.0 as soon as it's created. BGP is thus no longer needed on the core router, as it never receives a non-MPLS-encapsulated IP packet for the network 10.1.2.0.
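The division of labor between the edge and the core can be modeled with a few dictionaries in Python. This is only a sketch of the forwarding logic: the /24 mask, the label values and the interface name are assumptions made for illustration.

```python
# POP (edge) router: BGP maps the external prefix to its next hop (X1),
# and LDP has supplied a label for the path toward X1.
edge_bgp_table = {"10.1.2.0/24": "X1"}   # prefix length is assumed
edge_label_for_next_hop = {"X1": 17}     # hypothetical label value

# Core router: no BGP at all, only a label forwarding table built by LDP.
core_label_table = {17: (23, "to-ixp")}  # in-label -> (out-label, interface)


def edge_impose_label(prefix):
    """Edge router: BGP lookup, then push the label for the BGP next hop."""
    next_hop = edge_bgp_table[prefix]
    return edge_label_for_next_hop[next_hop]


def core_switch(label):
    """Core router: swap the label without ever looking at the IP destination."""
    return core_label_table[label]


label = edge_impose_label("10.1.2.0/24")
print(core_switch(label))  # (23, 'to-ixp')
```

Note that the core router's table has one entry for X1 no matter how many BGP prefixes are reachable through it.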
Once you decide to rely on MPLS to provide the edge-to-edge transport across your network core, however, BGP has to be deployed on all edge routers (similar to the MPLS VPN designs). You can no longer use default routing toward an IXP or toward your network core, as your core routers cannot forward IP packets toward Internet destinations anymore. If you would like to retain default routing on the low-end access routers, you could use the following design:
- The core routers run only MPLS and core IGP. These routers should never have to forward non-labeled IP packets toward external destinations. The only IP traffic they should handle is the routing protocol updates and network management queries.
- The distribution layer routers run BGP and provide end-to-end transport across label switched paths established in the network core.
- The distribution layer routers advertise default route toward those access routers that do not run BGP.
- Access routers might have a full BGP routing table (needed for multi-homed customers), a partial BGP routing table (for example, only the routes toward your customers) or no BGP at all (in which case they would use the default route toward the closest distribution layer router for most of the traffic).
This design is very similar to IP-over-ATM designs used in early high-speed Internet backbones (when ATM was the only high-speed technology available). The only difference is in the backbone infrastructure, where ATM switches have been replaced with routers, significantly reducing per-port and per-switched-Gbps costs.
Before you rush to reconfigure your routers and remove BGP from your core, you have to consider the following caveat: The MPLS-only network core will perform its duties only if the LSPs established across the core with LDP always follow the shortest paths computed by the IP routing protocols. If a backbone router is restarted and becomes a hop on the shortest path across the core network before it has exchanged the LDP labels with its neighbors, the LSPs across the network will break and the transit traffic will be blackholed.
To remove the risk of broken core LSPs, you could deploy MPLS Traffic Engineering between the distribution-layer routers (if you enable MPLS TE in your network, the routers prefer MPLS TE paths over paths computed by your routing protocol), or you could configure slow IGP startup on your backbone routers (available only if you use OSPF or IS-IS as your core routing protocol).
Improving BGP services and security
In the previous sections, I've described how you can build large, scalable service provider networks using the Border Gateway Protocol routing protocol. But even the best-designed network will fail to meet your business goals unless it offers the services your customers want and is protected from active attacks by malicious users, as well as errors made by your peers or customers.
In this section, you'll find guidelines that will make your network more resilient and easier to maintain.
Plan for advanced services
The service provider industry entered the commodity phase a long time ago, with cost-based competition one of the major focuses. Nonetheless, there are still customers that would appreciate extra services you could offer them (for example, a comprehensive backup link implementation). These services are no longer leading-edge technology, as some major players have been offering them for years. But it still amazes me that you can find smaller service providers that simply cannot implement them properly.
Before rushing to your web browser trying to figure out what services to offer (based on what the competition is doing), it's worthwhile analyzing your particular market:
- Who are your target customers?
- What business goals are they trying to achieve by using your services?
- Which services should you offer to help them reach their goals?
Only after the analysis should you focus on designing the implementation of those services in your network.
Use BGP communities
There are two ways to implement additional services for your customers: You could plan them in advance and implement them network-wide, or you could implement a specific solution (most often a kludge) for each customer request. Obviously, the second approach results in a network that is hard to maintain and troubleshoot (more so if the implementation of a customer-specific solution is not followed by thorough documentation).
If you want to implement a scalable approach to BGP-related network services (for example, implementation of backup links or specific quality-of-service requirements), you should introduce BGP communities into your network. You can use them to mark customer routes as they enter your network and implement various policies based on communities attached to the customer routes at other places in the network. For example, you could use BGP communities to implement QoS marking of ingress traffic at the Internet Exchange Points.
Sometimes you would want to control the BGP communities yourself (you probably don't want the customers to select premium QoS parameters without your control), or you could help the customers use the communities you have defined to select the services they need (resulting in a self-service approach that minimizes your costs).
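Here's a Python sketch of the mark-once, act-elsewhere pattern: the customer-facing router attaches a community when the route enters the network, and a router at the Internet Exchange Point picks the QoS class from that community. The community values, service levels and class names are hypothetical.

```python
# Hypothetical communities and the QoS classes they select.
COMMUNITY_BY_SERVICE = {"premium": "64500:100", "standard": "64500:200"}
QOS_CLASS_BY_COMMUNITY = {"64500:100": "gold", "64500:200": "best-effort"}


def mark_customer_route(prefix, service_level):
    """Ingress edge: attach the community for the customer's service level."""
    return {"prefix": prefix, "communities": [COMMUNITY_BY_SERVICE[service_level]]}


def qos_class(route):
    """IXP router: choose the QoS class from the first known community."""
    for community in route["communities"]:
        if community in QOS_CLASS_BY_COMMUNITY:
            return QOS_CLASS_BY_COMMUNITY[community]
    return "best-effort"  # unmarked routes get the default class


route = mark_customer_route("203.0.113.0/24", "premium")
print(qos_class(route))  # gold
```

The IXP router needs no per-customer configuration; adding a premium customer only means marking that customer's routes at the edge.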
Protect your network
You cannot offer reliable services to your customers if you're vulnerable to outside denial-of-service (DoS) attacks. To stop the majority of brute-force attacks, you should implement inbound access-lists on all interfaces attached to external sources (your customers or peer networks). These access-lists should:
- Block all network control protocols that you don't plan to offer as a service. At the very minimum, you should not accept any routing protocol updates through these interfaces (apart from BGP).
- Block all traffic with source addresses in the private address space.
- Block all traffic from bogon prefixes.
- Block all traffic originating from the IP addresses belonging to your core network.
Note: To simplify the filters, allocate the IP addresses of your core devices from a contiguous block of IP address space.
The access-lists deployed on peering interfaces should also block all traffic originating from IP prefixes allocated to you by the Internet registries. On the customer-facing interfaces, you should also implement unicast Reverse Path Forwarding (uRPF) checks.
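The source-address part of these rules can be sketched in Python with the standard `ipaddress` module. The core block and the registry-allocated prefix below are hypothetical documentation addresses, the bogon list is left out because it changes over time, and the control-protocol and uRPF checks are outside the scope of this sketch.

```python
import ipaddress

PRIVATE_SPACE = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]
# Hypothetical: core devices numbered from one contiguous block.
CORE_BLOCK = ipaddress.ip_network("198.51.100.0/24")
# Hypothetical: address space allocated to you by the registry.
OWN_PREFIXES = [ipaddress.ip_network("203.0.113.0/24")]


def permit_source(source, interface_kind):
    """Return True if a packet with this source address may enter the network.

    interface_kind is 'customer' or 'peer'.
    """
    addr = ipaddress.ip_address(source)
    blocked = PRIVATE_SPACE + [CORE_BLOCK]
    if interface_kind == "peer":
        # Your own address space must never arrive from a peer.
        blocked = blocked + OWN_PREFIXES
    return not any(addr in net for net in blocked)


print(permit_source("10.1.1.1", "customer"))     # False
print(permit_source("203.0.113.5", "peer"))      # False
print(permit_source("203.0.113.5", "customer"))  # True
```

Keeping the core devices in one block means the core rule stays a single line however many routers you add.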
Guard your routing protocols
While it's important to stop the denial-of-service attacks from spoofed source IP addresses, it's even more important to protect your routing protocols. No traffic filter will help you if your routers have an invalid view of the network.
Ideally, the only routing protocol you would use on the external connections (with your peers and your customers) would be BGP. If you have to run any other routing protocol with your customers, accept only distance-vector protocols (RIP or EIGRP), as the technology they're using allows you to filter incoming updates. If you run a link-state protocol (OSPF or IS-IS) with your customer, you've lost all control of the stability of your network.
If at all possible, implement digital signatures in the routing protocol you're using with external neighbors to prevent route spoofing. Major router vendors support MD5 signatures in all distance-vector protocols.
Last but not least, filter all inbound routing updates. Your peers should not advertise private IP prefixes or prefixes belonging to you. Very small IP prefixes (those with a mask longer than /24) usually indicate a configuration error or an active attack. You should discard them unless you have special arrangements with your customers.
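These three checks translate into a short Python sketch. It handles IPv4 only, the prefix standing in for your own address space is a hypothetical documentation range, and the function name is made up for illustration.

```python
import ipaddress

PRIVATE_SPACE = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]
# Hypothetical: address space allocated to you by the registry.
OWN_PREFIXES = [ipaddress.ip_network("203.0.113.0/24")]


def accept_from_peer(prefix, max_length=24):
    """Return True if an IPv4 prefix advertised by a peer passes the filter."""
    net = ipaddress.ip_network(prefix)
    if net.prefixlen > max_length:
        return False  # too specific: likely an error or an attack
    if any(net.subnet_of(private) for private in PRIVATE_SPACE):
        return False  # private address space
    if any(net.subnet_of(own) for own in OWN_PREFIXES):
        return False  # a peer must not advertise your own prefixes
    return True


print(accept_from_peer("192.0.2.0/24"))     # True
print(accept_from_peer("192.0.2.128/25"))   # False
print(accept_from_peer("10.20.0.0/16"))     # False
print(accept_from_peer("203.0.113.0/24"))   # False
```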
Trust no one…especially your customers
It sounds ridiculous, but your customers can cause you more headaches than your competitors, as some of them simply don't have the experience necessary to configure their devices in a complex (usually multi-homed) scenario. As far as possible, you should help them by protecting them from their mistakes.
For example, do not accept transit BGP routes (routes with multiple different autonomous system numbers in the AS path) or very small IP prefixes (those with a mask longer than /24 or /25) from your customers. If you want to implement even tighter control, you can also filter the IP prefixes they're announcing based on the contract you have with them.
Unless you have an extremely good reason not to, you should limit the number of IP prefixes that you're willing to accept from your customers. Routers from major vendors allow you to disconnect a BGP neighbor for a configurable amount of time (or even shut it down completely) if it sends too many prefixes.
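The transit-route check and the prefix limit can be modeled in Python as follows. This is a sketch of the logic only: the `CustomerSession` class is hypothetical, and it simply marks the session as down when the limit is exceeded instead of modeling a timed reconnect.

```python
def is_transit_route(as_path):
    """True if the AS path contains more than one distinct AS number.

    A customer prepending its own AS several times is not treated as transit.
    """
    return len(set(as_path)) > 1


class CustomerSession:
    def __init__(self, max_prefixes):
        self.max_prefixes = max_prefixes
        self.accepted = set()
        self.up = True

    def receive(self, prefix, as_path):
        """Return True if the route is accepted."""
        if not self.up or is_transit_route(as_path):
            return False
        self.accepted.add(prefix)
        if len(self.accepted) > self.max_prefixes:
            # Too many prefixes: tear the session down and drop its routes.
            self.up = False
            self.accepted.clear()
            return False
        return True


session = CustomerSession(max_prefixes=2)
print(session.receive("192.0.2.0/24", [64500]))            # True
print(session.receive("198.51.100.0/24", [64500, 64501]))  # False
print(session.receive("198.51.100.0/24", [64500, 64500]))  # True
print(session.receive("203.0.113.0/24", [64500]))          # False
print(session.up)                                          # False
```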
Malicious customers might also try to get free extra services by attaching numerous BGP communities to their routes. While you should accept BGP communities that implement self-service options (like backup link selection), you should remove those that you use to implement value-added services.
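In code, this comes down to stripping the communities you reserve for paid services from every route a customer sends you. The Python sketch below uses hypothetical community values.

```python
# Hypothetical communities used internally for value-added services.
VALUE_ADDED_COMMUNITIES = {"64500:100"}  # for example, premium QoS


def sanitize_customer_communities(communities):
    """Remove value-added communities from a customer's route.

    Everything else, including self-service communities, is kept.
    """
    return [c for c in communities if c not in VALUE_ADDED_COMMUNITIES]


received = ["64500:900", "64500:100"]  # backup-link request plus premium QoS
print(sanitize_customer_communities(received))  # ['64500:900']
```

You would then attach the value-added communities yourself, based on the contract, after the customer-supplied ones have been cleaned.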
About the Author
Ivan Pepelnjak, CCIE No. 1354, is a 25-year veteran of the networking industry. He has more than 10 years of experience in designing, installing, troubleshooting and operating large service provider and enterprise WAN and LAN networks and is currently chief technology advisor at NIL Data Communications, focusing on advanced IP-based networks and web technologies. His books include MPLS and VPN Architectures and EIGRP Network Design. Check out his blog.