BGP Best Practices

ACLs on the Internet facing interface

You should permit BGP only with known peers, to prevent malicious entities from harming your BGP process by spoofing your neighbor’s IP.

You should permit inbound traffic only to your prefixes. There’s no sense accepting traffic for networks you don’t own.

You should deny inbound traffic from bogons.
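
As a rough sketch of the three rules above, an inbound ACL might look something like the following. The interface name, the peer address 192.0.2.1, the local address 192.0.2.2 and the prefix 198.51.100.0/24 are all placeholders assumed for illustration, and the bogon entries are only a short sample rather than a complete list.

```ios
ip access-list extended acl-internet-in
 remark BGP (TCP/179) only with the known peer, in either direction
 permit tcp host 192.0.2.1 host 192.0.2.2 eq bgp
 permit tcp host 192.0.2.1 eq bgp host 192.0.2.2
 deny tcp any any eq bgp
 deny tcp any eq bgp any
 remark drop a sample of bogon sources (not a complete list)
 deny ip 10.0.0.0 0.255.255.255 any
 deny ip 172.16.0.0 0.15.255.255 any
 deny ip 192.168.0.0 0.0.255.255 any
 deny ip 127.0.0.0 0.255.255.255 any
 remark permit traffic only to our own prefix (placeholder)
 permit ip any 198.51.100.0 0.0.0.255
 deny ip any any
!
interface GigabitEthernet0/0
 description Internet-facing (hypothetical)
 ip access-group acl-internet-in in
```

The final deny also drops anything else aimed at the router itself (ICMP, for example), so add permits for whatever you rely on before applying it.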

MD5 passphrase with peers
ttl-security with peers (especially if you’re doing MD5)
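
A minimal sketch of both settings on a single eBGP neighbor, reusing the lab neighbor 10.1.128.1 (AS 65532) that appears in the examples further down. The password is a placeholder and `hops 1` assumes the peer is directly connected.

```ios
router bgp 65534
 neighbor 10.1.128.1 remote-as 65532
 ! MD5 authentication of the TCP session; must match on both ends
 neighbor 10.1.128.1 password SOME.RANDOM.KEY
 ! Only accept BGP packets that could have come from at most 1 hop away
 neighbor 10.1.128.1 ttl-security hops 1
```

Both ends need matching settings, and ttl-security cannot be combined with ebgp-multihop on the same neighbor. The TTL check is cheap, so spoofed packets from far away are discarded before the router spends CPU on MD5 verification.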

Internet good naturedness
Control outbound advertisements
Use prefix lists to ensure that you only advertise your prefixes
Control inbound traffic: http://www.bcp38.info/index.php/Main_Page
Route RFC1918 traffic to null
Use uRPF to ensure your outbound traffic isn’t spoofed
Disable NTP on the Internet-facing interface
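
The items in this list could be sketched roughly as follows. The prefix 198.51.100.0/24 and the interface names are placeholders assumed for illustration, and the neighbor is the lab peer used in the later examples.

```ios
! Only advertise our own prefix (placeholder)
ip prefix-list pfl-bgp-out seq 10 permit 198.51.100.0/24
!
router bgp 65534
 neighbor 10.1.128.1 prefix-list pfl-bgp-out out
!
! Null route RFC1918 space so it never follows a default route upstream;
! more specific internal routes still win over these
ip route 10.0.0.0 255.0.0.0 Null0
ip route 172.16.0.0 255.240.0.0 Null0
ip route 192.168.0.0 255.255.0.0 Null0
!
interface GigabitEthernet0/1
 description Inside-facing (hypothetical)
 ! Strict uRPF: drop packets whose source is not reachable via this interface
 ip verify unicast source reachable-via rx
!
interface GigabitEthernet0/0
 description Internet-facing (hypothetical)
 ! Do not accept NTP packets on this interface
 ntp disable
```

Strict uRPF is applied on the inside interface here, so packets with a bogus source are dropped before they can leave your network. It assumes routing is symmetric on that interface.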

Things I know about BGP

Full routes
Currently the IPv4 Internet is at about 515k prefixes and has been growing at about 15% each year for the last few years. Compounded, that means in 5 years the IPv4 Internet may contain more than 1 million prefixes (515k × 1.15^5 ≈ 1.04 million).

Currently the IPv6 Internet is at about 21k prefixes and has been growing at about 30% each year (on average). Compounded, that means in 5 years the IPv6 Internet may contain more than 75k prefixes (21k × 1.3^5 ≈ 78k).

Any router that receives full routes must hold them in memory, so if you want to accept full routes you have to watch your memory footprint to make sure that your platform can handle it.
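
One way to keep an eye on this, sketched under the assumption of the lab neighbor used later on and an arbitrary limit of 1,000,000 prefixes (pick a number that suits your platform):

```ios
! Exec mode: the summary output reports how much memory the BGP
! network and path entries are using
show ip bgp summary
!
! Config mode: log a warning at 90% of the limit and drop the
! session if the peer sends more than 1,000,000 prefixes
router bgp 65534
 neighbor 10.1.128.1 maximum-prefix 1000000 90
```

The maximum-prefix limit is a safety net rather than a sizing tool; it stops a peer from exhausting the router's memory if it suddenly sends far more routes than expected.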

Partial routes
This is when you accept only a fraction of the full routing table. This can be some subset of the full routes, or just a default route. Your carrier can filter routes for you or you can filter them yourself.

When you’re using a default route your carrier can advertise this to you or you can just use a static route, but at least a carrier advertised default will disappear if the BGP session goes down.
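
The two default-route options could look roughly like this. The prefix-list name is made up for illustration, and the neighbor is the lab peer used in the examples that follow.

```ios
! Option 1: accept only the carrier-advertised default route,
! which is withdrawn if the BGP session goes down
ip prefix-list pfl-default-only seq 10 permit 0.0.0.0/0
!
router bgp 65534
 neighbor 10.1.128.1 prefix-list pfl-default-only in
!
! Option 2: a static default, which stays installed as long as the
! next hop is reachable, whatever the state of the BGP session
ip route 0.0.0.0 0.0.0.0 10.1.128.1
```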

Even with partial routes you can still control inbound and outbound traffic paths for your prefixes, but the limitation is that your router cannot make best path selection on prefixes that it doesn’t know about. This means that if your upstream carrier has a problem (maybe they lose their own upstream providers?) then your own routes may not reflect this and your traffic may get dropped.

Configuring Partial routes
This configuration shows how to limit learned prefixes to those on your upstream ASN +1. That means you’ll learn routes that originate in your upstream carrier’s ASN, plus any routes of their directly connected neighboring ASNs. The regular expression matches an AS path that starts with 65533 and is followed by at most one more ASN.

ip as-path access-list 1 permit ^65533_[0-9]*$
!
router bgp 65534
neighbor 10.3.128.1 filter-list 1 in

Before
sh ip bgp | begin Network
   Network          Next Hop            Metric Weight Path
*  10.30.30.30/32   10.3.128.1                      0 65533 65531 65530
*>                  10.1.128.1                      0 65532 65530 ?

After
sh ip bgp | begin Network
   Network          Next Hop            Metric Weight Path
*> 10.30.30.30/32   10.1.128.1                      0 65532 65530 ?

Why do we want to manipulate traffic?
Sometimes you may have circuits with a cost difference so load balancing isn’t sensible. Some networks have better peering so your customers are closer over that link. Some networks just have better performance or latency.

Or it might be as simple as wanting to push traffic to another circuit for maintenance, for example if you need to reload some hardware or if your carrier has a planned outage.

Controlling outbound traffic with local preference
Local preference is a blunt instrument: as configured here, it simply alters the preference for all prefixes learned from a peer. The default local preference is 100, and the higher local preference wins. This is very useful for setting a backup peer.

route-map rm-bgp-localpref permit 10
set local-preference 500
!
router bgp 65534
neighbor 10.1.128.1 route-map rm-bgp-localpref in

Verification
sh ip bgp | begin Network
   Network          Next Hop            Metric LocPrf Path
*  10.31.31.31/32   10.3.128.1                        65533 65531 ?
*>                  10.1.128.1               0    500 65532 65531 ?

Controlling outbound traffic with weight
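Weight is a fine tool as it can be applied per-prefix (with ACLs). The default weight for learned routes is 0, and the higher weight wins. Weight is local to the router and is evaluated before local preference, which is why the weighted path wins in the verification below even though the other path has a local preference of 500. This is useful for directing particular flows of traffic over particular paths.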

ip access-list standard acl-bgp-weight
permit 10.33.255.255
!
route-map rm-bgp-weight permit 10
match ip address acl-bgp-weight
set weight 100
continue
route-map rm-bgp-weight permit 20
!
router bgp 65534
neighbor 10.3.128.1 route-map rm-bgp-weight in

Verification
sh ip bgp | begin Network
   Network          Next Hop            LocPrf Weight Path
*> 10.33.33.33/32   10.1.128.1             500      0 65532 65531 65533 ?
*                   10.3.128.1                      0 65533 ?
*  10.33.255.255/32 10.1.128.1             500      0 65532 65531 65533 ?
*>                  10.3.128.1                    100 65533 ?

Controlling inbound traffic with AS_PATH prepending

A blunt instrument: you make your advertisements look further away on one circuit compared to another. The effect is that routers that can see both paths will prefer the shorter one, encouraging traffic to use the shorter path. Even though this can be applied per-prefix, it is still a blunt tool because sometimes the prepended path is still the best one.

ip prefix-list pfl-bgp-prepend seq 10 permit 10.34.255.255/32
!
route-map rm-bgp-prepend permit 10
match ip address prefix-list pfl-bgp-prepend
set as-path prepend 65534 65534 65534 65534 65534 65534 65534 65534 65534 65534
!
router bgp 65534
neighbor 10.3.128.1 route-map rm-bgp-prepend out

There are no tools on the local router to show this, so you have to use a BGP looking glass to validate:
http://lg.peer1.net
http://lg.he.net
Note that looking glass sites far from you will probably not see your prepends, as BGP routers only share their best path with each other – so your prepended path probably won’t make it to the other side of the planet.

Controlling inbound traffic with MED (multi-exit discriminator)
A fine tool, very effective per-prefix control but it only applies when peering to a single AS with multiple circuits. This isn’t a very common configuration.

MED asks a peer to use a particular circuit for some (or all) of your advertised prefixes, and the lower metric wins. Sometimes metrics are filtered by peers, so you must work with them to make sure it is supported.

ip access-list standard acl-bgp-med
permit 10.10.10.0 0.0.0.255
!
route-map rm-bgp-med permit 10
match ip address acl-bgp-med
set metric 200
!
router bgp 65534
neighbor 192.0.2.1 route-map rm-bgp-med out

There are no tools on the local router to show this, so you have to work with your peer to validate this.

Blackhole incoming DDoS using BGP
This is a mechanism activated by the local administrator, whereby you use BGP to tell your upstream provider to null route traffic for a particular prefix of yours. This takes the DDoS target offline but saves the rest of the network, and you don’t pay for bandwidth for incoming DDoS. This must be arranged with your upstream provider.

ip route 10.34.255.255 255.255.255.255 Null0 tag 111
!
route-map rm-bgp-blackhole permit 10
match tag 111
set community 65534:666
!
router bgp 65534
redistribute static route-map rm-bgp-blackhole
neighbor 10.1.128.1 send-community
neighbor 10.3.128.1 send-community
!
ip bgp-community new-format

Verification
show ip bgp community | begin Network
   Network          Next Hop            Metric LocPrf Weight Path
*> 10.34.255.255/32 0.0.0.0                  0         32768 ?

BGP Blackhole Community

A client of mine is in the server-hosting (bandwidth selling as he calls it) business, and as such he has a lot of public IP addresses attached to servers that he doesn’t directly manage.  These servers are sometimes the focus of internet attacks that sometimes have the ability to eclipse the legitimate traffic of his entire business.  My client brought me in to advise on ways to mitigate the risks of this.

A Simple Model

The first time I spoke with this client, he had a very simple network model.

The network provider acted as the default gateway for several subnets, and all networking gear onsite was Layer 2 only.  This created some interesting failure modes, particularly in a Denial of Service event.  The scenario that brought me on board was when my client disabled the victim of an attack (I think he unplugged the network from that server) but this caused a flood of traffic to all ports on the network.  This was a natural consequence of the network design that was in place at the time; when my client unplugged that host its MAC address was forgotten by the switches (a 10 minute timer) and switches WILL flood unicast traffic to all ports when they do not know the destination interface for a particular MAC address.

To solve this problem, on the aggregation switch I created a Layer 2 ACL that would drop all traffic for this MAC address.  Once this was in place the traffic would still come across the network link to the aggregation switch, but it would be dropped there and this traffic would not be forwarded to all ports.

Moving to a Layer 3 Solution

After some deliberation with my client, and discussion of his long-term plans I recommended that he take over the routing for his own network.

In this configuration we installed two 3750 switches, and arranged with the provider to have a small routed network between the 3750s and their network.  They set up static routes for my client’s networks, and I just configured a default route back to the provider.  To provide additional availability, we stacked the 3750s, and set up LACP bundles to each of the Server Switches.  I had my client set up a server that monitored all traffic with NTOP; this allowed us to see exactly what IP was being attacked so my client could take steps to resolve the issue.

The steps at this point were to write a Layer 3 ACL to drop traffic for that IP, and contact the network provider to ask them to drop traffic for this IP for 24 hours.  The idea here was to minimize the impact of the attack on the rest of the network, because the DDoS attacks were completely saturating my client’s network connection.  I considered writing QoS templates to use in the event of an attack, but the variety of incoming attacks and the knowledge level of my client made this unworkable; the problem here wasn’t how to rate-limit the traffic, but how to identify which rate-limiting mechanism to put into place at the right moment.
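
A Layer 3 ACL of the kind described above might look something like this sketch. The address 198.51.100.25 stands in for the attacked server, and the ACL and interface names are assumptions for illustration, not the configuration that was actually used.

```ios
ip access-list extended acl-drop-victim
 remark 198.51.100.25 is a placeholder for the attacked server
 deny ip any host 198.51.100.25
 permit ip any any
!
interface GigabitEthernet1/0/1
 description Routed uplink to provider (hypothetical)
 ip access-group acl-drop-victim in
```

An ACL like this stops the traffic at the 3750s, but the attack still fills the link from the provider, which is why the provider also had to be asked to drop it upstream.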

Unfortunately the frequency and severity of the DDoS have increased.  My client’s network provider has stated they are unable to offer any additional active DDoS protection, and in fact the provider was becoming frustrated with handling my client’s requests to block incoming traffic.

Private BGP and Blackhole Communities

This led us to the current solution, which was to set up a private BGP relationship with the provider.  Here we advertise the routes that are active on my client’s network, and my client has the ability to tag particular routes with a community identifier which tells the provider to route that prefix to Null; essentially my client can enforce a blackhole route on the provider’s network without having to call the provider, and the blackhole route is effective almost immediately.  The advantage here is that the provider is not annoyed with my client, and he has much clearer control over the networks that are blocked.

router bgp 65535
no synchronization
! this sets the router ID that identifies our BGP instance
bgp router-id Y.Y.Y.Y
bgp log-neighbor-changes
redistribute connected
! we want to redistribute those static routes
! that pass our blackhole route-map
redistribute static route-map blackhole
neighbor X.X.X.X remote-as ZZZZ
neighbor X.X.X.X password SOME.RANDOM.KEY
! this ensures that BGP will send both standard
! and extended communities in its updates
neighbor X.X.X.X send-community both
no auto-summary
!
! this allows a more readable format of the AS:Community
ip bgp-community new-format
!
route-map blackhole permit 10
! if we create a route with the tag "999"
match tag 999
! we will attach an additional BGP community to that route
! this indicates to the provider that we want matched
! traffic dropped
set community ZZZZ:999 additive
!
! this is an example of how to create the route to block traffic
ip route Y.Y.Y.Z 255.255.255.255 Null0 tag 999

Here you can see that we have specified a BGP AS from the top of the private AS range (strictly, RFC 6996 defines the private range as 64512 to 65534 and 65535 itself is reserved, so a lower number is the safer choice).  With this model, all my client has to do is add a route with this tag, and it will be blocked by the provider, saving his bandwidth costs and saving the rest of his customers from a bad networking experience.

Showing the Status of the Blackhole Community

Once you have blocked a route, you can verify that it was taken up in BGP like this:

Router#show ip bgp community ZZZZ:999
BGP table version is 28, local router ID is Y.Y.Y.Y
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal, r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> Y.Y.Y.Z/32       0.0.0.0                  0         32768 ?
Router#