A client of mine is in the server-hosting business ("bandwidth selling," as he calls it), and as such he has a lot of public IP addresses attached to servers that he doesn’t directly manage. These servers are sometimes the target of internet attacks large enough to eclipse the legitimate traffic of his entire business. My client brought me in to advise on ways to mitigate this risk.
A Simple Model
The first time I spoke with this client, he had a very simple network model.
The network provider acted as the default gateway for several subnets, and all networking gear onsite was Layer 2 only. This created some interesting failure modes, particularly in a Denial of Service event. The scenario that brought me on board came about when my client disabled the victim of an attack (I think he unplugged that server from the network), which caused a flood of traffic to all ports on the network. This was a natural consequence of the network design in place at the time: when my client unplugged that host, the switches forgot its MAC address (a 10 minute timer), and switches WILL flood unicast traffic to all ports when they do not know the destination interface for a particular MAC address.
To solve this problem, I created a Layer 2 ACL on the aggregation switch that would drop all traffic for this MAC address. Once this was in place, the traffic would still come across the network link to the aggregation switch, but it would be dropped there rather than forwarded to all ports.
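The flooding behavior and the effect of the drop filter can be illustrated with a toy model. This is only a sketch under stated assumptions: the `LearningSwitch` class, the port names, and the MAC address are hypothetical, and the model covers only the aging timer described above, not everything a real switch does.

```python
class LearningSwitch:
    """Toy model of a Layer 2 switch: MAC learning, aging, and a drop list."""

    def __init__(self, ports, aging_seconds=600):
        self.ports = set(ports)
        self.aging_seconds = aging_seconds   # 10 minutes, as in the text
        self.mac_table = {}                  # MAC -> (port, last_seen)
        self.denied_dst_macs = set()         # stands in for the Layer 2 ACL

    def learn(self, src_mac, in_port, now):
        """Record which port a source MAC was last seen on."""
        self.mac_table[src_mac] = (in_port, now)

    def forward(self, dst_mac, in_port, now):
        """Return the set of ports a unicast frame is sent out of."""
        if dst_mac in self.denied_dst_macs:
            return set()                     # dropped by the filter
        entry = self.mac_table.get(dst_mac)
        if entry is not None and now - entry[1] >= self.aging_seconds:
            del self.mac_table[dst_mac]      # entry aged out
            entry = None
        if entry is None:
            return self.ports - {in_port}    # unknown unicast: flood
        return {entry[0]}


VICTIM = "00:11:22:33:44:55"  # hypothetical MAC address
switch = LearningSwitch(ports=["uplink", "srv1", "srv2", "srv3"])

switch.learn(VICTIM, "srv1", now=0)
while_known = switch.forward(VICTIM, "uplink", now=60)

# The victim is unplugged and sends nothing more, so its entry ages out.
after_aging = switch.forward(VICTIM, "uplink", now=700)

# With the drop filter in place the frame goes nowhere.
switch.denied_dst_macs.add(VICTIM)
with_filter = switch.forward(VICTIM, "uplink", now=710)
```

While the MAC is known, attack frames go only to the victim's port. Once the entry ages out they are copied to every server port, and with the filter in place they stop at the aggregation switch.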
Moving to a Layer 3 Solution
After some deliberation with my client, and discussion of his long-term plans, I recommended that he take over the routing for his own network.
In this configuration we installed two 3750 switches and arranged with the provider to have a small routed network between the 3750s and their network. They set up static routes for my client’s networks, and I just configured a default route back to the provider. To provide additional availability, we stacked the 3750s and set up LACP bundles to each of the server switches. I had my client set up a server that monitored all traffic with NTOP; this allowed us to see exactly which IP was being attacked so my client could take steps to resolve the issue.
The steps at this point were to write a Layer 3 ACL to drop traffic for that IP, and to contact the network provider to ask them to drop traffic for this IP for 24 hours. The idea here was to minimize the impact of the attack on the rest of the network, because the DDoS attacks were completely saturating my client’s network connection. I considered writing QoS templates to use in the event of an attack, but the variety of incoming attacks and the knowledge level of my client made this unworkable; the problem here wasn’t how to rate-limit the traffic, but how to identify which rate-limiting mechanism to put into place at the right moment.
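The logic of such a Layer 3 ACL can be sketched in a few lines. This is an illustration of first-match rule evaluation, not the configuration that ran on the 3750s; the function names are hypothetical and the addresses come from the documentation range 203.0.113.0/24.

```python
import ipaddress


def build_acl(blocked_hosts):
    """Ordered rules: deny each attacked host, then permit everything else."""
    rules = [("deny", ipaddress.ip_network(f"{host}/32")) for host in blocked_hosts]
    rules.append(("permit", ipaddress.ip_network("0.0.0.0/0")))
    return rules


def evaluate(rules, dst_ip):
    """Return the action of the first rule matching the destination address."""
    addr = ipaddress.ip_address(dst_ip)
    for action, network in rules:
        if addr in network:
            return action
    return "deny"  # nothing matched: implicit deny


acl = build_acl(["203.0.113.10"])        # hypothetical attacked address
victim_action = evaluate(acl, "203.0.113.10")
neighbor_action = evaluate(acl, "203.0.113.11")
```

A filter like this only protects what sits behind the device applying it. The attack traffic has already crossed the provider link by the time it is dropped, which is why the request to the provider was the second half of the procedure.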
Unfortunately, the frequency and severity of the DDoS attacks have increased. My client’s network provider stated that they were unable to offer any additional active DDoS protection, and in fact the provider was becoming frustrated with handling my client’s requests to block incoming traffic.
Private BGP and Blackhole Communities
This led us to the current solution, which was to establish a private BGP relationship with the provider. Here we advertise the routes that are active on my client’s network, but my client has the ability to tag particular routes with a community identifier, which the provider uses as a signal to route that traffic to Null; essentially, my client can enforce a blackhole route on the provider’s network without having to call the provider, and the blackhole route takes effect almost immediately. The advantage here is that the provider is not annoyed with my client, and he has much more direct control over which networks are blocked.
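The provider side of this arrangement can be modeled roughly as follows. This is a sketch under assumptions, not the provider's actual policy: the community value `64512:666`, the next-hop address, and all function names are hypothetical, and the real community is whatever the provider has agreed to honor.

```python
import ipaddress
from dataclasses import dataclass, field

# Hypothetical value; the real one is agreed with the provider.
BLACKHOLE_COMMUNITY = "64512:666"


@dataclass(frozen=True)
class Announcement:
    prefix: str
    next_hop: str
    communities: frozenset = field(default_factory=frozenset)


def provider_import(announcements):
    """Build the provider's table; tagged routes point at Null0."""
    table = {}
    for ann in announcements:
        network = ipaddress.ip_network(ann.prefix)
        if BLACKHOLE_COMMUNITY in ann.communities:
            table[network] = "Null0"         # discard at the provider edge
        else:
            table[network] = ann.next_hop
    return table


def lookup(table, dst_ip):
    """Longest-prefix match, so a tagged /32 overrides its covering /24."""
    addr = ipaddress.ip_address(dst_ip)
    matches = [network for network in table if addr in network]
    if not matches:
        return None
    return table[max(matches, key=lambda network: network.prefixlen)]


table = provider_import([
    Announcement("203.0.113.0/24", "192.0.2.2"),
    Announcement("203.0.113.10/32", "192.0.2.2", frozenset({BLACKHOLE_COMMUNITY})),
])
victim_next_hop = lookup(table, "203.0.113.10")
neighbor_next_hop = lookup(table, "203.0.113.11")
```

Because the more specific tagged route wins the lookup, only the attacked address is discarded, and the rest of the /24 keeps flowing to my client's network.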
Here you can see that we have specified a BGP AS in the private AS range (64512 to 65534 for 16-bit AS numbers, per RFC 6996). With this model, all my client has to do is add a route with this tag, and the provider will block it, which saves his bandwidth costs and spares the rest of his customers a bad networking experience.
Showing the Status of the Blackhole Community
Once you have blocked a route, you can verify that it was taken up in BGP like this: