Hardware and Software for the Network Consultant

I work from home as the owner of a corporation that performs consulting work for other companies. I am the sole consultant in this corporation, so my organizational needs are relatively small, but they’re not trivial.

If I didn’t have the software that I use I wouldn’t be able to function. So here is what I use on a daily basis.

Google Hosted Applications

This is how I roll. Hosted Gmail, hosted Calendar, hosted Documents. Those are the three apps that keep me going. To get Google hosted apps, you have to have your own domain.

Getting Your Own Domain Name

I registered my domain with http://10dollar.ca; you can register .ca domains for $10 a year.

It isn’t enough just to have your domain registered; it has to be hosted somewhere. I use a free DNS hosting service called http://zoneedit.com. These guys make their money when you pay for more servers hosting your domain, but it is free for small operations like mine. The interface is a little clunky and Internet 1.0, but it works and that’s all I ask of it.

Maybe you’re enough of a server geek to manage your own DNS — I could do this, but I know that I can’t do it with any guarantee of High Availability because I don’t have dual-internet connections, multiple servers, redundant power sources, redundant cooling systems, and on and on.

Hosting a Website

I felt like I had to represent my business on the Internet, with my own website: http://wozney.ca. It isn’t much but it tells people who I am, and how to find me. I use http://godaddy.com to host my site — they’re relatively inexpensive and they aren’t too complicated to use.

Managing the Finances

This one is difficult, because you have to keep working at it and it doesn’t actually generate any revenue. It is in so many ways just another way to waste your time. Nonetheless it is a critical part of running your business, so just keep reminding yourself that if you don’t do the books you can’t be out there making money.

I use Quickbooks Easystart to manage the books. My accountant says it is more than I need, and he is probably right but I’ve become accustomed to how it works so I’m just going to keep using it.

Have a good accountant. You have to have one — so make sure you get one that you can get along with. I thought when I started this that I just needed someone to push the pencil, but I realized that I only spend a few hours a year with this person, so it should be a worthwhile few hours. Your accountant should ask lots of questions about how you operate and what you want to get out of your business. My accountant performed my year-end report for just under $1k. Some charge more, but as far as I can tell nobody charges less — and I like this guy!

Home Networking

I run a lot of network hardware here. I have an ASA firewall, an OpenWRT firewall, two unmanaged switches, a Cisco 2960 8 port GigE switch, a NAS, a linux server, a laser printer, an MFP, two APs, and a Cisco CME running right now. The two devices that really save the day for me are the ASA and the NAS.

ASA Firewall

I use the ASA firewall as you might expect, but the ability to VPN into my own network with a client is a life-saver. I have also configured the webvpn functionality, which lets me bounce through the ASA to other remote sites — handy when my clients allow access from my office IP only.

I can also access files on my NAS, which is useful if I’m at someone else’s computer (eg: if my laptop is being repaired under warranty — like right now). I use the ASA’s group-policies and ACLs to allow my clients access to their own files on my NAS; they can’t see each other’s files, but they can download and upload their own.
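
Here is a rough sketch of the kind of thing I mean; the client name, the NAS address and the port are made up for illustration, and your ASA software version may want slightly different syntax:

    ! Hypothetical example: restrict "Client A" remote-access users to the CIFS
    ! share on the NAS (192.168.1.50, port 445 are illustrative values)
    access-list CLIENTA_FILES extended permit tcp any host 192.168.1.50 eq 445
    !
    group-policy CLIENTA_POLICY internal
    group-policy CLIENTA_POLICY attributes
     vpn-filter value CLIENTA_FILES
    !
    tunnel-group CLIENTA type remote-access
    tunnel-group CLIENTA general-attributes
     default-group-policy CLIENTA_POLICY

A separate group-policy and filter ACL per client is what keeps them from seeing each other’s files.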

I can also use the ASA webvpn to tunnel applications through to other hosts; I can tunnel ssh to my linux server, and tunnel RDP to a remote client. And I can do this without bringing up my VPN client — very handy.

NAS

The NAS is essential as it provides a secondary repository for my business data. I usually work from my laptop, but I use a program called Unison to keep my local files in sync with the NAS. This way if I lose my laptop (has not happened yet) or a hard drive (has happened before) I haven’t lost more than a day or two’s worth of data.

My Work Bag

My Laptop

This is where I work. My physical location can be anywhere but I work on my laptop. I run a flavour of Linux called Ubuntu 8.04. I’m not an Ubuntu fanboy, but it is a solid OS and there is a lot of support for it. When it comes to Linux distros for the desktop I follow the crowd.

  • I use PuTTY extensively, but you can use any terminal application that can do serial, telnet and ssh, and save sessions.
  • I use TrueCrypt to encrypt my work data on my laptop. That way if the machine is stolen then none of my customers are at risk of being compromised.
  • I use VMWare Server to run a WinXP VM. This allows me to use software that (right now) has no suitable Linux replacement.
  • I use Wireshark to sniff network data.
  • I use Firefox, and IES4Linux for the occasional page that won’t work in anything but Internet Explorer.

Other Stuff I Carry

  • 10 foot CAT6
  • 3 foot cross-over
  • Cisco console cable
  • 9pin to 9pin console cable
  • Anti-static wrist band
  • Coil notebook
  • Business cards
  • Laptop cable lock
  • Bluetooth Mouse
  • USB memory sticks

Good luck!

High Availability — LAN — NIC Bundling

The parent article on High Availability.

Switching on a LAN provides some of the most basic network connectivity options, and these are often overlooked. Most switches (Cisco, HP, Dell and others) support these configurations, but one thing I can guarantee is that you will find limitations on pretty much every platform. If you’re after interoperability, do your testing so you understand these limitations.

Bundle Network Links

We want to bundle network links for two reasons: to aggregate bandwidth (two links give twice the packet-passing capacity) and for failover (if one link fails the second is still running).

LACP

I discussed LACP in an earlier article, but I would like to go into a little more detail here. Make sure you review Cisco’s documentation on configuring LACP, and the Wikipedia article on link aggregation.

In my experience, I find LACP to be the best solution for link aggregation. It is a common protocol so interoperability between devices is almost always possible and the configuration is sensible enough that you can explain it to a lay-person.

In the example above, we have bundled two physical links into a single logical link between two switches.

LACP Virtual Adaptors

When we bundle network links with LACP, each host creates a virtual adaptor that represents the bundle. For example, on a Cisco switch we can create an interface called portchannel 1, that represents the two interfaces fastethernet0/1 and fastethernet0/2.

In this case, instead of making changes to (or examining the configuration of) each physical interface, we can work with portchannel 1. You can still work with the physical interfaces, but you must take care that all parameters match on every physical interface in the bundle.
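
On Cisco gear the configuration looks roughly like this (the interface and channel numbers are just for illustration, and older IOS versions may differ slightly):

    ! Put two physical ports into a bundle, negotiating with LACP (mode "active")
    interface range fastethernet0/1 - 2
     channel-group 1 mode active
    !
    ! From here on, changes are made on the logical interface
    interface port-channel 1
     description Uplink to second switch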

LACP Load-balancing Flows

LACP is flow-aware, and it can be configured to load-balance based on MAC address or IP address; the default in Cisco switches is to load-balance based on MAC address.

Load-balancing only really works when the system is able to identify many unique flows; as each flow is established it is put on one of the bundled links and all subsequent traffic also follows that physical link.

Be aware that load-balancing based on MAC address (the default behaviour) may not be what you want — if your traffic crosses a router, the original source MAC address will be obscured.

In a routed environment (if you’re using VLANs) you will find that any traffic that crosses a routing boundary will have its source MAC address replaced by the router MAC address. This can make many hosts appear (to LACP) as if they’re coming from a single MAC address and will definitely skew the load-balancing calculations. A better approach is to use IP based LACP load-balancing, as each host will likely have a unique IP address.

A good rule of thumb is to use MAC address load-balancing if you’ve got a flat Layer 2 network. It is easier for the switch to identify MAC addresses and in a network like this, every flow should be coming from a unique MAC address.
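
On a Cisco switch the hashing method is a global setting. Something like the following should switch it to IP-based balancing, but check your platform first; the available options differ between models:

    ! Check the current hashing method, then change it to source+destination IP
    show etherchannel load-balance
    configure terminal
    port-channel load-balance src-dst-ip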

Server Based Failover and Load Balancing

Some NIC manufacturers have provided software to accommodate NIC failover and load-balancing without using LACP. See HP’s document describing the options they offer.

These configurations do work, however they are complex (in terms of traffic flow) and are therefore harder to troubleshoot in the event of network problems. Use LACP where possible, and these server based methods where necessary.

Communications

As a network consultant, you could say my specialty is in communications.

But it isn’t just protocols and passing bits; it might seem obvious to you, but network consultants actually have to communicate with other humans.

So what do you do if you have problems communicating? I can’t claim my systems will work for everyone, but here’s what I’ve got. Maybe it will work for you!

Time Management

  • you are late for a meeting
  • you are very late for a meeting

Everyone knows that bad stuff happens. Maybe your car broke down, or there is a traffic jam. A quick fix is to send your customer an email or a text message, or even call them — and give an accurate estimate of how late you will be, or even reschedule the appointment.

Create a time buffer to accommodate bad stuff happening. If you’re meeting a client down the street for coffee, and it is a 5 minute walk give yourself ten minutes. It is always better to meet your clients calm and relaxed, rather than out of breath and sweaty because you ran the whole way.

So many misunderstandings can be avoided if you set expectations ahead of time. If you’re driving 4 hours to a client site in the mountains and there is only minimal cellular service along the way, give yourself 6 hours and make sure your client understands your situation.

Task Management

If you’re anything like me, you have hundreds of things on the go at any moment. I’m constantly forgetting to do things, but I try really hard to get everything done. It is not easy.

  • you forget to send emails describing what is completed
  • you forget to send emails describing what remains to be done

Depending on the client, I have different approaches. Sometimes I share a google spreadsheet with a client that has a list of tasks and their latest status. Other clients want more detail, or a higher frequency of updates — for these clients I send out daily/weekly/whatever updates with a list of tasks and their latest status.

The thing to take away from this is that your clients need to know what is going on. It isn’t hard to give this to them.

  • you forget to complete critical tasks

I don’t do this much anymore now that I keep track of my tasks, but you can imagine the conversations I would have to sit through, and the holes I had to dig myself out of.

  • you complete non-critical tasks before critical tasks

I recommend that you let your client determine the priority of tasks. You can always give them advice, but just because it is easy to knock off 100 little items on a task list before tackling the 1 big one doesn’t mean that is what your client wants.

Conclusion

I guess the only other thing I can suggest is honesty, plain and simple. If you mess something up, be the first person to put up your hand and say “I forgot to do that”, and then fix it.

This is almost an exercise in how to be a good person. You don’t have to be honest all the time, but if you’re honest about project updates and about the successes and failures that you had, your clients will love you for it. We’re all human, and we all make mistakes — and usually mistakes are cheaper to fix if they haven’t been brushed under the carpet for a long time.

High Availability

To provide highly available networks, we need two things: standby hardware and good network design principles. No amount of duplicate hardware will help you if the network doesn’t recover from a failure and you can’t get access to the wiring closet to bring up a cold spare; just the same, good network design cannot account for hardware failures or accidental cable disconnections unless you have duplicate hardware waiting in the wings for just such an occasion.

What we look for from good network design principles are networks that automatically detect problems, and recover from those faults as rapidly as possible. Most of us consider human intervention to mitigate a fault inelegant, although for now we humans are still required to actually fix these faults, even though our networks can work around them.

This series of articles is going to look at High Availability in five ways:

  • Local switching
    NIC bundling, Spanning Tree Protocol and LAN Best Practices
  • Local routing
  • Campus (between local buildings)
  • Wide (between cities)
  • Internet

Each of these approaches is related to the others, but for the most part each can be deployed without consideration for the others.

Before we get going, I’ll go over some key terms that are going to figure prominently in these articles.

Failure Mode

The failure mode is the state to which the system (which can include firewalls, routers, switches, servers and client hosts) falls back in the event of a network problem. This is probably the most important part of how we design networks, because we have to consider the state of things in the worst-case scenario.

It might be acceptable to your client if the network is still usable, only with reduced performance; or your client may require 100% performance 100% of the time; or your client may not want to pay any extra money at all, in which case you can define the processes for manual intervention to resolve network failures.

Over-subscription

The idea behind over-subscription is that not everyone will be using a resource at the same time. Instead of just adding up the requirements of every user and building a network from that, a network engineer can assume some fluctuation in demand and design a system that meets the requirements of the user-base as a whole.

Cold Spare

A cold spare is a duplicate device that is not powered up and sits on a shelf. An administrator must install the device, and possibly configure it to replace a failed unit. As long as the spare device is appropriately chosen, it may act as a spare for many other devices.

Warm Spare

A warm spare is a device that is powered up and configured, and while not currently in use it is ready to come into use as soon as a failure mode is detected. This should not require human intervention, although performance may not be optimal so an administrator would have to review the fault and repair it.

Hot Spare

A hot spare is a device that is powered up, configured, and currently in use. An optimal design will see the hot-spare not exceed 50% capacity, so in a failure mode the device will only be asked to handle its maximum load. This should not require human intervention, although performance may not be optimal so an administrator would have to review the fault and repair it.

Redundant Hardware

I don’t like the word redundant in this context, because it has a negative connotation and it isn’t exactly appropriate.

One dictionary definition of redundant is “Exceeding what is natural or necessary”, so you could say that redundant hardware is more hardware than is necessary at a minimum for the network to run. That said, the bare minimum probably does not meet the availability needs of your client, so any extra hardware is not actually redundant, but necessary.

The phrase “redundant hardware” isn’t appropriate in some cases, because often we design duplicated, fault-tolerant hardware in which both devices are in use all the time — so neither device in that case is redundant.

So what is more appropriate? I’ve been using the phrase “duplicate hardware”, and “fault-tolerant network designs” but really, there isn’t any other suitable word. Until someone coins a phrase that makes more sense, we can continue with this one — but make sure your clients understand what they’re buying into.

Balancing work with regular life

One of the challenges of being an independent consultant is that I have to manage my own time. To some people this sounds like it would be a dream come true, and I have to admit I sure do like it — but it is more work than it sounds.

I am personally responsible for arranging my own day. I work from my home office so nobody will notice if I’m still in my pajamas by noon, or indeed if I decide to ride my bike after 1pm. There’s no concept of surreptitiously leaving the office under some pretense; I don’t need to use pretenses because I define the hours I want to work.

All this power causes a problem — there is too much to do. Home videos to edit, tax and business paperwork, ordering parts for an aging motorcycle, and if I’m lucky some billable hours in there. Sometimes I even wash dishes, vacuum the apartment or do laundry. Why the heck am I doing laundry when I’m supposed to be working?

The flexibility in hours is a fantastic gift, but if I abuse it I won’t make any money in the week and I might as well be in Tofino taking surf lessons, or snowboarding while there’s nobody else on the mountain, or even just walking around town enjoying the sun. Sometimes I do just that, and it is fantastic!

I have some systems to help me deal with this general malaise that comes from working at home and managing my own time:

  • I plan my week ahead of time, and put project slots in my calendar. As it is largely for my own planning it is very flexible, so if something comes up I can always move something else around.
  • I make appointments with clients. Some of my clients are flexible with deadlines on the projects I work on, but I always let them know when I’m going to be in their office — that way I’m compelled to show up on time.
  • I sometimes sit in a coffee shop and work for a couple hours — but only if it isn’t dealing with actual papers (maybe network drawings) or lots of phone conferencing. Just getting out of the house, and seeing some humans during the day helps clear my mind.
  • Lastly, I remind myself that this is in fact my dream job. This is what I’ve been striving to achieve for my entire career, and I should make the best of it.

Paul

Transformations of Networking — Part 5 — Wireless

  1. Transformations of Networking — Part 1
  2. Transformations of Networking — Part 2
  3. Transformations of Networking — Part 3
  4. Transformations of Networking — Part 4
  5. Transformations of Networking — Part 5 (this article)

One thing that may seem missing from this chronology of networking is wireless. Wireless networking has been around for many years in the form of WAN connectivity (which I have completely side-stepped for this series), but it was in 1997 that the first IEEE 802.11 standard was ratified.

Lucky for you, the Wikipedia 802.11 page has a solid description of the growth of the WLAN protocols so I don’t feel compelled to repeat it. What I would like to talk about is how the designs of wireless networks have changed over the years.

Wireless Routers

The first time wireless really came into the public consciousness was with wireless routers. These devices are commonly used as a single solution for a firewall, an Internet router (usually providing NAT for IP masquerade) and a Layer 2 access point.

These devices allowed administrators to create one-off wireless LANs. These wireless routers offered a variety of security options, and some of the more advanced systems allowed administrators to control where wireless clients can go — allowing the administrator to create a WLAN for visitors without opening up the LAN to a non-employee.

Autonomous Layer 2 APs

For the most part, APs have been autonomous. This term really only makes sense when you consider the alternative, which is in the next section. Suffice it to say that what we consider autonomous is an Access Point that is managed by an administrator directly, and that applies security policies locally. This approach works very well for small-scale operations, and we’re going to see these types of devices for a long time.

That said, administration of a large number of autonomous APs can become tedious. The largest AP deployment I’ve seen is 100 APs (for a school division with 10 sites in BC), so you can imagine what universities and colleges are planning right now. I don’t even want to think of the work involved in updating a WEP or WPA key for 100 APs, or 500, or 1000. How about adding an SSID? Or managing a security policy? My favorite job is doing meaningful work; I don’t get any satisfaction from tedious work, and thankfully neither does Cisco.

Cisco’s initial solution to this problem was their CiscoWorks WLSE product, which provided a central location to manage many Autonomous APs. The software would upload configuration changes to remote APs as required.

Unfortunately WLSE was fairly complex to set up so it wasn’t widely adopted; and now it is to be relegated to a footnote as the next evolution of wireless networking eclipses this problem.

Controller Based Layer 2 APs

Cisco. What a great company — what they don’t invent, they buy. In 2001, Airespace was founded and they developed a controller based system for managing APs. This wasn’t WLSE in any way.

The so-called light-weight APs are essentially stripped of most of their responsibilities, down to two essential tasks: advertising the required SSIDs, and tunneling the user traffic to a central controller. This tunneling approach (over a protocol called LWAPP) meant that the local infrastructure became irrelevant — the APs could sit on the LAN and advertise an SSID that had no place on the LAN (like a contractor WLAN, or a student WLAN) without compromising security.

There are lots of other really good reasons to centralize in this way. Cisco plots out a whole bunch in their white paper: The Benefits of Centralization in Wireless LANs

I’ve done two big controller based deployments, and I can tell you that we can get a big bunch of APs online very, very quickly this way.

I’ll take this opportunity to note that Cisco isn’t the only player in the wireless centralization space. The IETF is working on a standard called CAPWAP that is based on LWAPP but isn’t proprietary to Cisco. A quick look shows that there are participating members from some big networking companies (but not Cisco), so who knows where this will lead. Sometimes in networking the IETF-defined standards sit unused, while a de facto standard is widely deployed.

Wireless Mesh Topology

Just a quick note on what is possible here. I haven’t done much with wireless mesh, but I have seen it in use. The primary purpose of a wireless mesh is to extend a wireless network without using any wires. Seems straightforward enough? There is a lot going on under the hood, and as with any wireless deployment you’re going to need a good site-survey to make this happen.

A wireless mesh topology is essentially a way of back-hauling data wirelessly, between APs, where each AP can either be advertising SSIDs for users, or be attached physically to the LAN for the final leg to network services or other network clients.

Naturally Cisco has a white-paper on this: Cisco Enterprise Wireless Mesh Solution Overview

Radio Considerations

Site Survey

Even for a small deployment, a wireless site-survey is important. For a large deployment, it is critical to know the weak spots so they can be addressed with either more wireless power, or with another AP to patch the hole. At the very least, your client should understand the limitations of the design you’ve specified.

Channel Overlap

We see wireless networks overlapping all the time. In a controlled environment with lots of APs providing coverage to an area, it is important to choose channels with minimal overlap. Overlap happens because even though there appear to be 11 (up to 14 outside of North America) available channels, each channel’s transmission spills well beyond its nominal frequency. As a result, there are only three non-overlapping channels (1, 6 and 11) in the 2.4GHz (802.11b/g) range.

Cisco has a concise document that describes handling channel overlap: Channel Deployment Issues for 2.4-GHz 802.11 WLANs

Here is a link from Wikipedia describing the available WLAN channels for each range: List of WLAN channels

You can see that while 802.11b/g has a limited number of available channels (with consequences for signal overlap), 802.11a on the other hand has many more available channels. In a wirelessly saturated environment, you might have better luck with this channel range, even taking into account its shorter range and problematic structural penetration rate (it doesn’t go through walls as well).

One last note is for 802.11n — which, while it is still being worked on by the IEEE, has already reached some penetration into the market. 802.11n should be very useful, as it will use both frequency ranges (2.4GHz and 5GHz) — minimizing the problems with channel overlap.

Wireless Best Practices

One of the most important things to consider in WLANs is security. Wireless networks penetrate walls and spread outside of your business, and access to your network can be gained without a physical connection to the hardware in the LAN. These are, indeed, the benefits of a WLAN, but without careful management they can also be its undoing.

If you’re deploying a wireless network (even a small one) it pays to do things right. Don’t rely on your neighbors’ good nature not to tap into your systems; even if your neighbors are honest, their employees or contractors may not be, and you certainly can’t count on the honesty of someone driving around looking for available networks. See Wikipedia’s Wardriving.

Cisco:
Five Steps to Securing Your Wireless LAN and Preventing Wireless Threats

Microsoft:
Wireless Deployment Recommendations and Best Practices

Summary and Implications

So how does all of this affect your network? It depends on your client.

Lightweight controller based solutions have a higher per-AP cost, so you have to be able to justify this additional cost. It doesn’t usually make sense to do a controller based AP deployment for fewer than 6 APs, but Cisco has released a small business range of APs with only 802.11g radios (so no a, b or n) that can be autonomous or controller based.

Cisco 521 Wireless Express Access Point

Cisco 526 Wireless Express Mobility Controller

Whether you go for a controller based architecture or autonomous APs, the key element to keep in mind is WLAN security.

Notes about writing a technical blog

So here I am, 5 months after kicking off this blog. I have to say that I have a renewed respect for the people who are able to consistently generate useful content day after day. Wow.

Writing the 5 part series Transformations of Networking was a lot more work than I expected. The idea came to me when I considered how much the hub, the switch and the layer 3 switch changed networking — and how much these technologies still resemble each other. But I constantly worried about the data that I left out, and the feel of the series changed toward the end, as the topics went from networking technology that is long gone to stuff that I’m working on right now. So instead of a history of networking, it blended into today’s networking and the things we’re going to continue doing into the future.

Another challenge is that I’m not seeking to be funny, or sarcastic, or scathing. I’m just trying to provide something that will be useful to a new network engineer, and maybe even my peers. So that meant that I had to seek out and find the information that was useful, without rehashing what is already available — this is why you’ll see many references to sites that I consider authoritative, and sites that should be unbiased enough to give a fair understanding of the networking world.

Nonetheless, there are lots of things to write about so I’m going to go on, and using these articles as a base I can go forward and write about fun stuff like high-availability, IP telephony, network security and on and on. My plan is to release an article every week (technical one week and something more personal the next) on Thursday mornings. So keep watching and waiting.

Lastly, I sure would like some commentary, so this doesn’t seem so much like a one-way conversation — and I can tell from Google Analytics that you’re visiting. The site should be able to handle your questions so if you think I’ve gone wrong somewhere, or if you need clarification on something then by all means let me know.

Paul

Transformations of Networking — Part 4

  1. Transformations of Networking — Part 1
  2. Transformations of Networking — Part 2
  3. Transformations of Networking — Part 3
  4. Transformations of Networking — Part 4 (this article)
  5. Transformations of Networking — Part 5

The switch, along with the protocols STP (802.1D), LACP (802.3ad), and VLAN support (802.1Q), changed the face of networking. Network administrators were able to design flexible, fault-tolerant networks that could repair network faults automatically. All of this snowballed into the year 2000 (and beyond) as end users became more and more connected to the Internet, and the economy started to depend more and more on computer systems — and, indirectly, on the networks that supported those computer systems.

VLANs — 802.1Q

VLANs allowed companies to split up large networks into smaller, more manageable networks. A switch configured with more than one VLAN essentially creates virtual networks within itself.

Some companies would create VLANs based on physical location; for example a company whose offices cover a large but localized area, like a University or a mining company might choose to create a VLAN for each building, or each floor of a building. Other companies would seek to create some logical separation between political groups in the organization, with Human Resources having their own network and servers, separate from the Geology network and servers.

The fascinating aspect of VLANs was that it was possible to have geologists and Human Resources workers in the same building/floor/room, connected to the same switch but on totally separate networks.

Problems and Limitations with VLANs

Security

Some organizations looked to VLAN technology to reduce the amount of network hardware in the datacenter.

For example, consider an organization with a core switch, a DMZ switch and an Internet switch. Each switch was only partially utilized in terms of ports and internal bandwidth, so the temptation was to use a single switch, with a VLAN for each purpose, and thereby save on costs, as below.

The problem here is below the surface, and while I can’t go into much detail here, suffice it to say that VLANs alone cannot provide perfect, secure separation of data between networks. There are methods to configure secure networks in this way however there is always a risk of data leakage due to an administrative error.

For those interested, the proper security method uses VLAN ACLs, here defined by Cisco.

Routing

The second problem with VLANs is that these virtual networks cannot directly communicate with each other — traffic between VLANs must be routed by a device (or devices) with access to each VLAN.

Routers, firewalls and servers served this purpose during the early days of VLANs, however as network usage grew and server consolidation became a buzzword in IT, network administrators found themselves up against the limits of the routing hardware.

A router, firewall or server can route only in software. These days such routers can be quite fast, but in networks where servers are pushing the limits of gigabit interfaces, and quite possibly beyond, it becomes very expensive to process IP routing in software.

The solution came with the Layer 3 Switch.

Layer 3 Switching

The Layer 3 switch is essentially a router whose routing functions have been encoded into hardware — so IP routing can be handled on ASICs (Application Specific Integrated Circuit) very quickly, compared to routers, firewalls or servers whose routing functions still happen in software.

This comparison between L3 switching and traditional software routing shows an immense difference in routing speed; a software router might be able to route around 50-100mbps (potentially higher, depending on the hardware), whereas the Layer 3 switch is able to route at line rate (1gbps per-port).

These fast routing speeds are obtained by offloading the IP routing tasks to an ASIC. There are limitations with the ASIC, in particular that it is not a general purpose CPU — for example a 3750 is not able to perform NAT/PAT at any reasonable rate, and there are other limitations. But if you’re looking strictly for high-speed routing between VLANs, it is hard to beat. I’ve seen configurations where network administrators use a 3750 for inter-VLAN routing, and a Cisco 801 router for NAT/PAT and firewall purposes.
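
On something like a 3750, inter-VLAN routing boils down to turning on the routing function and giving each VLAN an SVI. The VLAN numbers and addresses below are only examples:

    ! Enable the routing function, then give each VLAN a routed interface (SVI)
    ip routing
    !
    interface vlan 10
     ip address 10.10.10.1 255.255.255.0
    !
    interface vlan 20
     ip address 10.10.20.1 255.255.255.0

Hosts in each VLAN then point their default gateway at the SVI address for their own VLAN, and the ASIC takes care of the routing between them.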

Other switches such as the Cisco 6500 series are capable of both high speed L3 switching and processing complex, general routing calculations such as NAT/PAT very quickly.

Summary and Implications

Layer 3 switching allows network administrators to create complex, fault-tolerant, high-speed networks. A multi-VLAN environment allows network administrators to define the network in a comprehensible way, and provides the framework to deliver different services on the network, including data, voice, wireless, video and many more.

Above is a simple example of a campus network configuration, but you can imagine that each edge switch is not limited to a single VLAN. In fact, when I build networks I always tell my clients to leave room for growth in VLANs (and therefore in IP allocation) — just because they’re not using IP telephony or wireless today doesn’t mean it won’t come tomorrow.

Transformations of Networking — Part 3

  1. Transformations of Networking — Part 1
  2. Transformations of Networking — Part 2
  3. Transformations of Networking — Part 3 (this article)
  4. Transformations of Networking — Part 4
  5. Transformations of Networking — Part 5

The Switch

According to Wikipedia, the first switch was designed by Kalpana in 1989. By 1994 the company was purchased by Cisco Systems, and we all know who Cisco is now.

Network switching went through a few evolutions in the ten years from 1990 to 2000; additional functionality for STP (spanning-tree), link aggregation and VLANs are the ones that I’ll touch on. These are largely incremental technical innovations that are useful on their own, but not groundbreaking. What is really useful is that putting these protocols together allowed network administrators to build flexible, fault-tolerant, stable networks.

The problem with broadcast traffic

Even though switches had reduced the network overhead by sending network traffic only down those ports which require it, there was another type of traffic getting ready to rear its ugly head. As with many things, a network configuration that works well on a small scale often falls apart as networks get larger and larger.

A broadcast is a packet designed to be sent to every host on the network. A unicast packet is designed to be sent to one host only.

Broadcast traffic is by definition destined for all hosts on a network, so when a switch receives a broadcast packet, it floods every other port with these packets and every host on the network must examine the incoming packet, and determine if it wishes to act upon it, or throw the packet away.

The problem here is with scalability. With fewer than 200 hosts on a network, broadcasts are usually not a problem, but as networks get larger and larger the sheer volume of broadcast traffic can actually reduce total network bandwidth as the switch fills server uplinks, router and firewall uplinks, switch uplinks and even client PC interfaces. In addition, processing broadcast traffic requires CPU attention from client PCs and servers, and this can place a high (and hard to troubleshoot) load on these devices.

Common but Inefficient Approaches to Scaling IP Networks

When a site runs out of IP addresses, sometimes a network administrator will simply create a larger IP range:

  • Original: 10.10.10.0 255.255.255.0 yields 254 usable addresses
  • Larger: 10.10.10.0 255.255.254.0 yields 510 usable addresses
  • Largest: 10.10.10.0 255.255.252.0 yields 1022 usable addresses

(The usable count is 2^(32 − prefix length) − 2: a 255.255.254.0 mask is a /23, so 2^9 − 2 = 510.)

If a site actually filled up one of these larger IP spaces, there could be a huge amount of broadcast traffic bumping around the network.

Sometimes network administrators will just overlay IP ranges on top of each other within the same LAN:

  • Original: 10.10.10.0 255.255.255.0 yields 254 usable addresses
  • Secondary: 192.168.10.0 255.255.255.0 yields 254 usable addresses

The problem with both cases is that every host on the network still receives the broadcast traffic, and this traffic is forwarded across network trunks, to servers and to end clients.

A (much) Better Approach to Scaling IP Networks

It is much better to create VLANs, as this keeps the broadcasts trapped within each VLAN and reduces the overall load on the network. VLANs always require a router to transfer traffic between them; this can sometimes be done on the switch itself (if it is a Layer 3 switch), or through a router or server.

VLANs — 802.1Q

A VLAN, or Virtual LAN, on a switch keeps hosts in one VLAN from seeing data traffic (generally broadcast traffic) from other VLANs. This provided a huge advantage in security, and in reducing the overall effect of broadcast traffic. At about this time, networks consisting of hubs and switches alone were experiencing slowdowns due to large broadcast domains.

Once switches were able to identify a port as being part of a particular VLAN, network administrators wanted to be able to send a VLAN to another switch which would have a corresponding port on that same VLAN.

To do this, the 802.1Q protocol was devised, which adds a tag (a small header identifying the VLAN) to traffic between switches when more than one VLAN is put onto a single physical interface. Thus the concept of VLAN tagging was born, and network administrators were able to create VLAN trunks — regular interfaces configured to carry tagged traffic from different VLANs.

See Cisco’s guide on configuring 802.1Q trunks.
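
As a rough example on a Cisco switch (the port number and VLAN IDs are illustrative, and some platforms also want a 'switchport trunk encapsulation dot1q' line first):

    ! Carry VLANs 10 and 20 across one physical uplink as an 802.1Q trunk
    interface gigabitethernet0/24
     switchport mode trunk
     switchport trunk allowed vlan 10,20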

The ability to send VLANs across switches, sometimes across campuses, gave network administrators the flexibility to create VLANs based on whatever distinction fit the business best — not only physical location, but other organizational divisions such as departments or security zones.

Spanning Tree Protocol — 802.1D

For the first time, STP allowed network administrators to design networks with fault-tolerance in them.

In the diagram above, with three switches (potentially even in the same wiring closet, but not necessarily) it is possible to provide some level of fault-tolerance to clients. The diagram describes a situation where one of the links has failed; STP detects the failed link, and opens up a previously blocked port to re-establish communication.

This scenario can happen if a cable actually breaks, a fiber transceiver fails, or someone accidentally unplugs the wrong port. It happens.

This diagram describes a total switch failure; for example if the root switch fails in this scenario, STP will detect the lost connection and open that previously blocked port so at least the two secondary switches will maintain communication. A good network administrator could anticipate likely failure modes, and place a backup server on one of these other switches to ensure that network clients always have services. I’ll talk more in-depth on this in another article.

Problems and Limitations of STP

Convergence Time

First, if a switch detects a network change (neighboring switch failure, cable failure) it goes into ‘learning mode’, where the switch will not pass traffic on any ports. All the other switches in the LAN will also go into learning mode, forcing a total network failure — although temporary, it is very inconvenient as STP convergence can take over a minute.

To combat this slow convergence time, the IEEE created RSTP (Rapid STP) which can (when properly configured) converge in under a second.

Learning mode and end-user inconvenience

Secondly, whenever a port comes online STP puts that individual port into learning mode. This can sometimes cause end-clients to fail at DHCP, or at least take a very long time (over a minute). The real problem is that this test is largely inappropriate — what are the chances of detecting a loop at an end user’s port?

To ensure that end-users are not inconvenienced by what is largely an inappropriate test, Cisco devised a solution. portfast configured on an interface tells STP to immediately place that port into forwarding mode — effectively bypassing the learning phase of STP. However, as it is still possible (and it has happened) for an end user to create a loop, Cisco provided a second configuration parameter to go along with it: bpduguard. Bpduguard is a very simple feature; it watches for BPDU frames and, if it sees one, it shuts down the port immediately. That way if an end user inadvertently creates a loop, it doesn’t affect the rest of the network.

For more info on portfast and bpduguard, see the Cisco documentation here.
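
For an access port facing an end user, the configuration amounts to a couple of lines; the port number is illustrative:

    ! Access port facing an end user: skip the learning delay, but shut the port
    ! down immediately if a BPDU (i.e. another switch or a loop) shows up
    interface fastethernet0/5
     spanning-tree portfast
     spanning-tree bpduguard enable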

Link Aggregation — 802.3ad

The Wikipedia Link Aggregation page

Link aggregation allows network administrators to bundle multiple network connections into a single, logical, virtual network interface. This approach can give network trunks greater bandwidth, and higher availability as there are more links to accidentally trip over or cut with scissors. It happens.

Link aggregation works by identifying a traffic flow, most commonly by IP address. Each device will remember which physical interface a particular flow is assigned to, and will send traffic down that interface. In this way, traffic between two hosts doesn’t benefit from bandwidth increases, but many hosts to a server, or many hosts to many hosts, will approach 50/50 load-sharing on the virtual link.

See Cisco’s guide on configuring LACP between switches.

LACP isn’t only for network hardware; it can provide the same fault-tolerance and bandwidth scaling for servers. This can be very useful for virtualized servers, as they often have high bandwidth requirements and the tolerance for NIC failure is very low. In this case, a network administrator works with the local system administrator to configure the server using the server software, and the local switch is configured to match the parameters.

This diagram shows a server with two physical interfaces bundled into a single logical interface. This configuration gives the server and network administrators the ability to double bandwidth to the server, provide seamless NIC configuration changes (replacing server NICs, changing cabling infrastructure), and also high availability in the event of a NIC failure or a cable failure.
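
On the switch side, the server-facing bundle looks much like the switch-to-switch case. This is only a sketch with made-up port and VLAN numbers, and the server’s own bonding or teaming software has to be set to LACP so the two ends match:

    ! Two server-facing ports bundled with LACP and placed in the server's VLAN
    interface range gigabitethernet0/11 - 12
     channel-group 2 mode active
     switchport mode access
     switchport access vlan 100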

Above is a diagram that shows an advanced configuration which I won’t get into right now; suffice it to say that the Cisco 3750 series switch allows for more fault-tolerance to be built into the network than a standard L2 or L3 switch.

Summary

All of this technology is still in use today, and the configurations are still valid for many sites. While this period in networking saw the first Layer 3 switches, these devices didn’t come into common use until after the year 2000 — the next stop in the transformation of networking.

Transformations of Networking — Part 2

  1. Transformations of Networking — Part 1
  2. Transformations of Networking — Part 2 (this article)
  3. Transformations of Networking — Part 3
  4. Transformations of Networking — Part 4
  5. Transformations of Networking — Part 5

The Hub and 10baseT — 1990

The Wikipedia Network Hub Page

The Wikipedia 10baseT Page

10baseT and the hub freed networks from the limitations of coaxial cabling by creating a single location for all cables to come to — allowing network designers to create a star network design.

But for as much flexibility and growth as the hub made possible, it created other problems as network designs grew to the limits of the technology.

Because the hub was so easy to deploy, the temptation was to deploy these devices all over the network — and use the hardware to extend the network as required. Fundamentally this works, but there is more going on under the surface.

Hubs operate by receiving an electrical signal on a port, and internally amplifying and repeating this signal on all the other ports of the device. This means that a signal created by a computer must be processed by the internal amplifier of a hub and re-emitted on the other ports of that hub; if many hubs were chained (often called daisy-chained) together linearly, this had the effect of increasing the RTT (round-trip time) between devices further than is permitted by Ethernet’s CSMA/CD.

You may recall that CSMA/CD is a protocol that detects signal collisions, whereby a host (having detected a collision) will wait a randomized amount of time before transmitting again. These timers are designed for a particular RTT, and if a network consisted of many daisy-chained hubs the time for all hosts to receive a signal transmission could in many cases exceed the timers of CSMA/CD — causing excessive collisions and therefore a slowdown of the network for all hosts. Because of this, a hub-based network is limited to two or three hubs chained together.

If that wasn’t bad enough, the flexibility of this new technology meant that sometimes network administrators would accidentally (or maybe intentionally) create a loop in the Ethernet network by connecting two hubs to each other with more than one cable:

A Layer 2 loop (like the one above) happens because a hub is designed to receive and re-broadcast all signals it receives. When a Layer 2 loop is created, the effect is immediate and devastating — the entire network slows to a crawl, and the CPU resources of the attached computers are significantly affected, as each host on the network must process received signals even if they are not ‘interesting’ to the host.

Despite all of these flaws many people use hubs in their homes, and their simple design means that they’re still useful today for network analysis.

The Network Switch — 1989

Wikipedia Network Switch page

A network switch is in many ways an intelligent hub. The switch tracks the unique MAC address of each host (a 48-bit number, usually written as six hexadecimal octets) in an internal table, and armed with this data is able to achieve the great feat of sending data out only those ports for which that data is destined.

The Wikipedia page on the Ethernet Frame

Confused? I was at first too. In each Ethernet frame, a header precedes the data payload. This header contains (among other things) the source MAC address, and the destination MAC address of the frame. In the early days of Ethernet when every host received every frame put on the network, this header allowed a host to determine if a data payload was interesting; a host would look at the header and if it recognized its MAC address in the destination field the host would further process the data, otherwise it could simply discard the rest of the frame and wait for another.

The switch took advantage of the data in the frame header.

  1. From the source MAC address in each frame, a switch is able to build a historical table of which MAC addresses were associated with particular switch-ports.
  2. However, when a switch is initially turned on this table is empty — so the correct behavior of an Ethernet switch is to operate as a hub for those MAC addresses that it does not know.
  3. Lastly, sometimes hosts come and go, or they can even change switch-ports, so the switch ages out each MAC-table entry with a timer (this varies by vendor and is usually configurable; on Cisco Catalyst switches the default is 300 seconds — see the sketch below)
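
On a Cisco switch you can watch this table being built, and adjust the aging timer if you need to. This is a sketch; the exact spelling ('mac address-table' versus 'mac-address-table') varies between IOS versions:

    ! Show the dynamically learned MAC-to-port mappings
    show mac address-table dynamic
    !
    ! Change the aging timer (in seconds) from the platform default
    configure terminal
    mac address-table aging-time 600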

This is how the switch brought about the second major transition of networking, which was the isolation of the collision domain. Isn’t that neat? Now instead of hosts being bombarded with traffic that they don’t even care about, the network is finally quiet. Almost too quiet.

Full-duplex Ethernet — 802.3x

the Wikipedia full-duplex page

Software engineers agreed that the network was too quiet, so they devised a way to get more data out of the network — and their solution was full-duplex Ethernet.

The Hub used CSMA/CD because every host on the network was subjected to a veritable cacophony of network traffic. Once hosts were directly connected to switches there was no risk of a collision on the local segment so Ethernet hosts could send and receive data at the same time — bringing us to Full Duplex. In this example, the local segment refers to the physical wire between the host and the switch.

It is for this reason that we consider a full duplex connection to be double the bandwidth of a half-duplex connection. For example, if a server has a 10baseT, full-duplex network connection it is considered to have 20mbps of bandwidth: 10mbps in each direction.

Now you might say, what if Host A and Host B both transmitted to Host C at the same time, wouldn’t there then be a collision? The engineers who created 802.3x had the same idea, so they included a flow-control protocol into the full-duplex specifications.

The Wikipedia flow-control page

Flow-control essentially allows a host to send a PAUSE frame back to the sender, requesting the sender to stop sending traffic for a while to allow the host time to process the frames.

Layer 2 Loops and Spanning Tree Protocol

the Wikipedia 802.1D page

Remember the horror of the Layer 2 loop? Immediate, devastating network failures — that was bad. But thankfully, some clever network engineer devised a protocol that was able to detect, and close these loops before they got out of hand.

And so STP (Spanning Tree Protocol) was born. Roughly speaking, every switch on the network would consider a newly-activated port with extreme paranoia and would not accept new traffic from that port until it was sure that someone had not created a loop.

STP was able to determine if a loop had been created by sending out specially crafted L2 frames called BPDUs (Bridge Protocol Data Units). If a BPDU went through a loop and was received by the same switch that sent it out, the switch would consider that port to be looped and would shut it down.

A single switch was elected as a STP root, which controlled the behavior of all other switches in the STP domain. If a network change was detected, the switch would send out a specially crafted L2 frame called a TCN (Topology Change Notification). When the STP root received this frame, it would send out a BPDU with a special flag set that would instruct all other switches to re-examine any ports which they had previously assumed were loops.
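
You normally don’t leave the root election to chance. On a Cisco switch you nail it down by lowering the bridge priority on the switch you want to win; the VLAN number here is illustrative:

    ! Make this switch the preferred root bridge for VLAN 1
    spanning-tree vlan 1 priority 4096
    !
    ! Many IOS versions accept a shorthand that picks a suitable priority for you
    spanning-tree vlan 1 root primary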

An interesting side-effect of this protocol, is that STP allowed engineers to build fault-tolerance into switching networks. An engineer could intentionally create a loop and let STP resolve the issue — and feel safe knowing that if a network interface or cable was damaged or removed, STP would detect this change and recalculate which ports to use.

In this way STP can not only prevent destructive Layer 2 loops, but it can also recover from network faults automatically. This automation meant that companies could continue to operate in a failure scenario — where otherwise some users would have been offline.

Summary

Hubs changed the face of computer networking, but switches brought intelligence and fault-tolerance to the network. For the first time this allowed companies to invest in self-healing networks that allowed their employees to continue working even when part of the network had failed.