Cisco networking & eduroam: Rate Limiting Using Microflow Policing

This is my final post on the interesting technical aspects of the new networking infrastructure that supports the eduroam service around the university.

This post covers the finer technical details of how we currently rate limit client devices to 8Mbps download/upload on eduroam – using Microflow Policing on the Cisco 4500-X switches. If readers want to know the reasoning behind why we rate limit at all, then I invite you to read my colleague Rob’s blog post.

Some History

You may recall from my initial blog post that the backend infrastructure that previously supported the eduroam service (and continues to support the OWL service) utilised a dedicated NetEnforcer appliance. This appliance did more than simply throttle user connections: it also performed Deep Packet Inspection (DPI) and applied different policies to certain types of traffic, for instance throttling P2P traffic more aggressively.

We had just one of these appliances and it sat inline between the original internal Cisco 3560 switches and the primary Linux firewall host. The appliance utilised an incorporated switch and an additional bypass unit. The former provided the required interfaces to connect to the infrastructure, and the latter provided fail-open connectivity in the event of failure.

So you may be asking why we didn’t incorporate the original NetEnforcer hardware into our design? Or why we didn’t acquire upgraded NetEnforcer hardware (or even something from another vendor) to serve our needs moving forward?

Well, the answer to the first question is that the current appliance has reached and gone beyond its end-of-life from the vendor (back in 2013). It has also proved to be prohibitively expensive to purchase and licence during its lifetime, not to mention it’s another ‘bump in the wire’ we would have to manage moving forward.

The answer to the second question is for all the reasons above, plus our default assumption at this point was that a newer 10-gigabit capable appliance from any vendor would only be more expensive, especially if we wanted to keep DPI capabilities. This certainly would not have fitted into our fairly modest budget. On further consideration, we would also likely have had to buy two appliances to ensure a truly resilient and reliable service.

In summary, we were searching for an easier way to achieve what we wanted.

So what are we limiting exactly?

At this point, we decided to take a step back and evaluate exactly what bandwidth management we wanted our potential solution to provide. We decided on a goal which, at a high level, seemed fairly straightforward: to limit each client device to 8Mbps in both directions. We quickly ruled out doing anything clever with DPI, as this would have involved the purchase of additional hardware after all.

To expand on this somewhat and really nail things down, our new solution would have to meet the following requirements:

Be capable of identifying, and distinguishing between individual clients connected to the eduroam service;
Apply rate-limiting to each client’s overall connection to the network – thus providing a fair and equal service for all that is not based on individual connections or flows, but is based on the sum of each client’s connection;
Be implementable using only the hardware/software already procured for the eduroam upgrade;
Be implementable without impacting the performance of the infrastructure or the client experience;
Be able to scale to the numbers of clients seen today on the service and beyond.

It was these requirements that would lead us to Microflow policing as our preferred method. It might interest readers to note that we also seriously considered using queuing methods on the Linux hosts to achieve this. My colleague Christopher will be writing a blog post on this topic in due course. For now, know that this was a difficult decision that we ultimately made because we had more faith in the scalability of Microflow policing.

QoS Policing vs shaping

Many readers are likely to have heard the term policing in the context of traffic management. It is used extensively on service provider networks, for example, and the general idea is to limit incoming traffic on an interface to a bandwidth lower than the interface’s line rate. Policing can generally only be performed on traffic as it ingresses an interface. It is therefore fundamentally different to another traffic management feature called shaping, which applies queuing methods to rate limit outgoing traffic as it egresses an interface. The two terms are often confused and used interchangeably, so I thought I would make the distinction as clear as possible before going any further.

The most common type of policer (and the one we are using in our setup) is often referred to as a one rate, two-colour policer. This means that we define a conforming (or allowed) traffic rate in bits per second (bps) called the Committed Information Rate (CIR), and anything over this is considered to have exceeded the CIR. You can then decide, in your policing policy, on separate actions for traffic that conforms to the CIR and traffic that exceeds it. There are other flavours of policer, such as two rate, three-colour, which also let you specify a Peak Information Rate (PIR) and introduce a third, violate action. This type of policer could be used to allow traffic to occasionally burst over the CIR up to the defined PIR if that were desired; in our setup, however, it wasn’t really necessary.
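To make the one rate, two-colour idea concrete, here is a minimal token-bucket sketch in Python. It is only a model of the general technique, not a description of how the 4500-X implements it in hardware. The class name, the timing model and the test traffic are assumptions for illustration; the 8Mbps CIR and the 250000-byte burst size are the figures that appear in the configuration and show output elsewhere in this post.

```python
from dataclasses import dataclass


@dataclass
class OneRateTwoColourPolicer:
    """Token-bucket model of a one rate, two-colour policer (illustrative only)."""
    cir_bps: int           # Committed Information Rate in bits per second
    burst_bytes: int       # bucket depth (committed burst) in bytes
    tokens: float = 0.0    # current bucket level in bytes
    last_seen: float = 0.0 # time the previous packet was offered, in seconds

    def __post_init__(self) -> None:
        self.tokens = float(self.burst_bytes)  # start with a full bucket

    def offer(self, now: float, packet_bytes: int) -> str:
        """Return 'transmit' if the packet conforms to the CIR, else 'drop'."""
        elapsed = now - self.last_seen
        self.last_seen = now
        # Refill at the CIR (converted to bytes per second), capped at the bucket depth.
        self.tokens = min(self.burst_bytes, self.tokens + elapsed * self.cir_bps / 8)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes   # conform action
            return "transmit"
        return "drop"                     # exceed action


# Offer one 1500-byte packet per millisecond (about 12 Mbps) for ten seconds.
policer = OneRateTwoColourPolicer(cir_bps=8_000_000, burst_bytes=250_000)
sent = dropped = 0
for i in range(10_000):
    if policer.offer(now=i * 0.001, packet_bytes=1500) == "transmit":
        sent += 1
    else:
        dropped += 1

print(f"transmitted {sent} packets, dropped {dropped}")
```

In this model the initial burst is allowed through while the bucket drains, after which roughly two packets in every three conform, which is what you would expect when 12Mbps is offered against an 8Mbps CIR.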

Enter Microflow policing

In our case, we didn’t simply need to police all traffic ingressing from the eduroam networks around the university, or vice-versa, from the outside world. We wanted to be far more granular than that, as per the requirements above. To enable us to do this, another feature was needed in conjunction with a standard QoS policer. This feature, called Microflow policing, makes use of Flexible Netflow on the Cisco 4500-X switches together with some configured class-maps and ACLs, to create a granular policy that applies to specific traffic as it enters the eduroam infrastructure from the university backbone and vice-versa, from the outside world (via our firewalls).

Flexible Netflow is a relatively new feature in Cisco’s portfolio that allows you to specify custom records that define exactly which fields within packets you’re interested in interrogating – which fits our purposes very nicely indeed!

Defining how we identify & distinguish between eduroam clients

To fulfil our requirements above, we had to identify and distinguish our clients on the eduroam service. To do this required the following configuration:

flow record IPV4_SOURCES
 match ipv4 source address

flow record IPV4_DESTINATIONS
 match ipv4 destination address

ip access-list extended EDUROAM_DESTINATIONS
 permit ip any 10.16.0.0 0.15.255.255

ip access-list extended EDUROAM_SOURCES
 permit ip 10.16.0.0 0.15.255.255 any

OK, some explanation will likely aid understanding here.

Firstly, the ‘flow record’ commands tell Flexible Netflow to set up two custom records – the ‘IPV4_SOURCES’ one as the name suggests, is set up to read the source address field in the IPv4 packet header and the ‘IPV4_DESTINATIONS’ one is conversely set up to read the destination address field in the IPv4 header.

Next, two extended ACLs are set up to specify the actual IPv4 addresses we’re looking for – traffic traversing the eduroam service! The ‘EDUROAM_SOURCES’ one specifies traffic sourced from within the eduroam client address range 10.16.0.0/12 destined for any address. The ‘EDUROAM_DESTINATIONS’ ACL specifies the exact opposite – specifically, traffic sourced from any address destined for clients within 10.16.0.0/12.
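If the wildcard mask notation in the ACLs looks unfamiliar, the following Python sketch shows how 10.16.0.0 with wildcard 0.15.255.255 corresponds to 10.16.0.0/12. The helper function name is made up for illustration, and the conversion assumes a contiguous wildcard mask like the one used here.

```python
import ipaddress


def wildcard_to_network(address: str, wildcard: str) -> ipaddress.IPv4Network:
    """Convert an ACL address/wildcard pair into an IPv4Network.

    Assumes a contiguous wildcard mask, i.e. the plain inverse of a subnet mask.
    """
    wildcard_int = int(ipaddress.IPv4Address(wildcard))
    # Flip every bit of the wildcard to obtain the equivalent subnet mask.
    netmask = ipaddress.IPv4Address(wildcard_int ^ 0xFFFFFFFF)
    return ipaddress.IPv4Network(f"{address}/{netmask}")


eduroam = wildcard_to_network("10.16.0.0", "0.15.255.255")
print(eduroam)                 # 10.16.0.0/12
print(eduroam.num_addresses)   # 1048576

# Addresses from 10.16.0.0 up to 10.31.255.255 match; 10.32.0.1 does not.
print(ipaddress.IPv4Address("10.31.255.255") in eduroam)   # True
print(ipaddress.IPv4Address("10.32.0.1") in eduroam)       # False
```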

The eagle-eyed amongst you will have realised that I’ve specified the internal eduroam client address range here and not the public range. This is important going forward for two reasons:

We use NAT overload to translate the internal RFC 1918 space 10.16.0.0/12 into a much smaller /26 of publicly-routable space (IPv4 address space on the Internet is at a premium after all). It would therefore be impossible to distinguish individual clients using the public range, as one address within this range is likely to represent numerous clients. This means we have to apply our policies before NAT is applied;
We are now limited (remembering that policing only works in the ingress direction) on which interfaces we can apply our Microflow policing policy to.

Classifying the traffic we’re interested in

So now we’ve specified our parameters for identifying and distinguishing our clients, it’s time to set up some class-maps to classify the traffic we want to manipulate. This is done in the generally accepted, standard Cisco class-based QoS manner. Like this:

class-map match-all MATCH-EDUROAM-DESTINATIONS
 match access-group name EDUROAM_DESTINATIONS
 match flow record IPV4_DESTINATIONS

class-map match-all MATCH-EDUROAM-SOURCES
 match access-group name EDUROAM_SOURCES
 match flow record IPV4_SOURCES

Note that I’ve given the class-maps meaningful names that tie in with those I gave to the ACLs defined above. Also note that I have used the match-all behaviour in the class-maps, so for traffic to match the policy it has to match both the extended ACL and the flow record statement. In fact, traffic will always match the flow records, as every IPv4 packet header carries source and destination address fields! This is exactly why we need the ACLs too.
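The following Python sketch is a rough mental model of what the match-all class-map achieves for uploads: the ACL decides whether a packet is eduroam traffic at all, and the flow record decides which field (here the source address only) identifies the flow that gets its own policer. This is purely illustrative and says nothing about how the switch hardware does it. The function name, client addresses and traffic pattern are assumptions, and the 8Mbps rate and 250000-byte burst are borrowed from elsewhere in this post.

```python
import ipaddress
from collections import defaultdict

EDUROAM = ipaddress.ip_network("10.16.0.0/12")
CIR_BPS = 8_000_000
BURST_BYTES = 250_000

# One token bucket per flow key: [tokens_in_bytes, time_of_last_packet]
buckets = defaultdict(lambda: [float(BURST_BYTES), 0.0])


def police_upload(now, src, dst, size):
    """Model of MATCH-EDUROAM-SOURCES: ACL match AND a flow keyed on source address."""
    if ipaddress.ip_address(src) not in EDUROAM:
        return "class-default"        # fails the ACL, so match-all fails
    bucket = buckets[src]             # flow key is the source address; dst is ignored
    tokens, last = bucket
    tokens = min(BURST_BYTES, tokens + (now - last) * CIR_BPS / 8)
    if size <= tokens:
        bucket[:] = [tokens - size, now]
        return "transmit"
    bucket[:] = [tokens, now]
    return "drop"


results = defaultdict(lambda: defaultdict(int))
for ms in range(5_000):
    now = ms / 1000
    # Hypothetical client A uploads to two servers at about 12 Mbps each.
    for dst in ("192.0.2.10", "198.51.100.20"):
        results["10.16.0.5"][police_upload(now, "10.16.0.5", dst, 1500)] += 1
    # Hypothetical client B uploads a modest 1.2 Mbps.
    if ms % 10 == 0:
        results["10.20.1.9"][police_upload(now, "10.20.1.9", "192.0.2.10", 1500)] += 1

for client, verdicts in results.items():
    print(client, dict(verdicts))
```

Because the key is the source address alone, client A’s two connections share a single 8Mbps allowance, while client B is never affected by client A’s behaviour. That is the "sum of each client’s connection" requirement from earlier in this post.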

Defining our QoS policy

Now for the fun part! Let’s set up our policy-maps containing the policer statements. There’s nothing particularly fancy going on in this QoS policy configuration – remember the cleverness is really under the hood of our class-maps referencing our custom flow records and ACLs:

policy-map POLICE-EDUROAM-UPLOAD
 class MATCH-EDUROAM-SOURCES
  police cir 8000000
   conform-action transmit
   exceed-action drop

policy-map POLICE-EDUROAM-DOWNLOAD
 class MATCH-EDUROAM-DESTINATIONS
  police cir 8000000
   conform-action transmit
   exceed-action drop

The policy maps are named differently – but are still meaningful to us. One policy is designed to affect download speeds, so it’s called ‘POLICE-EDUROAM-DOWNLOAD’ and the other is designed to affect upload speeds so is called ‘POLICE-EDUROAM-UPLOAD’.

Tying it all together

So let’s quickly tie this all together. Firstly, pay particular attention to which class-maps I’ve referenced in each policy map. The logic works like this:

The ‘POLICE-EDUROAM-UPLOAD’ policy map references the ‘MATCH-EDUROAM-SOURCES’ class-map, which in turn references the ‘EDUROAM_SOURCES’ ACL and ‘IPV4_SOURCES’ flow record, which in turn matches traffic sourced from clients within 10.16.0.0/12 – our eduroam clients;
The ‘POLICE-EDUROAM-DOWNLOAD’ policy map references the ‘MATCH-EDUROAM-DESTINATIONS’ class-map, which in turn references the ‘EDUROAM_DESTINATIONS’ ACL and ‘IPV4_DESTINATIONS’ flow record, which in turn matches traffic destined to clients within 10.16.0.0/12 – again, our eduroam clients.

Also note that the CIR has been specified as 8000000bps. The keen mathematicians amongst you will note that this is exactly 8Mbps in decimal units, but only about 7.63Mbps if you count a megabit as 1,048,576 bits (8000000 / 1048576 ≈ 7.629395). The two are close enough for our purposes, so I figured I would stick with the round figure to make our lives here in Networks a little easier! Also note that I have specified the conform and exceed actions to be transmit and drop respectively. For this to work properly, the conform action must transmit the traffic and the exceed action must be defined, or the policy simply won’t do anything useful. It is possible to configure the exceed action to re-mark packets to a lower Differentiated Services Code Point (DSCP) value rather than drop them, if that better matches your own existing QoS policies. However, the drop action suits our requirements here.
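For readers who want to check the arithmetic behind the CIR figure, here is a short Python sketch. It compares the decimal and binary interpretations of "8 megabits per second", and also works out how long a burst the policer’s bucket represents, assuming the 250000-byte burst size (bc) reported in the show output later in this post.

```python
CIR_BPS = 8_000_000
BURST_BYTES = 250_000              # the "bc" value from the show output

mbps_decimal = CIR_BPS / 10**6     # 1 megabit = 1,000,000 bits
mibps_binary = CIR_BPS / 2**20     # 1 mebibit = 1,048,576 bits
eight_mebibits = 8 * 2**20         # bps required for a full 8 mebibits per second

print(mbps_decimal)                # 8.0
print(round(mibps_binary, 6))      # 7.629395
print(eight_mebibits)              # 8388608

# Time needed to send one full burst at the CIR, in seconds.
burst_seconds = BURST_BYTES * 8 / CIR_BPS
print(burst_seconds)               # 0.25
```

In other words, 8000000bps is a clean 8Mbps in decimal terms, and the configured burst corresponds to a quarter of a second of traffic at that rate.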

Applying the policies to the interfaces

This all looks good, but we’re not done yet. The final step in the process was to apply the QoS policy-maps to the correct interfaces:

interface Port-channel10
 service-policy input POLICE-EDUROAM-DOWNLOAD

interface Port-channel11
 service-policy input POLICE-EDUROAM-DOWNLOAD

interface Port-channel50
 service-policy input POLICE-EDUROAM-UPLOAD

interface Port-channel51
 service-policy input POLICE-EDUROAM-UPLOAD

So that’s four interfaces in our topology. The first two are the portchannels connecting to the inside interfaces of our Linux firewalls and the others are the portchannels connecting to the university backbone routers. To aid in understanding, I’ve also depicted this on the diagram below:

Verification

To see this in action and prove it works, you can always use the speedtest.net method. This is in fact what I did during my initial testing, as I knew it would be the yardstick many of my colleagues around the university would use to test their download and upload speeds when connected to the service.

I won’t bore you with screenshots from speedtest.net; I’m more interested in showing you the output from the 4500-X switches to see what’s actually happening. Here’s some show output from the production lin-router switches as of today:

lin-router#show policy-map interface po10
 Port-channel10
Service-policy input: POLICE-EDUROAM-DOWNLOAD
Class-map: MATCH-EDUROAM-DESTINATIONS (match-all)
 361805297845 packets
 Match: access-group name EDUROAM_DESTINATIONS
 Match: flow record IPV4_DESTINATIONS
 police:
 cir 8000000 bps, bc 250000 bytes
 conformed 408690519012173 bytes; actions:
 transmit
 exceeded 26635280726176 bytes; actions:
 drop
 conformed 303156000 bps, exceeded 19320000 bps
Class-map: class-default (match-any)
 1998983 packets
 Match: any

lin-router#show policy-map interface po50
 Port-channel50
Service-policy input: POLICE-EDUROAM-UPLOAD
Class-map: MATCH-EDUROAM-SOURCES (match-all)
 253107616302 packets
 Match: access-group name EDUROAM_SOURCES
 Match: flow record IPV4_SOURCES
 police:
 cir 8000000 bps, bc 250000 bytes
 conformed 73378531150889 bytes; actions:
 transmit
 exceeded 613359041557 bytes; actions:
 drop
 conformed 75872000 bps, exceeded 471000 bps
Class-map: class-default (match-any)
 332605099 packets
 Match: any

This output serves to provide us with information that tells us:

The QoS policy applied;
What packets it has been configured to match;
What the policy will do to the packets;
How many bytes conformed to the CIR and what action was taken;
How many bytes exceeded the CIR and what action was taken.

The output above of course only shows the primary path through the infrastructure. The non-zero values here indicate that our policies are acting on our traffic to and from eduroam clients. Success!
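As a quick sanity check on those counters, the following Python sketch works out what share of the policed bytes exceeded the CIR in each direction. The byte counts are copied from the show output above; the function name is just for illustration.

```python
def exceeded_share(conformed_bytes: int, exceeded_bytes: int) -> float:
    """Fraction of all policed bytes that exceeded the CIR (and were dropped)."""
    return exceeded_bytes / (conformed_bytes + exceeded_bytes)


# (conformed bytes, exceeded bytes) taken from the show output above
counters = {
    "Port-channel10 (download)": (408690519012173, 26635280726176),
    "Port-channel50 (upload)": (73378531150889, 613359041557),
}

shares = {name: exceeded_share(*values) for name, values in counters.items()}
for name, share in shares.items():
    print(f"{name}: {share:.2%} of bytes exceeded the CIR")
```

On these figures, roughly 6% of download bytes and under 1% of upload bytes were dropped by the policer, so the large majority of client traffic stays within the 8Mbps limit.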

Final thoughts & points to note

So this does work very nicely in our scenario. However, there were some things to take into account when contemplating the Microflow policing feature, and I suggest anyone else thinking about it considers the following points:

Plan your policies carefully before even touching a terminal – make sure you have a good handle on what flow records you’ll need to create and any associated ACLs or other configuration you’ll need;
Plan the placement of policies carefully – making sure you use the correct interfaces and remember that policing is an ingress action!
Make sure you select a Cisco platform with a TCAM large enough to hold the number of Netflow entries you need. If you’re using switches in a VSS pair with MECs that connect across them, like we did, each physical switch’s own Netflow engine processes traffic independently. Provided you’re load-sharing traffic between the physical switches relatively evenly (check which hashing algorithm your chosen channeling protocol is using, for example), you could safely combine the Netflow TCAM capacities of both switches and work with that figure;
Watch out for any existing Netflow configuration on interfaces – you cannot apply a ‘service-policy’ configuration to an interface already configured with ‘ip flow monitor’ for example.
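As a rough aid to the capacity planning point above, here is a back-of-the-envelope Python sketch. Everything in it is an assumption for illustration: the function name, the idea that each client consumes one entry per direction, the headroom factor, and above all the capacity and client figures, which are made-up placeholders rather than 4500-X specifications. Check the data sheet for your own platform before relying on any such estimate.

```python
def fits_in_flow_table(peak_clients, entries_per_switch, switches=1, headroom=0.8):
    """Rough check that the expected flow entries fit the Netflow table.

    Assumes one entry per client per direction (upload and download policies)
    and that traffic is shared evenly across the switches, so their capacities
    can be added together. Only `headroom` of the total is treated as usable.
    """
    needed = peak_clients * 2
    usable = entries_per_switch * switches * headroom
    return needed <= usable


# Hypothetical numbers only: 20,000 concurrent clients, 100,000 entries per switch.
print(fits_in_flow_table(20_000, 100_000, switches=2))    # True
print(fits_in_flow_table(100_000, 100_000, switches=1))   # False
```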

Finally, bear in mind that the configuration listed here is what was applied to the 4500-X platform. Readers may find it useful for other platforms running IOS-XE too, but expect some differences!

Some platforms running IOS that support Flexible Netflow may also support the Microflow policing feature, though the configuration syntax is likely to be vastly different. Therefore I would always recommend you check out the Feature Navigator and other documentation available at cisco.com (will require a CCO login) for more information.

Many thanks for reading!

Network Development Team