Introduction
This blog post is intended to help ITSS in Oxford better understand how the centrally provided network fits together with their own local networks. We also hope it will help them assess the impact of any reboots we need to carry out for software and hardware updates.
Devices
The OUCS backbone consists of 12 Cisco Catalyst 6500s and around 200 Cisco Catalyst 3750s.
There are three types of 6500:
- 1 x JANET BGP router (COUCS3)
- 2 x Core Switches (BOUCS and BMUS)
- 9 x Aggregation Switches (CXYX)
The network is arranged in a dual star topology, with all ‘C’ Aggregation Routers having a ten gigabit fibre connection to both ‘B’ Core Switches.
From the diagram it is hopefully clear that either BOUCS or BMUS can be rebooted without an outage. If any one of the C Routers is rebooted then the outage extends to all VLANs which rely on that 6500. If COUCS3 is rebooted then internal connections will not be impacted but our access to JANET will be. Note that we plan to install a second link in the near future.
There are three types of 3750s (FroDo or Front Door / point of presence switches):
- Building FroDo
- MDX FroDo
- Distributor FroDo
Generally, each building has its own FroDo. Where multiple Units share a building they will each have one port for their main connection, and those using OWL phase 1 will share the centrally provided LIN (Location Independent Network) ports. There is a FroDo in each of the Telecoms MDX rooms. Finally, due to the various routes which the fibre takes around the city, it is occasionally necessary to deploy a 3750 to aggregate additional FroDos. This is common in areas with a high density of annexes such as Iffley Road.
The management IP subnet allocated to the FroDo network is 172.16.0.0/20.
Numbering convention
Each ‘C’ Aggregation router has a corresponding number as follows:
Device | Number |
COUCS1 | 0 |
CENG | 1 |
CSUR | 2 |
CMUS | 3 |
CZOO | 5 |
CIND | 6 |
CASH | 7 |
CIHS | 8 |
COUCS2 | 9 |
Each FroDo is numbered based on the C Router it is connected to. For example, the first 3750 connected to COUCS1 will be called FroDo-1 and will resolve to 172.16.0.1. The first FroDo to connect to CZOO will be called FroDo-501 and will resolve to 172.16.5.1.
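The numbering convention can be sketched in a few lines of Python. This is only an illustration: the rule used here (name suffix = C Router number x 100 + position, management IP = 172.16.<C Router number>.<position>) is extrapolated from the two examples above, and the function name `frodo_identity` and the 1 to 99 position limit are assumptions of mine rather than anything taken from our provisioning tools.

```python
import ipaddress
from typing import Tuple

# C Router numbers copied from the table above.
C_ROUTER_NUMBERS = {
    "COUCS1": 0, "CENG": 1, "CSUR": 2, "CMUS": 3, "CZOO": 5,
    "CIND": 6, "CASH": 7, "CIHS": 8, "COUCS2": 9,
}

# Management subnet allocated to the FroDo network.
MGMT_SUBNET = ipaddress.ip_network("172.16.0.0/20")


def frodo_identity(c_router: str, position: int) -> Tuple[str, ipaddress.IPv4Address]:
    """Return the (name, management IP) of the Nth FroDo on a C Router.

    Assumed scheme, inferred from the FroDo-1 and FroDo-501 examples:
    name suffix = number * 100 + position, IP = 172.16.<number>.<position>.
    """
    if not 1 <= position <= 99:
        raise ValueError("position must be between 1 and 99 under this assumed scheme")
    number = C_ROUTER_NUMBERS[c_router.upper()]
    # The third octet carries the C Router number, the fourth the position.
    address = MGMT_SUBNET.network_address + number * 256 + position
    if address not in MGMT_SUBNET:
        raise ValueError(f"{address} falls outside {MGMT_SUBNET}")
    return f"FroDo-{number * 100 + position}", address


for router in ("COUCS1", "CZOO"):
    name, address = frodo_identity(router, 1)
    print(router, name, address)
```

Run as-is, this reproduces the two examples from the text: FroDo-1 at 172.16.0.1 for COUCS1 and FroDo-501 at 172.16.5.1 for CZOO.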
Connection Types
Each Unit has a main L3 connection. This is provided as an L2 VLAN presented on an access port on the building FroDo and trunked up to the adjacent C Router where the SVI is located. Some Units also have an L2 annexe VLAN. In this case the VLAN is trunked from the main site FroDo, through both core switches to the annexe FroDos, where it is presented as an access port in the annexe VLAN, with or without double tagging (Q-in-Q). This allows Units to put all their annexes behind one firewall, for example, although it has the disadvantage of creating a large L2 (failure) domain, which is a Very Bad Idea. See http://blogs.it.ox.ac.uk/networks/2011/02/04/mac-flaps-why-are-they-bad/ for more on this. Some Annexes have their own L3 connection, which is less convenient but better network design. In a future version of the backbone we hope to be able to offer VPLS to provide both flexibility and scalability, but I digress.
Tracking where your connections are
Using the LG (Looking Glass) tool, available here: https://networks.oucs.ox.ac.uk/, you can check which device(s) your networks are fed from. LG will show you where the L3 interfaces for your routed networks are, and which devices your annexes connect to at L2.
For your routed VLAN(s), the L2 connection will be to a FroDo, and that FroDo will connect directly to the C Router which hosts your L3 gateway as I mentioned earlier. For annexe sites which connect back at L2 to your main site, you will have visibility of the local device they connect to at L2. If this is a FroDo, there is no way for you to see which C Router that FroDo connects to using LG, although this can be deduced based on the third octet of the FroDo IP. The numbers in the table above show what that is for each C Router.
An example may help here. Let’s say Chaucer College connect to FroDo-501. Their main subnet might be 129.67.10.0/24. CZOO would host an SVI for VLAN 501 with an address of 129.67.10.254 (we always take the highest usable address in the subnet). That VLAN would be trunked to FroDo-501 and presented as an access port. Let’s say they own a building on Banbury Road and would like their users there to also be on 129.67.10.0/24. We would present VLAN 551 (for example) as an additional access port on FroDo-501 and trunk it through BOUCS and BMUS to COUCS1 and then on to FroDo-1, where it would be presented as an access port. This is easy for the IT staff as long as there are no loops at either end, because any loop is propagated through the core and impacts all users. Scale that up to 4 or 5 annexes and you see why I don’t like this and why we ask everyone to run STP. But I digress again…
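If you want to check which address we would take as the gateway on a given subnet, the "highest usable address" rule is easy to reproduce with Python's standard `ipaddress` module. This is a minimal sketch; the function name `gateway_address` is made up for illustration, and it only handles ordinary IPv4 subnets of /30 or larger.

```python
import ipaddress


def gateway_address(subnet: str) -> ipaddress.IPv4Address:
    """Return the highest usable host address in an IPv4 subnet."""
    network = ipaddress.IPv4Network(subnet)
    if network.prefixlen > 30:
        # /31 and /32 have no separate broadcast address, so the rule does not apply.
        raise ValueError("expected a subnet of /30 or larger")
    # The highest usable host address is the one just below the broadcast address.
    return network.broadcast_address - 1


print(gateway_address("129.67.10.0/24"))  # 129.67.10.254
```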
So now you get an email from us saying we’re going to be rebooting all the 6500s for a software update over the summer, and you would like to know on which days your users will lose service during the announced maintenance period. Keep in mind that your annexe connections will go down when your main C Router is rebooted, and again when the uplink C Router from the annexe FroDo is rebooted if this is different. So with our example, Fred Bloggs of the Chaucer College ITSS would check LG for their network and see something like this:
Looking Glass 1.4, using Oxford Directory 2.4
Given Vlan "501", displaying Unit Chaucer College

Chaucer College (cha):
  itss01: Fred Bloggs fred.bloggs@chaucer.ox.ac.uk
  4 further IT officers (use --all-itss to show)

Registered networks
  129.67.10.0/24: Chaucer

Layer 3 interfaces
  czoo.backbone.ox.ac.uk Vlan501 (up) Chaucer 129.67.10.254/24

Registered vlans
  501: Chaucer
  551: Chaucer Annexes

Layer 2 ports
  v501 chaucer.frodo.ox.ac.uk Gi1/0/1 [aGfu] Chaucer main
  v551 chaucer.frodo.ox.ac.uk Gi1/0/2 [aGfu] Chaucer BR Annexe
       banbury-road.frodo.ox.ac.uk Gi1/0/11 [aGfu] Chaucer BR Annexe
Now Fred wants to know what banbury-road.frodo.ox.ac.uk is connected to:
$ host banbury-road.frodo.ox.ac.uk
banbury-road.frodo.ox.ac.uk has address 172.16.0.1
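The third octet is 0, so banbury-road.frodo.ox.ac.uk connects to COUCS1. The annexe therefore relies on both COUCS1 (the uplink for its FroDo) and CZOO (which hosts the L3 gateway) for its connectivity.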
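If you have several annexes, the same deduction can be scripted. The sketch below maps a FroDo management IP to its C Router using the third octet and the table above, then lists every C Router whose reboot would interrupt your service. The function names are hypothetical, and you would supply the FroDo addresses yourself from `host` lookups like the one above.

```python
import ipaddress
from typing import Iterable, Set

# Third octet of the FroDo management IP -> C Router, from the table above.
C_ROUTER_BY_NUMBER = {
    0: "COUCS1", 1: "CENG", 2: "CSUR", 3: "CMUS", 5: "CZOO",
    6: "CIND", 7: "CASH", 8: "CIHS", 9: "COUCS2",
}

MGMT_SUBNET = ipaddress.ip_network("172.16.0.0/20")


def c_router_for_frodo(frodo_ip: str) -> str:
    """Deduce the uplink C Router from a FroDo management IP."""
    address = ipaddress.IPv4Address(frodo_ip)
    if address not in MGMT_SUBNET:
        raise ValueError(f"{address} is not in the FroDo management subnet")
    third_octet = address.packed[2]
    if third_octet not in C_ROUTER_BY_NUMBER:
        raise ValueError(f"no C Router is numbered {third_octet}")
    return C_ROUTER_BY_NUMBER[third_octet]


def reboot_dependencies(main_c_router: str, annexe_frodo_ips: Iterable[str]) -> Set[str]:
    """Return every C Router whose reboot would affect the Unit or its annexes."""
    routers = {main_c_router.upper()}
    for frodo_ip in annexe_frodo_ips:
        routers.add(c_router_for_frodo(frodo_ip))
    return routers


# Chaucer College: gateway on CZOO, Banbury Road annexe FroDo at 172.16.0.1.
print(sorted(reboot_dependencies("CZOO", ["172.16.0.1"])))  # ['COUCS1', 'CZOO']
```

Fred would then look for the reboot dates of both routers in the maintenance announcement.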