Return of the Advent

Regular readers may be unsurprised to learn that Sysdev have once more acquired a Lego Star Wars advent calendar, to remember a good friend by. We’ll be updating this blog post each day with our adventures in model building (with possible delays at weekends). The first update should appear very shortly…

Posted in Star Wars Advent

Team vacancies

Once more it’s time to alert readers to vacancies in our team. This time we have both a sysadmin and a team leader position vacant (although the latter is only available to applicants internal to the University and its constituent colleges).

The sysadmin post is similar to those we’ve listed before on this blog, but with a focus on Drupal deployment – as IT Services builds a large-scale Drupal deployment to be offered as a central service across the University. This is a Grade 8 post which closes on 30th October.

The IAM team leader post is a permanent appointment at Grade 9 and is responsible for the management and development of new and existing IAM services offered by the team. Closing date: 20th October.

Posted in Vacancies

A dandelion’s tale; an internship at sysdev

So, during this summer, I had a unique opportunity: to be part of the team of ninja sysadmins at the Systems Development and Support Section at the University of Oxford, as part of the IT Services Internship Programme. I was a member of the Infrastructure and Hosting team (IAH), which together with the Identity and Access Management team (IAM) makes up the Systems Development and Support Section. My work was supervised by Dominic Hargreaves and Dave Stewart of IAH. (That’s it, I promise there will be no more acronyms!)

Over a period of two months, I completed a series of miscellaneous tasks, mostly making a few of the team’s tools more efficient and writing network visualisation tools that give an overview of the topology of, and dependencies among, the servers that the IAH team supports and maintains.

Settling in: solving papercut bugs

The first fortnight was spent getting accustomed to the daily tools used by the team, such as request-tracker, the ticketing system, and getting acquainted with the wiki, which serves as a knowledge base for common procedures. I fixed a few tickets, mostly trivial changes such as changing email addresses from help@oucs.ox.ac.uk to help@it.ox.ac.uk, reflecting the change in name of the department in 2012. I also updated the documentation, adding manual pages for tools, and made small improvements such as adding short options to a local build tool.

I also finished and deployed a website which reports on the success/failure/last updated status of the mirrors. This utility can be seen at http://mirror.ox.ac.uk/status.

Making bacman2 faster

bacman2 is the homegrown backup utility used by the IAH team to manage backups for the servers under their administrative control. It can perform rsync-based filesystem backups as well as database backups, each handled by a separate submodule of bacman2.

Configuration of bacman2 is done using YAML files. YAML is a human-readable format which is terser than XML and easier to read than JSON while being compatible with JSON (YAML is effectively a superset of JSON).

However, the archives list of bacman2 was also kept in YAML. As the Perl YAML module is not very efficient at loading such a large YAML file (containing 300k records or more), this would cause frequent lockups as the bacman2 process blocked on updating the YAML file.

The solution was to use a proper database for this. Since the archives YAML file was not replicated and was local to only one system, it made sense to use a lightweight file-based database system like SQLite, which also has good bindings for Perl. The archives list was migrated to SQLite without any data loss.

The migration to SQLite solved the frequent locking problems and made updates much faster. Adding a new backup to the archive list, which previously took up to a minute because the entire YAML file had to be parsed into memory and written back out to disk, is now effectively instantaneous.
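The difference in write cost can be sketched as follows. This is a minimal illustration in Python rather than bacman2's own code, and the table name `archives`, its columns and the function names are all assumptions made for the example, not bacman2's actual schema.

```python
import sqlite3


def open_archive_db(path=":memory:"):
    """Open (or create) the archive list database. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS archives (
            id         INTEGER PRIMARY KEY,
            host       TEXT NOT NULL,
            module     TEXT NOT NULL,   -- which backup submodule produced it
            created_at TEXT NOT NULL    -- ISO 8601 timestamp
        )
        """
    )
    return conn


def add_archive(conn, host, module, created_at):
    """Append one record; existing rows are neither re-read nor rewritten."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO archives (host, module, created_at) VALUES (?, ?, ?)",
            (host, module, created_at),
        )


def archives_for(conn, host):
    """Return (module, created_at) pairs for one host, oldest first."""
    return conn.execute(
        "SELECT module, created_at FROM archives "
        "WHERE host = ? ORDER BY created_at",
        (host,),
    ).fetchall()
```

The point of the sketch is that an insert touches only the new row, whereas the YAML approach had to load and rewrite every record for each addition.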

Network topology: dandelion

The last, and in my opinion most interesting, part of the internship was developing a network topology diagram of the machines managed by the team. At the time of writing there are 152 systems connected to various switches. Understanding and visualising the connectivity of these systems is critical to swift identification and localisation of any emergent problems.

An associated problem is that of host or server startup order. The various servers run by sysdev are associated with various services that the University needs. The services are categorised by tiers, with Tier 1 being the highest priority services, such as the central authentication system, and Tiers 3 and 4 being the lowest priority systems.

In the event of a total or partial shutdown of the servers, it is important to know the order in which the servers should be started, as some servers provide services that other servers depend on.
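One standard way to derive such an order is a topological sort of the dependency graph. The sketch below is not taken from dandelion; it uses Python's `graphlib` module (Python 3.9 or later), and the host names are invented for illustration.

```python
from graphlib import TopologicalSorter


def startup_order(depends_on):
    """Return hosts ordered so that each host comes after its dependencies.

    depends_on maps each host to the set of hosts that must be running
    before it starts. graphlib.CycleError is raised if the dependencies
    are circular, which would mean no valid startup order exists.
    """
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical example: a web server needs a database and authentication,
# and the database itself needs authentication.
deps = {
    "auth": set(),
    "db": {"auth"},
    "web": {"db", "auth"},
}
order = startup_order(deps)
```

Service tiers could be layered on top of this, for example by breaking ties between independent hosts in favour of the higher priority tier, but that is a design choice the sketch leaves open.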

Both of these problems were addressed by a single tool, which gathers data from various sources, such as the configuration repository generated by the rb3 tool (the configuration management tool used and developed in-house at Oxford, available as open source) and the Cisco switch configurations, and generates graphs using D3.js. The name dandelion came about from the remark by a member of the team that the network topology graph looked very much like one. The graphs allow searching for hosts and showing their properties.

dandelion-example
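To give a feel for the kind of data such a graph is drawn from, here is a minimal sketch that turns host and connection records into the nodes-and-links layout commonly used in D3 force-directed examples. The function name, the property keys and the host names are assumptions for illustration; dandelion's real data model is not shown in this post.

```python
import json


def to_d3_graph(hosts, connections):
    """Build a {"nodes": [...], "links": [...]} dictionary.

    hosts maps a host name to a dict of its properties; connections is
    an iterable of (host, host) pairs, for example server-to-switch links.
    """
    # Put "id" last so a stray "id" property cannot override the host name.
    nodes = [{**props, "id": name} for name, props in sorted(hosts.items())]
    known = set(hosts)
    links = [
        {"source": a, "target": b}
        for a, b in connections
        if a in known and b in known  # ignore links to hosts we have no data for
    ]
    return {"nodes": nodes, "links": links}


# Hypothetical input: one switch and one server.
graph = to_d3_graph(
    {"sw1": {"kind": "switch"}, "hostA": {"kind": "server", "tier": 1}},
    [("hostA", "sw1")],
)
graph_json = json.dumps(graph, indent=2)  # what the browser-side code would load
```

Keeping the properties on each node is what makes it possible to search for hosts and display their properties in the rendered graph.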

I wrote the dandelion utility as a module so that it could be reused for similar tools. Some example tools were written on top of it which can, for example, report on the Debian versions of the various systems, search for servers which have particular properties, or report on the various services that a particular server runs and its relationship with the other servers on the network.

Future Work

Further work can always be done in the area of automated configuration management and visualisation, possibly by applying machine learning techniques to the configuration repository. In the last week of the internship, I was working on a similarity tool, using the dandelion framework, which gives a similarity weight between two servers on the network, based on how many properties they have in common (after removing the properties common to most systems). Such a similarity weight would identify clusters of servers performing a similar task and could later be used to show a graph of such clusterings, or be part of a utility which monitors the resilience of the network (for example, it could offer suggestions about moving servers performing similar tasks into geographically more distributed locations, to reduce single points of failure).
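One possible form for such a weight is sketched below. The post does not say how the weight is normalised, so this example swaps in the Jaccard index (shared properties divided by all properties of the pair). The 80% cut-off for "common to most systems", the function names and the property names are likewise assumptions.

```python
from collections import Counter


def distinctive_properties(host_props, common_fraction=0.8):
    """Drop properties held by at least common_fraction of all hosts.

    host_props maps a host name to a set of property names.
    """
    counts = Counter(p for props in host_props.values() for p in props)
    threshold = common_fraction * len(host_props)
    return {
        host: {p for p in props if counts[p] < threshold}
        for host, props in host_props.items()
    }


def similarity(a, b):
    """Jaccard index of two property sets: 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 0.0  # nothing distinctive on either side; treat as unrelated
    return len(a & b) / len(a | b)


# Hypothetical hosts and properties.
hosts = distinctive_properties({
    "web1": {"debian", "apache", "php"},
    "web2": {"debian", "apache", "perl"},
    "db1": {"debian", "postgresql"},
    "mail1": {"debian", "exim"},
})
web_pair = similarity(hosts["web1"], hosts["web2"])  # share only "apache"
mixed_pair = similarity(hosts["web1"], hosts["db1"])  # share nothing distinctive
```

Pairs with a high weight could then be grouped into clusters, or flagged when they sit in the same location.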

Acknowledgements

I would like to thank Dominic Hargreaves and Dave Stewart for their excellent guidance throughout the internship. I would also like to thank Peter Grandi and Kristian Kocher, and the members of the adjoining Identity and Access Management (IAM) team for the many excellent conversations we had over beer and burritos :)

Posted in Uncategorized

An Advent Adventure – The Advent Strikes Back

Last year Sysdev were given a Lego Star Wars advent calendar by a long-standing colleague and friend, who sadly passed away shortly after Christmas.  However, in memory of him, and because the team enjoyed the process, we have acquired another advent calendar this year.

Again, we will be opening a door every day, building the model and adding a picture to the blog.  Please note that practicalities at weekends may mean that some additions won’t appear until the following working day.

Posted in Star Wars Advent

Debconf, and encouraging contributions

Last week I was fortunate enough to be able to go to the annual Debian developers’ conference, Debconf, which this year was held in Vaumarcus, Switzerland, in a glorious setting with views of a lake and mountains (if you look closely you can see evidence of the Debian swirl):

View from Debconf13

I would like to thank IT Services for enabling me to go to this conference, taking time out of a busy schedule at work, as well as the Debconf team for organising an excellent conference.

Debian is used extensively by my team as a primary hosting platform for our services, and as a community-based “Universal Operating System” with a strong focus on freedoms and openness, it fits well with the team culture. Whilst I became involved in Debian before taking up my role at IT Services (formerly OUCS), I find my involvement in Debian useful as a way of getting more value out of Debian for the team (for example by being familiar with Debian processes and developments and being able to contribute back work which has evolved within the team). The most visible evidence of this is the set of RT packages which I have been maintaining since 2008, originally as part of a project to upgrade our own instance to the then-current 3.8.

The conference took place over the course of a full (8 day) week and combined the traditional presentation sessions with ample opportunities to meet with fellow developers and contributors (many for the first time face to face) and work closely as a team on some objectives (including, for me, the inclusion of Perl 5.18 into Debian; I have been co-maintaining the Perl packages for the past few years). Some of the highlighted events for me were:

  • Freedombox – an excellent presentation from Bdale Garbee on the Freedombox project, which aims to deliver an easy to use bundle of software for installation in cheap home servers, to enable users to keep control of their own data rather than putting it at the mercy of governments
  • The Technical committee BoF – it was interesting to hear a bit more about how the tech-ctte, one of the few formal management structures in Debian, operates in order to resolve technical issues or disputes
  • use Perl – the annual perl packagers’ meeting: one of the teams I am most actively involved in as a side effect of co-maintaining Perl; I met several team members for the first time at this Debconf and this ended up being a very productive exercise
  • Debian on Google Compute Engine and AWS Debian – these talks from David McWherter and James Bromberger respectively were interesting updates for someone like me who has not yet had a chance to try out these services. It was reassuring to note that the AWS images of Debian are now “official” in that they are minimal images built by a Debian developer.

The most interesting talk from a direct Sysdev perspective was by martin f. krafft, who presented his tool reclass, a ‘recursive external node classification’ engine. This is a system designed to integrate with a number of modern configuration management tools such as Puppet, Salt and Ansible, which behaves eerily like Sysdev’s own rb3 tool (see the original paper) though with possibly fewer of the annoying quirks! It addresses the need to minimise repetition in large installations via multiple inheritance, and acts as a layer between the user and the configuration management tool itself. Following the talk we had a chance to explore some of the issues in more detail in a separate BoF session. This is definitely something I’ll be keeping in mind as we discuss a coordinated strategy for configuration management across the new, larger department.

The above were only some of the events that made Debconf so enjoyable for me – there were plenty of social occasions including the now famous Cheese and Wine BoF, and Debian’s 20th birthday, which was an afternoon of talks targeted at a wider audience than regular Debconf attendees, followed by a barbecue and a huge birthday cake.

Scenic hacklab. Photo credit: Christian Perrier

If any of this whets your appetite (sorry, no cake), you might be interested in looking at the video archive of many of the scheduled talks and events, some of which I was involved in producing, as a new member of the Debconf video team.

So, where does the ‘encouraging contributions’ part of this post come in? A recent personal objective of mine at work has been to help people within IT Services, and in particular my own team, to get more directly and deeply involved with Debian development. Most of the team already has a lot of the relevant expertise, as we deploy all software via Debian packages and so end up packaging quite a bit of our own software and modifying packages of other people’s. To that end, I am hoping to run a Debian packaging workshop/bug squashing party at IT Services later this year or early next year, and see if I can persuade some of my colleagues to maintain packages for Debian, to increase our contributions back to a project which provides so much value to us. I chatted with quite a few people at Debconf about this type of event and got some useful ideas for how to run it.

If you are reading this from IT Services or indeed across the University/Oxford and would be interested in taking part in such an event, I’d love to hear from you. You can comment on this blog or email me at my University email address.

Photo credit: Christian Perrier

Posted in Conferences, Uncategorized