Content filtering and the University

The issue of web content filtering is one which crops up every so often within the University. Do we give our users the freedom to visit any content they like, provided that University regulations (and the law of the land) are not broken, or should technical restrictions be put in place, for instance to stop them from viewing offensive content or just to stop them wasting entire afternoons on Facebook?

The subject of filtering is attracting considerable attention at a national level at present, with the Government and the major ISPs seemingly being at loggerheads over the matter.

The University’s position

The University’s position on web content filtering is, as with many things, essentially to devolve the matter. Very little is done at a central level, which comes as something of a surprise to some staff. The only exceptions are for certain malicious content, for instance domain names used by botnet controllers, or phishing sites which pose a particular threat to the University. Even then, the restrictions are not applied to all University networks, and we must be very careful to minimise the possibility of false positives.

Our constituent colleges and departments are given the freedom to do their own thing. From the centre we have limited visibility of what is done at a local level, other than by asking. Recent (unscientific) enquiries have given us some idea. In many cases the answer is nothing; some filter security threats only; one or two seek to limit access to Facebook and not much else; and a handful impose fairly stringent restrictions. From the responses, it seems departments are more likely to filter than colleges, perhaps reflecting concerns over confidential data and staff productivity, while colleges prefer not to impose restrictions on their student accommodation.

Some history

[Image: a University web proxy error message from around 2000]

Back in 1999, the University introduced an intercepting web caching proxy, more for political and financial reasons than for technical ones. At the time the University was severely constrained by limited bandwidth on its external connection (somewhat less than many have on domestic broadband these days), there was no immediate prospect of an upgrade, and transatlantic traffic would frequently slow to a crawl by mid-afternoon. Initially there were one or two experiments with content filtering, but most of the blocks were rapidly withdrawn after objections were raised. Some security-related blocks (for instance against the Code Red worm) proved extremely useful and caused little or no disruption to legitimate traffic. Bandwidth-limiting of certain content also proved fairly successful at managing the limited available capacity without blocking content entirely, and offered a great degree of flexibility. Very few complaints were received, and following a major upgrade to the University’s connectivity, the restrictions (and later the proxy itself) were removed.

These days, if asked to advise regarding content filtering within the University network, how would we answer? As with many topics in security, the short answer is, “it depends”. Many of the perceived advantages introduce new risks. To a great extent it depends on what the college or department is looking to achieve, but it is worth noting that technical measures are frequently a poor solution to social problems. Doing almost anything can lead to accusations of “censorship” or denying “academic freedom” – indeed, there were a handful of dissenters when we introduced email antivirus filtering over a decade ago.

Malicious content

Blocking malicious content is relatively uncontroversial. Nevertheless it can be prone to false positives. For instance, you probably don’t want to block a news site entirely just because its pages pull in malicious adverts from a third-party network – and if you do, your users will object. Nor do you necessarily want to block an entire domain because of one malicious item. We’ve known the entire ox.ac.uk domain to be blacklisted by one product on account of one of hundreds of servers within the University hosting “malicious” content. The offending content was a “white-hat” tool which had been offered at the same location for well over a decade. Actions need to be proportionate to the threat.

That said, most malware can be blocked without your users ever noticing. They’re more likely to notice blocks against phishing sites – indeed we’ve received occasional complaints from users that our blocks are preventing them from “verifying their account” or whatever else the phishers are asking them to do. Ensure your users are presented with a clear, informative error message, preferably one specific to phishing attacks.
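
As a rough illustration of both points – blocking at the level of individual URLs rather than whole domains, and telling the user why something was blocked – here is a minimal Python sketch. It is purely hypothetical (the URLs, categories and messages are invented for illustration) and does not describe anything actually deployed on the University network:

```python
# Hypothetical sketch: a URL-level blocklist with category-specific block pages.
# Entries name specific malicious URLs rather than whole domains, so one bad
# item does not take out every other page hosted on the same site.
from typing import Optional

BLOCKLIST = {
    "http://compromised.example.org/downloads/update.exe": "malware",
    "http://phish.example.net/webmail/verify.php": "phishing",
}

BLOCK_MESSAGES = {
    "malware": "Blocked: this file is known to contain malware.",
    "phishing": ("Blocked: this page imitates a login form in order to steal "
                 "credentials. No genuine service will ask you to 'verify "
                 "your account' in this way."),
}

def check_request(url: str) -> Optional[str]:
    """Return an explanatory block message if the URL is listed, else None."""
    category = BLOCKLIST.get(url)
    if category is None:
        return None  # not listed: let the request through untouched
    return BLOCK_MESSAGES[category]
```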

Where do your users want to go today?

Things get trickier once you start blocking access to content that your users really want to reach. They’ll try and find a way round it, or find a friend who can. You might notice and block their workaround; they’ll soon find another. A colleague of ours assures us that many freshers will be perfectly capable of defeating any content filtering on their college network – they’ve learned how in order to get around the tighter restrictions on their school (and possibly home) networks. When we tried blocking access to Napster and similar services, we soon realised just how many alternatives existed that offered our users a workaround – and those were just the ones using HTTP on port 80. It rapidly turned into a huge game of “whack-a-mole” that we were bound to lose.

Driving traffic towards anonymising services, VPNs, Tor and so forth may present a bigger risk than the problem you are trying to address. Malicious traffic may no longer be blocked by firewalls or intrusion prevention systems, or detected by OxCERT’s monitoring. If a Tor user inadvertently configures their system as an exit node, an IP address on your network may become the apparent source of other people’s traffic. Perhaps not a major concern if it allows foreign users to bypass the censorship of an oppressive regime, but very much your problem if it results in accusations of copyright infringement or of accessing child sexual abuse content.

Recreational versus “work” usage

Some workplaces want to prevent students and/or staff from accessing sites unrelated to their work. For some that may just mean restricting access to Facebook or Twitter. But not everyone’s use of social media is recreational: it’s not uncommon to use such sites for publicity, specialist news items and suchlike. Other usage, while not directly related to the job, may nevertheless be beneficial – for instance the local bus company uses Facebook to post service updates, and in severe weather staff may regard timely information on transport, school closures and so forth as essential.

There is a risk that introducing filtering will upset users, especially if they consider it to interfere with work, or what they consider to be “reasonable” recreational usage. Policies need to be clearly communicated in advance, preferably with the reasoning behind them. The process for requesting exceptions needs to be straightforward, transparent and quick.

The strictest policy would be a default-deny, restricting users to accessing only those sites required as part of their work. Enumerating all such sites may be difficult, especially when webpages commonly pull in content from other sites.
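
To make the default-deny problem concrete, here is a minimal sketch in Python, with entirely hypothetical hostnames: only requests to an approved list of sites are permitted, and everything else – including the third-party hosts that an approved page quietly pulls scripts and fonts from – is refused until someone asks for an exception.

```python
# Hypothetical default-deny sketch: anything not explicitly approved is refused.
from urllib.parse import urlparse

APPROVED_HOSTS = {
    "intranet.example.ac.uk",   # internal systems
    "journals.example.com",     # a subscribed publisher
}

def allowed(url):
    host = urlparse(url).hostname or ""
    return host in APPROVED_HOSTS

# The catch: a single approved page may embed content from hosts nobody
# thought to list, all of which now fail until an exception is granted.
for url in (
    "https://journals.example.com/article/123",    # approved
    "https://cdn.example-static.net/viewer.js",    # pulled in by that page: denied
    "https://fonts.example-cdn.org/font.woff2",    # likewise denied
):
    print(url, "->", "allow" if allowed(url) else "deny")
```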

As an example: consider a member of staff who was in the habit of playing a particular online game in his lunch break. One day his desktop became infected with a virus through the exploitation of a Java vulnerability.
The “knee-jerk” reaction in some organisations might be to impose stringent restrictions on the usage of anything but directly work-related sites. But in some roles that might be extremely difficult – staff may need to access all sorts of sites as part of their job. Do you really want the overhead of dealing with requests for exceptions all the time? And what is the problem you are actually trying to address? Why did this system get infected? While the site in question was recreational in nature, the user had no reason not to trust a site they’d used dozens of times before. On this occasion it happened to deliver a Java exploit, most likely through third-party content. Why did that exploit succeed? Because a critical Java patch had not been applied. Rather than putting resources into content filters and strict restrictions, with all the problems they bring, perhaps the effort should be directed at better, more timely patch management.
Clearly that will not always be appropriate. Desktops used to control a nuclear power plant probably shouldn’t be able to access arbitrary internet content.

Adult content

[Image: With apologies to Botticelli]

Attempts to filter pornographic/offensive/adult content are not uncommon, but how is such content defined? “I know it when I see it” won’t wash when configuring your firewall. Despite the launch of the .xxx domain, the internet is not conveniently segregated into “porn sites” and “acceptable content”. Many sites, Wikipedia included, contain some material that may be deemed offensive alongside a great deal that is perfectly acceptable. Trying to compile a list of “sites containing porn” is futile enough; trying to compile a list of every offending URL is impossible.

Better filtering might use some kind of heuristics, but even so, where is the boundary between acceptable and unacceptable content? Attitudes vary hugely depending on culture, context and indeed the individual. Nudity is extremely common in the Fine Arts (some very explicit works are openly displayed in the Musée d’Orsay in Paris). Some blocked sites may be perfectly legitimate, yet users may be too uncomfortable to request exceptions – frequently-cited examples are sites dealing with issues of health or sexuality.

As one IT officer puts it: if a college blocks pages, you risk press accusations of censorship; if it doesn’t block anything, you risk stories about Oxford allowing students to browse smut. Damned if you do, damned if you don’t. And this article will no doubt be damned by some content filters for its use of the word “damned”. Such “profanity”-based systems have, often rightly, received a lot of flak over the years. Systems which block information about Scunthorpe, news of Hilary Swank, or discussion of Cleopatra’s bathing arrangements simply because they contain offensive strings are frankly unfit for purpose.
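
The failure mode is easy to demonstrate. The toy Python example below (the word list and phrases are ours, for illustration only, and not taken from any real product) shows why bare substring matching condemns all three of the examples above, and why matching only whole words merely shifts the problem rather than solving it:

```python
# Toy illustration of the "Scunthorpe problem": naive substring matching.
import re

OFFENSIVE = ["cunt", "wank", "ass"]   # deliberately short, illustrative list

def naive_block(text):
    """Block if any listed string appears anywhere in the text."""
    lowered = text.lower()
    return any(word in lowered for word in OFFENSIVE)

def word_boundary_block(text):
    """Block only on whole-word matches."""
    return any(re.search(r"\b" + re.escape(word) + r"\b", text.lower())
               for word in OFFENSIVE)

for phrase in ("Scunthorpe United have signed a new striker",
               "Hilary Swank wins second Oscar",
               "Cleopatra reputedly bathed in asses' milk"):
    print(phrase, "->", naive_block(phrase), word_boundary_block(phrase))

# naive_block flags all three perfectly innocent phrases. Matching whole words
# rescues them, but legitimate medical or literary uses of the listed words
# are still blocked - the problem has moved, not gone away.
```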

Illegal content

Many of the large domestic ISPs in the UK have taken action to block content which it is illegal (at least under UK law) to access, based on a list managed by the Internet Watch Foundation (IWF). This certainly helps guard innocent users against accidentally stumbling across some deeply unpleasant material, and will likely deter those with no more than a casual curiosity about such material, but as mentioned above, the sufficiently determined will find a way round the restrictions.

The underlying Cleanfeed technology behind the blocks is certainly ingenious but far from perfect. In 2008 the blacklisting of an item on Wikipedia drew considerable attention to the system, caused problems for many legitimate users of Wikipedia, and became a textbook example of the Streisand Effect. The Cleanfeed system has subsequently been used to impose blocks beyond its original remit, for instance in order to comply with court orders to block piracy sites, and may be leveraged for further purposes according to the whims of future courts or governments.

What alternatives exist?

[Image: Warn users then let them proceed at their own risk?]

The alternatives depend on the reasons for wanting filtering in place. Clearly, doing nothing is technically possible, but may not go down well from a political point of view. If you’re primarily worried about bandwidth utilisation, some form of traffic shaping may be acceptable. If you’re concerned about people viewing offensive content in a general computer room, a simple approach that has worked in the past is to print out the IT Regulations in a large font, highlight the relevant clauses, and stick them on the wall as a reminder. If you have concerns about under-18s on your network as part of a summer school, you may be able to shift the responsibility onto the summer school organisers.

If the concern is over staff “wasting” time on Facebook during working hours, take a step back. Are they getting their work done to the satisfaction of their line managers, and if so, is a little recreational internet usage actually a problem? If they’re not working hard enough, are technical measures really the most appropriate solution? There are many other distractions that may affect staff performance, and a lot of them would never be considered a matter for IT to deal with.

One department told us that while they block malicious content, trying to view content in other categories (e.g. “adult”, “illegal drugs”, “gambling”) simply produces a warning message; users can then proceed at their own risk.
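
That “warn then proceed” approach could be approximated along the following lines. This is a hypothetical Python sketch of the logic only – the category names and actions are invented for illustration, and the department’s actual product will work quite differently:

```python
# Hypothetical sketch of "warn then proceed at your own risk" filtering.
# Categories map to an action: "block" stops the request outright, "warn"
# shows an interstitial page first, anything unlisted is simply allowed.
POLICY = {
    "malware":  "block",
    "phishing": "block",
    "adult":    "warn",
    "gambling": "warn",
    "drugs":    "warn",
}

def handle_request(category, user_acknowledged=False):
    action = POLICY.get(category, "allow")
    if action == "block":
        return "blocked: this category is never permitted"
    if action == "warn" and not user_acknowledged:
        return ("warning: this site is categorised as '%s'; "
                "click 'continue' to proceed at your own risk" % category)
    return "allowed"

print(handle_request("phishing"))                          # blocked outright
print(handle_request("gambling"))                          # warning page first
print(handle_request("gambling", user_acknowledged=True))  # user clicked through
```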

Conclusions

We certainly don’t wish to give the impression that content filtering is always to be avoided. As with many things, there are pros and cons, and this post has concentrated on the negative aspects, which may not be immediately apparent. What seems like a good idea in the first instance may have significant ramifications. What we do suggest is that those involved in determining policy are fully apprised of both the advantages and the disadvantages, and appreciate that in most cases a perfect solution will be impossible to achieve. Users obviously need to be made aware of policies and revisions, whether enforced by technical or social means, and of any monitoring in place, including details of what is logged, to whom it is visible, and under what circumstances it will be used.

How well will people be informed in the case of government-mandated filtering imposed by the major ISPs? I’m not hopeful. My domestic broadband comes from a relatively small ISP which, to the best of my knowledge, currently imposes no filtering whatsoever. My cellular data provider, on the other hand, does filter “adult” content by default (not that I have found this to be a problem), and I don’t recall them going to any effort to ensure I was aware of the filtering, its possible consequences, or the procedure for opting out. If the Government gets its way with “default-on” filtering, whether the major domestic ISPs will do any better remains to be seen.
