Crowdsourcing: the essentials

This post was written by Liz Masterman for the ‘News from Academic IT’ blog and kindly cross-posted here.

The term ‘crowdsourcing’ crops up almost daily in the media, but it’s probably a fair guess that many people have only a general idea about how it works. In this article we look briefly at some of the characteristics of crowdsourcing initiatives, illustrated by four current and past projects conducted at the University:

The first stage of classification in Galaxy Zoo

Zooniverse is a platform that supports a number of crowdsourcing projects led by Oxford researchers. The original project, Galaxy Zoo, invited the public to help classify over a million galaxies. Current projects include Penguin Watch and Shakespeare’s World.

The Medieval Texts Translation Project is a by-product of Fiona Whelan’s doctoral research. It brings together scholars, academics, students and amateurs to translate medieval texts in order to make them more accessible to both the academic community and the general public.

Art UK (previously Your Paintings) is a collaboration between the Public Catalogue Foundation and the BBC to tag digital copies of paintings in public collections around the UK. Dr Kathryn Eccles of the OII has been researching the ‘virtual volunteers’, or ‘taggers’, taking part in the project to explore their motivation and find out what impact participation has had on them.

The Great War Archive ran a campaign to build an online collection of family memorabilia to accompany an archive of digitised manuscripts by major poets of the First World War. Contributions were collected both electronically, with members of the public uploading images to the website themselves, and face-to-face, through ‘roadshows’ to which people brought their artefacts to be photographed or scanned by the project team.

Which comes first: the crowd or the source?

It’s possible to distinguish two main approaches to crowdsourcing, which differ from each other in their starting-points: content and crowd.

1. Content as the starting-point
This approach covers ‘a diverse range of activities and projects involving the public doing something to, or with, content’ (Dunn and Hedges, 2014). The output from such activities can be some form of improvement to the content (e.g. Medieval Texts, Art UK) or, when the crowd is participating in research, a body of research data (e.g. Zooniverse).

Sketch by Percy Matthews contributed to the Great War Archive by his grandson

2. The crowd as the starting-point
In this approach the public is asked ‘to help contribute to shared goals’ (Ridge, 2014): for example, by contributing artefacts to a collection, or by recording actions or behaviours. Once again, these goals can be either content (a collection of artefacts: e.g. Great War Archive) or research data (e.g. the RSPB’s annual Big Garden Birdwatch).

These two models are, of course, a simplification, and a number of variations can easily be identified. For example, digital images of memorabilia contributed by the crowd to the Great War Archive may serve as valuable sources for historians of the period.

All projects great and small…

One of the most striking characteristics of these crowdsourcing projects is their difference in scale. Galaxy Zoo attracted hundreds of thousands of contributors from across the world, yet Medieval Texts Translation achieved notable success with just a handful of contributors. Furthermore, technology (and the funding for it!) need not be a barrier. While Zooniverse and the Great War Archive run on purpose-built platforms, Fiona Whelan used an assemblage of free or open-source tools for her translation project: Google Docs, WordPress and Titanpad (a collaborative writing app), together with social media (Facebook and Twitter) to reach her community of volunteers.

A selection of projects on the Zooniverse platform

Blazing the trail

Two projects in particular have eased the way for other organisations and individuals to follow in their footsteps.

With funding from Google, Zooniverse has developed a platform on which crowdsourcing sites can be set up very quickly and at no cost. It is currently hosting 10 live projects.

A key legacy of The Great War Archive is the Oxford Community Collections Model and an accompanying service, RunCoCo, which helps those who wish to build digital collections using a combination of online crowdsourcing and targeted face-to-face interaction. The model has been used with a wide range of subsequent projects, both large and small, including Europeana 1914-18, Europeana 1989, Great Famine Voices, the Woruldhord collection of teaching materials for Old English and the Lower Umpqua Community Historical Archive.

The role of the crowd: collaboration and consensus

Extract from a poem in the Medieval Texts Translation Project

It can be tempting to focus on the eye-catching images and arresting texts that abound on crowdsourcing sites: colonies of comical-looking penguins, a soldier’s helmet pierced with shrapnel, a landscape so vivid one could step right into it, or the tale of the witch with her cow-sucking bag. But in many ways the real stars of the show are the members of the public who contribute their time, effort, ideas and artefacts. Their role vis-à-vis the research community is an active one: they are considered members of that community, ‘citizen scientists’, even though they may have few or no academic qualifications themselves.

Scientific decisions are often made on the basis of consensus. In Galaxy Zoo each galaxy is viewed by 40 volunteers and the final classification is determined on the basis of the majority judgement. The Medieval Texts Translation Project used a collaborative writing tool which supported discussion of, for example, vocabulary and interpretation. This enabled participants to engage with the project in different ways according to their preference; some would translate a whole poem from scratch, while others preferred to edit and comment on the translations already made.
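The Galaxy Zoo consensus step described above can be illustrated with a short sketch. It is not the project's actual pipeline: the function name, the category labels and the vote counts are all hypothetical, and ‘majority judgement’ is assumed here to mean simply the most frequently chosen label, with a tie treated as no consensus.

```python
from collections import Counter


def consensus_label(votes):
    """Return (label, share) for the most common vote.

    Hypothetical helper: 'majority judgement' is assumed to mean the most
    frequently chosen label. A tie for first place returns None as the label.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    ranked = Counter(votes).most_common()
    top_label, top_count = ranked[0]
    share = top_count / len(votes)
    # If the runner-up has as many votes as the leader, there is no consensus.
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None, share
    return top_label, share


# Invented example: 40 volunteers classify one galaxy image.
votes = ["spiral"] * 26 + ["elliptical"] * 11 + ["merger"] * 3
label, share = consensus_label(votes)
print(label, share)
```

Returning the share of votes alongside the label makes it possible to flag images on which volunteers disagreed, which may be the ones most worth a second look.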

From the ‘Tagger’ page on Art UK

Both Fiona (from interacting with her participants) and Kathryn (through her research) emphasise the importance of building a sense of community among the volunteers in a project and, crucially, a sense of ownership in the collective work and of making a positive contribution to current and future scholarship. In Galaxy Zoo, this was taken to its logical conclusion in naming volunteers as co-authors of a research paper reporting the discovery of a new astrophysical object. (Interestingly, this discovery provides a strong argument against replacing human crowds with computers, even though some efforts have been made in that direction, such as the automated tagging of artworks.)

Even where participation doesn’t lead to scientific discovery, participants in crowdsourcing projects can derive great personal satisfaction from the experience. Contributors to the Europeana 1914-18 collection filmed for Irish TV news spoke of the gratification of being able to share a family treasure, and the story associated with it, with the world. In her research with the Art UK taggers, Kathryn found that participation increased their engagement with museums and galleries and gave them a way to see paintings from galleries which they were unable to visit for themselves. Not only did they grow in confidence in looking at art, they also found that their use of language improved overall. And for some, tagging fulfilled a therapeutic function, providing a welcome distraction from difficult personal situations. Virtual volunteering, therefore, can open up possibilities for ‘anywhere, anytime’ volunteering, and with it a sense of purpose and personal value.

References:
Ridge, M. (2014). Crowdsourcing our Cultural Heritage. London: Routledge.
Dunn, S. & Hedges, M. (2012). Crowd-Sourcing Scoping Study: Engaging the Crowd with Humanities Research. Arts & Humanities Research Council.

This article has been compiled from notes made at ‘Crowdsourcing for Impact’, a forum held on 4 March 2016 at IT Services as part of Academic IT’s annual Engage programme. The speakers were Dr Grant Miller (Zooniverse), Kate Lindsay (Great War Archive) and Dr Kathryn Eccles (Art UK research project). The convenor was Dr Ylva Berglund Prytz, who also presented on behalf of Dr Fiona Whelan. All interpretations and errors are mine.