This blog has been archived and will not be updated further. For more IT Services blogs, see the front page.
Working with the data from Intute presented us with a few fairly knotty problems when it came to re-purposing the data for the ARCH project. If I hadn’t spent some years on the Intute project myself, working with the data at the database table level, I would, perhaps, have been hard pushed to make much sense of it. Intute had the task of unifying half a dozen rather disparate ways of collecting, classifying, and cataloguing data across its range of subjects, and this had to be achieved within an impossibly tight time-frame. The people who would be working with the new unified database were going to be largely the same people who had worked with the individual subject data at the different subject hubs, and their existing users were going to have certain expectations about the way the data was classified and presented. Given all this it was, perhaps, inevitable that a pragmatic, somewhat piecemeal, approach was taken to building up the database structure. The original subject hubs often had a requirement for data fields which none of the others were likely to use, sometimes in a one-to-many relationship to the specific record. This led to a profusion of columns and whole tables which had no relevance to the majority of the records but were added to maintain existing practices at the subject hubs. Rather than being an opportunity to rethink and rebuild an internet resource catalogue system from the ground up, Intute became a sprawling and unwieldy hybrid of its component parts.
Even though, for the purposes of ARCH, I was only really interested in three of the 20+ tables which made up the Intute DB, I was still faced with the problem of the cryptic and single-direction way in which these tables were linked. The tables I was interested in were the record table, in which the majority of the usable and interesting data was stored; the classification table, where the subject taxonomies were stored and used to generate the web site browse structure; and the record_admin table, which contains certain metadata about the record itself (as opposed to the resource the record points at) such as who catalogued it, on which date, its status as live or otherwise on the public site etc. The record table had some fields where multiple values were stored in the field itself as semi-colon separated values, while others had a reference to an id field in another table. It was rather a struggle to separate all this out and to hive the arts and humanities records (16,000+) off from the other subjects (150,000+ records). Having done this, and saved the results into a new table (ahrecord), I then tried to work out how the id numbers in the classification field in the record table related to the actual rows of the classification table. This proved to be far from trivial, given that there was no way in SQL to actually join the two tables.
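As a rough illustration of one way round this kind of problem, the sketch below splits a semi-colon separated id field into a separate link table so that an ordinary SQL join becomes possible. It is not the code used for ARCH. It assumes, purely for the example, that the classification field holds plain numeric ids separated by semi-colons; the column names and the ahrecord_class link table are hypothetical, and SQLite stands in for whatever database is actually in use.

```python
import sqlite3


def build_link_table(conn):
    """Split a semi-colon separated id field into one row per (record, class) pair."""
    # ahrecord_class is a hypothetical link table, not part of the Intute schema.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ahrecord_class (record_id INTEGER, class_id INTEGER)"
    )
    rows = conn.execute("SELECT id, classification FROM ahrecord").fetchall()
    for record_id, raw in rows:
        for part in (raw or "").split(";"):
            part = part.strip()
            if part.isdigit():  # skip blanks and anything that is not a plain id
                conn.execute(
                    "INSERT INTO ahrecord_class VALUES (?, ?)",
                    (record_id, int(part)),
                )
    conn.commit()


def demo():
    """Build a tiny in-memory example and return the joined rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ahrecord (id INTEGER, title TEXT, classification TEXT)")
    conn.execute("CREATE TABLE classification (id INTEGER, label TEXT)")
    conn.executemany(
        "INSERT INTO ahrecord VALUES (?, ?, ?)",
        [(1, "A", "10; 20"), (2, "B", "20;")],
    )
    conn.executemany(
        "INSERT INTO classification VALUES (?, ?)",
        [(10, "History"), (20, "Music")],
    )
    build_link_table(conn)
    # With the link table in place the two tables can be joined in plain SQL.
    return conn.execute(
        "SELECT r.title, c.label FROM ahrecord r "
        "JOIN ahrecord_class l ON l.record_id = r.id "
        "JOIN classification c ON c.id = l.class_id "
        "ORDER BY r.id, c.id"
    ).fetchall()
```

Calling demo() returns one (title, label) pair for every classification attached to a record, which is the relationship that could not be expressed while the ids were packed into a single field.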
At this point there is no use being made of the record_admin table, which contains metadata for the record itself, as opposed to the resource. In a complete system this table, or one derived from it, would be used to track community contributions and take care of the ranking system and associated privileges. In the original Intute database, the record_admin table was linked, via an id field, to the editors table which contained personal and contact information for the cataloguers who originally created these records. For reasons of data-protection policy, and lacking the resources to chase up and obtain permission from these legacy cataloguers, it was decided not to use this data and, instead, to mark all records as originating from a generic ‘Intute Staff’ user. It should be noted that, if ARCH gained registered users, it would leave a similar legacy problem in its data for any future projects.
A quick survey of the translated and transferred data in the new ARCH system suggests that a great many of the URLs in the Intute data are no longer working. An immediate task before making the site fully public would be to find and flag these dead links and put them into a pool available for new registrants to work with. The basic task of the lowest ARCH rank would be tracking down these lost resources and editing in the revised URL, or flagging the resource as no longer available. This would provide a ready path for progression to a higher rank, with a number of badges to be gained en route.
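A minimal sketch of how such dead links might be found is given below. It is an illustration under stated assumptions, not the ARCH implementation: the function names are hypothetical, records are assumed to arrive as (id, url) pairs, and only HTTP 404 and 410 responses are treated as ‘dead’, with everything else that fails left as ‘unreachable’ for a later retry.

```python
import urllib.error
import urllib.request


def check_url(url, opener=urllib.request.urlopen, timeout=10):
    """Return 'ok', 'dead', or 'unreachable' for a single URL."""
    # A HEAD request avoids downloading the body; some servers reject HEAD,
    # so a real checker would probably fall back to GET.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout):
            return "ok"
    except urllib.error.HTTPError as err:
        # 404 and 410 are treated as permanently gone (an assumption).
        return "dead" if err.code in (404, 410) else "unreachable"
    except (urllib.error.URLError, OSError):
        # DNS failures, refused connections, timeouts and so on.
        return "unreachable"


def flag_dead_links(records, opener=urllib.request.urlopen):
    """Return the ids of (id, url) records whose URL looks permanently gone."""
    return [rid for rid, url in records if check_url(url, opener) == "dead"]
```

The opener parameter exists only so that the network call can be swapped out when testing; in normal use it would be left at its default.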
Game mechanics have a lot to offer the worlds of business and academia in terms of incentivising staff and students. The traditional incentives of carrot and stick (reward for success and punishment for failure) have been shown in repeated studies to actually be counterproductive for anything except simple manual tasks. For any task which requires even a low level of cognitive involvement it has been shown that higher rewards actually degrade performance. What is it, then, which actually motivates people? The studies looked at some of the things that people do in their spare, unremunerative, time and landed on three keywords.
Autonomy: The desire to be self directed.
Mastery: The desire to be good at something, or better at something.
Purpose: The feeling that you are making a contribution to the greater good.
Game mechanics offer some simple and concrete methods for fostering such motivators and many people feel that such game mechanics, or ‘gamification’, will become increasingly important in the coming decade. Seth Priebatsch, self-described tech ninja at start-up SCVNGR, says “The last decade saw the building of the social layer. The next decade will be the decade when the game layer is built.” There are three particular sorts of game mechanics which could be useful to the ARCH ranking system. These are: The Progression Dynamic, in which a user is shown their progress through a specific task as well as their overall progress within the system. A simple green bar with some sort of percentage complete figure is a surprisingly strong motivator; people want to move that bar across to 100% and thus gain the next level, or win rewards and such. The social network LinkedIn, which focusses on fostering professional connections, has such a progress bar to indicate how much of your profile you have filled in and it seems to work well for them in terms of motivating people to complete this task. The rewards for completing a given progression need not only be advancement to the next level, where the progression starts again. Many sites and services use the idea of badges to reward particular activities and these too have been shown to be a strong motivator. In terms of ARCH one could, besides progressing up the ranks as described in the previous article, award badges for specific activities or achievements. For instance there could be a ‘Polyglot’ badge for successfully performing operations on records in a number of subject areas, or an ‘Attendance’ badge for logging in and performing work for a given number of consecutive days. It should be possible to measure the time taken to complete the work on a single record and, as the user becomes more adept, to issue badges based on increasing throughput.
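The progress bar and the ‘Attendance’ badge are simple enough to sketch in a few lines. The example below is only an illustration: the function names, the idea of a per-level points threshold, and the default of seven consecutive days are all assumptions made for the example rather than anything decided for ARCH.

```python
from datetime import timedelta


def progress_percent(points, level_threshold):
    """Whole-number percentage of the way to the next level, capped at 100."""
    if level_threshold <= 0:
        raise ValueError("level_threshold must be positive")
    return min(100, int(100 * points / level_threshold))


def longest_streak(active_days):
    """Length of the longest run of consecutive dates on which work was done."""
    best = run = 0
    previous = None
    for day in sorted(set(active_days)):
        if previous is not None and day - previous == timedelta(days=1):
            run += 1
        else:
            run = 1  # a gap (or the first day) starts a new run
        best = max(best, run)
        previous = day
    return best


def earns_attendance_badge(active_days, required=7):
    """True once the user has worked on `required` consecutive days."""
    return longest_streak(active_days) >= required
```

Here active_days would be the dates (datetime.date values) on which the user performed at least one piece of work; how that is recorded is left open.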
These badges, and a person’s general progress, could be displayed on their personal profile page and be publicly visible, thus promoting a certain amount of healthy rivalry or competition amongst users. This then forms part of another basic game mechanic which we’ll call Community Discovery. Rivalry within a community through the use of leaderboards and such can be good in itself, but the possibility of tasks which require the co-operation of two or more users to achieve also suggests itself. Such co-operative working is a significant factor in the process of building a community and giving it a sense of cohesion.
In the above ways the ARCH ranking system could be said to engender all three of the motivating factors which studies have shown to be the real incentivisers for people. Purpose: as with Wikipedia, a feeling that you’re contributing to a common good. Mastery: progression through the ranks and the accumulation of badges give immediate and tangible feedback to the user that they are becoming better at performing their tasks. Autonomy: the increased levels of access for the higher ARCH ranks are concrete examples of a user’s progress conferring more autonomy upon them.
Bringing gamification to an academic endeavour such as ARCH may seem to some to be trivialising a process which requires a lot of hard work; the original Intute catalogue, and the Humbul catalogue before it, took many man hours of painstaking work to assemble and maintain. Making it all into a game may appear initially to cheapen these efforts, but, as the author Ian McEwan says in his novel The Child in Time, “no one works more productively than a child at play.”
It was proposed that our users be ranked according to how much they have contributed to the upkeep and growth of the ARCH data. We could use pretty much any metaphor to name these ranks, but, for the purposes of this article I will go with the idea of an architect’s office. There would be four ranks which would be, in ascending order: Copy Tracer, Draftsman, Master Builder, Architect. The actions available to each rank increase with level and include powers over some lower ranks in terms of their progression. In this way the community would become self sustaining and self managing. In more detail the ranks would have the following functions:
Copy Tracer: This is the default rank given to a user when they are first signed up and verified. This sign-up and verification would be automatic and would involve filling in a simple web form, with the usual basic details plus maybe some indication of the user’s areas of interest. An email would then be sent to the registered address and include a link back to the site to a verification script (we used much this system for users of MyIntute and, while there were occasional glitches with the outgoing verification email, the number of spam signups was negligible).
Once verified the user would have access to a limited view of the back-end record editing system. They would be able to edit, say, the title, URL, and other metadata for existing records only. The description would remain read-only for this rank. The user would then work through these records, chosen either for their subject area or based on the date the record was last reviewed. Each record checked, corrected and dealt with would earn the user points towards their eventual promotion up the ranks. It would be possible to weight the awarded points to do things like encourage work on records which haven’t been reviewed for a while; the older the last review date the more points available for tackling the record. Fixing broken links particularly could carry a premium. Some sort of automated link-checker could produce a list of such broken URLs for the user to work with.
Any work carried out by this level of user would be subject to approval by higher ranking users.
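The weighting idea could be as simple as the sketch below, in which older records are worth more and a repaired link carries a flat bonus. Every number here (the base score, the per-year increment, the cap and the link bonus) is a made-up placeholder for illustration, as is the function name; the real values would need tuning against actual usage.

```python
def points_for_edit(last_reviewed, today, fixed_broken_link=False,
                    base=10, per_year=5, cap=50, link_bonus=20):
    """Score one completed edit; staler records earn more, up to a cap."""
    # Whole years since the last review; a future date counts as zero.
    years_stale = max(0, (today - last_reviewed).days) // 365
    points = min(cap, base + per_year * years_stale)
    if fixed_broken_link:
        points += link_bonus  # premium for repairing a dead URL
    return points
```

Both dates are expected to be datetime.date values. The cap stops very old records from swamping everything else, and the link bonus is added after the cap so that link repair always pays extra.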
Draftsman: Can do all the above functions and can also flag Copy Tracer work as approved. This rank would also have the ability to add new records within a defined subject area (based on their work at this and lower rank). Their progression depends on the same points accumulation system as the Copy Tracer, but they would be able to earn additional points for their oversight of Copy Tracer work and for submitting new records.
Master Builder: All the above functions including adding new records. The Master Builder also receives batches of Draftsman-approved records from the Copy Tracer level in all subjects and would be able to approve this work to the live site. The Master Builder would also be able to add records directly to the live site; probably still within a limited subject area based on their previous work. The Master Builder would be able to directly add new records in any subject area and could perhaps put together small sub-collections to act as featured resources in particular subject areas. This latter idea borrows from the Flickr idea of user created galleries where users can choose a strictly limited number of items to include in a gallery or collection designed to illustrate or support a specific aspect of a subject; maybe based on the user’s own interests, or current affairs, or anything else suitable. Creation, and su
Architect: All the above, plus the Architect can approve the progression from Draftsman to Master Builder (progression from Copy Tracer to Draftsman would be more or less automatic based on the number of accumulated points, with the number of edits actually making it through to the live site being factored in). The Architect could also have the ability to weight the points awarded for work in specific subject areas, thus giving them a measure of strategic responsibility in the development and curation of the collection. Elevation to the rank of Architect would again be based on the work done at lower ranks, but would also perhaps require some sort of final approval by project staff.
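The progression rules described for the four ranks can be put together in a short sketch. This is one possible reading of them rather than a specification: the points thresholds, the minimum proportion of edits reaching the live site, and all of the names below are assumptions made for the example.

```python
from enum import IntEnum


class Rank(IntEnum):
    COPY_TRACER = 1
    DRAFTSMAN = 2
    MASTER_BUILDER = 3
    ARCHITECT = 4


# Hypothetical thresholds: points needed to leave each rank.
POINTS_NEEDED = {
    Rank.COPY_TRACER: 500,
    Rank.DRAFTSMAN: 2000,
    Rank.MASTER_BUILDER: 5000,
}
MIN_LIVE_RATIO = 0.8  # assumed share of a Copy Tracer's edits that must go live


def can_promote(rank, points, live_ratio=0.0, approver_rank=None, staff_approved=False):
    """Decide whether a user may move up one rank."""
    if rank == Rank.ARCHITECT:
        return False  # already at the top
    if points < POINTS_NEEDED[rank]:
        return False
    if rank == Rank.COPY_TRACER:
        # More or less automatic: points plus enough edits reaching the live site.
        return live_ratio >= MIN_LIVE_RATIO
    if rank == Rank.DRAFTSMAN:
        # Needs sign-off from an Architect.
        return approver_rank == Rank.ARCHITECT
    # Master Builder to Architect: final approval by project staff.
    return staff_approved
```

For example, can_promote(Rank.COPY_TRACER, 600, live_ratio=0.9) is True, whereas a Draftsman with enough points still returns False until an Architect is passed in as approver_rank.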
The ARCH project reuses the original Intute catalogue to create a new community based resource.
The original catalogue consisted of reviewed web sites for specific subjects – in Oxford’s case for Arts and Humanities resources, which were selected for their high quality and appropriateness for higher education courses of study. The new community web site is a pilot study to see if this type of community based resource can work.
The project is developing software for a community type of system where the community members are involved with adding and editing items and using the resource. Initially the project is intended just for Arts and Humanities use and specifically as a pilot to see a) how feasible the community resource would be and b) to develop the software that could be re-used. There is a questionnaire to elicit views from potential community members.
The system and demonstrators will be available shortly, and will be demonstrated at workshops.