The Oxford Text Archive in 2016

Analysis of the logs for downloads of resources from the Oxford Text Archive in the calendar year 2016 reveals a continuing increase in usage. A few years ago there was a big leap in downloads thanks to the ingest of a large number of texts from the Text Creation Partnership, which became available via the OTA once it became legally possible to share them openly.

Other factors aiding increased usage of the OTA include:

  • BNC free for download: during 2016 the British National Corpus was made available for direct download without having to fill in a form and wait for authorization, and as a result downloads continue to increase;
  • Freeing the texts: an ongoing programme of reassessing legacy data, and, where possible, removing access restrictions;
  • Higher visibility: resource discovery via the CLARIN Virtual Language Observatory, which aggregates OTA records and offers a new way for users to find the texts;
  • Shibbolization: a small number of resources are currently available to UK users only, but are slowly being opened up Europe-wide thanks to CLARIN and eduGAIN;
  • More digital research: demand grows as more users in the humanities start to engage in digital scholarship.

The grand total of discrete downloads of resources from the Oxford Text Archive was 1263810, or 1.26 million. Each of these represents the successful download of the content of a resource; the numbers were calculated after filtering out all hits from spiders, crawlers, robots and other automated processes, and ignoring failed downloads. The total is an increase of around 38% on last year’s total. Of these, 395812 could be identified as originating from users in the University of Oxford, approximately 31% of the total, and more than double the number from last year. Of the total downloads, more than 99.6% were direct downloads of resources made available at open URLs; the rest were downloads of resources where access restrictions require authorization.
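The filtering described above can be sketched roughly as follows. This is only an illustration and not the OTA's actual procedure: it assumes the logs are in the common Apache "combined" format, that automated clients can be recognised by keywords in the user-agent string, that a successful download is a GET request with status 200, and that resources live under a hypothetical `/download/` path. None of these details are stated in this report.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format. The real OTA log layout is not given.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Illustrative keywords only; a production list would be much longer.
BOT_MARKERS = ("bot", "spider", "crawl", "slurp")

# Hypothetical URL prefix for resource downloads.
DOWNLOAD_PREFIX = "/download/"


def count_downloads(lines):
    """Count successful, non-automated downloads per resource path."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not parse
        if match["method"] != "GET" or match["status"] != "200":
            continue  # ignore failed or non-download requests
        if not match["path"].startswith(DOWNLOAD_PREFIX):
            continue  # not a resource download
        agent = match["agent"].lower()
        if any(marker in agent for marker in BOT_MARKERS):
            continue  # drop spiders, crawlers and robots
        counts[match["path"]] += 1
    return counts


sample_log = [
    '192.0.2.1 - - [01/Feb/2016:10:00:00 +0000] "GET /download/3259 HTTP/1.1" 200 1234 "-" "Mozilla/5.0"',
    '192.0.2.2 - - [01/Feb/2016:10:00:01 +0000] "GET /download/3259 HTTP/1.1" 200 1234 "-" "Googlebot/2.1"',
    '192.0.2.3 - - [01/Feb/2016:10:00:02 +0000] "GET /download/3259 HTTP/1.1" 404 0 "-" "Mozilla/5.0"',
    '192.0.2.4 - - [01/Feb/2016:10:00:03 +0000] "GET /download/2554 HTTP/1.1" 200 9999 "-" "Mozilla/5.0"',
    'not a log line',
]
download_counts = count_downloads(sample_log)
print(download_counts.most_common())
```

Keyword matching on the user agent is a deliberately simple choice; it will miss automated clients that present a browser-like user agent, so real totals would depend on how thorough the filter list is.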

Here are this year’s top ten:

Number of downloads | Title | Author | ID | Class
9313 | The poems of John Keats | Keats, John, 1795-1821 | 3259 | text
8351 | VOICE: Vienna-Oxford International Corpus of English | Barbara Seidlhofer | 2542 | corpus
6543 | British National Corpus, XML edition | BNC Consortium | 2554 | corpus
4936 | British National Corpus, Baby edition | BNC Consortium | 2553 | corpus
4616 | The four seasons, and other poems. By James Thomson | Thomson, James, 1700-1748. | 3549 | ECCO
4407 | An account of the proceedings against the rebels, and other prisoners, tried before the Lord Chief Justice Jefferies: and other judges in the west of England, in 1685. for taking arms under the Duke of Monmouth. … To which is prefix’d, the Duke of Monmouth’s, the Earl of Argyle’s, and the Pretender’s declarations, that the reader may the better judge of the cause of the several rebellions. | | 4431 | ECCO
3696 | Beggar’s opera. Libretto. | Gay, John, 1685-1732 | 3257 | text
3663 | New York newspaper advertisements and news items: 1777-1779 | | 3151 | text
3613 | The history of the most noble Order of the Garter: Wherein is set forth an account of the town, castle, chappel, and college of Windsor; … To which is prefix’d, a discourse of knighthood in general, … Collected by Elias Ashmole, … The whole illustrated with proper sculptures. | Ashmole, Elias, 1617-1692. | 5268 | ECCO
3564 | The peerage of Scotland: containing an historical and genealogical account of the nobility of that Kingdom. … By George Crawfurd, Esq;. | Crawford, George, fl. 1710. | 5301 | ECCO

There is also a table with the top 20 downloads of 2016. Overall, more than 36000 different resources were downloaded.

The table below shows the most popular items with access restrictions, which required an online application and manual authorization before they could be downloaded. There were 4401 of these downloads over the year, an average of more than ten per day, each of which needed to be manually authorized by a member of staff. Last year there were 3681. Some of the resources below were made freely available during the year, and so were accessed via direct download as well.

Number of downloads | Title | ID | Notes
2664 | British National Corpus, XML edition | 2554 | Also 3543 direct downloads and 356 via Shibboleth
320 | British National Corpus, Baby edition | 2553 | Also 4439 direct downloads and 176 via Shibboleth
236 | The Lancaster Corpus of Mandarin Chinese | 2474 |
111 | Helsinki corpus of English texts | 1477 |
97 | British Academic Written English Corpus | 2539 | Also with 663 direct downloads
97 | Complete corpus of Old English: the Toronto dictionary of Old English corpus / compiled by the University of Toronto Centre for Medieval Studies | 0163 |
74 | Parsed Corpus of Early English Correspondence (PCEEC) | 2510 |
67 | British Academic Spoken English corpus | 2525 |
55 | Cat on a hot tin roof / Tennessee Williams | 1233 |
43 | A Corpus of English Dialogues 1560-1760 (CED) | 2507 |
43 | British National Corpus Sampler | 2551 |
43 | The York-Toronto-Helsinki Parsed Corpus of Old English prose (YCOE) | 2462 |
31 | Dictionary of Old English Corpus in Electronic Form (DOEC) | 2488 |
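
The figures quoted for manually authorized downloads can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in this report (4401 restricted downloads in 2016, 3681 the year before, and a grand total of 1263810); the variable names are illustrative.

```python
import calendar

# Figures taken from this report.
total_downloads_2016 = 1263810
restricted_2016 = 4401
restricted_2015 = 3681

# 2016 was a leap year, so the daily average is taken over 366 days.
days_in_2016 = 366 if calendar.isleap(2016) else 365

restricted_per_day = restricted_2016 / days_in_2016
restricted_growth = (restricted_2016 - restricted_2015) / restricted_2015
restricted_share = restricted_2016 / total_downloads_2016

print(f"Manual authorizations per day: {restricted_per_day:.1f}")
print(f"Change on previous year: {restricted_growth:.1%}")
print(f"Share of all downloads: {restricted_share:.2%}")
```

This gives an average of about 12 manual authorizations per day and a year-on-year rise of roughly 20%, while restricted downloads remain well under 1% of all downloads.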

There were 556 downloads from the experimental site hosted by the Oxford e-Research Centre, where users can download one of a small number of resources (of which the BNC is the most popular) by authenticating with their institutional single sign-on. This is an increase from 321 last year, despite some periods of downtime for the service. Only thirty-six of these downloads were from the University of Oxford.
