Today sees the official release of the Sudamih Researcher Requirements Report. We have compiled the Report on the basis of interviews conducted with researchers across the humanities disciplines at Oxford. It summarises current data management practices amongst humanities researchers and assesses demand for training and for the development of a ‘database as a service’ – two of the key anticipated outputs of the Sudamih Project. Although the participants in the interviews are all Oxford-based, there is little to suggest that the way humanities scholars approach data management here is very different from the approach taken at other UK universities, so we hope that this report will be of broad interest to anyone involved in data management or in research service provision for the humanities.
Scholars in the humanities employ a huge range of sources and approaches in their research, making it dangerous to generalise too freely about humanities research data. Nevertheless, one can tentatively identify distinctions between data compiled for humanities research and that generated in the course of scientific investigation. Firstly, humanities data tends to be gathered from existing sources rather than created from scratch, with the possible exception of some linguistic data gathered under ‘laboratory’ conditions. The diverse nature of the sources from which humanities researchers gather their information often results in data which is inconsistent, incomplete, or which relies to a degree on conscious selection and interpretation. All of these factors must be understood before the data can be properly analysed. However, whilst data in the humanities may not be as straightforwardly ‘reliable’ as much scientific data, this is not to say that it has less academic value. On the contrary, the intellectual value of humanities research data often has exceptional longevity, tending not to depreciate over time. A database of Roman cities is potentially of as much use to researchers in fifty years’ time as it is today, provided it is not rendered obsolete through technological change. Humanities scholarship often aggregates to a ‘life’s work’ body of research, with any given researcher often wishing to go back to old datasets in order to find new information.
The challenge faced by Sudamih and other JISC-funded research data infrastructure projects is to build the systems by which researchers can preserve their data so that they are not obsolete in fifty years’ time and can still be used both by the researcher who created them and potentially by others. This requires documentation, reliable long-term storage, potential migration to more modern data formats, and various other curation activities. It also places responsibilities on researchers themselves, to organise and structure their information so that it is clear and usable, and to consider the future of their data at the stage of its conception. Done well, data management should bring obvious benefits to the researcher who created the data, as well as potentially extending the usefulness of the data to others. Good data management maximises the value of the data.