The University of Oxford Research Data Management Survey of 2012 is now closed to new respondents and its findings are ready for analysis. The response to the survey has exceeded all expectations, with 314 Oxford researchers taking the time to complete it. This has been largely thanks to its assiduous promotion by Dr. Meriel Patrick, the IT Learning Programme, the University’s Research Services, and the many departmental administrators who circulated it to their respective mailing lists. As a result, we have a good spread of disciplinary coverage, and a rich seam of data to mine.
The survey asked about the data that researchers generate, how they organize it, their attitudes to sharing it, and whether they have been inspired to undertake new research after looking at data that others have shared in the past. It also served to benchmark awareness of centrally provided infrastructure and to inform service planning for the future.
25% of responses came from researchers working in the Humanities Division, 25% from those in the Mathematical, Physical and Life Sciences, 29% from the Medical Sciences, and 20% from the Social Sciences (a few respondents did not identify themselves with any of the four academic divisions).
Some disciplinary differences were predictable. Over 80% of researchers in the Humanities, for instance, conducted their research (and managed their data) as individuals. In the sciences it was far more common to work as part of a team. Those in the Medical Sciences were least likely to conduct research as individuals (just 10% of respondents) and most likely to manage their data as a team (34%). Even in the sciences, however, it was not uncommon for researchers working as part of a team to take personal responsibility for looking after their own data. In the Mathematical, Physical and Life Sciences, more researchers worked in this manner than in teams where data was looked after by the group. This perhaps goes some way towards explaining the sense of personal ownership that many researchers feel towards their data.
61% of survey respondents worked with textual data, 65% with numerical data, and 62% with statistical data. Perhaps surprisingly, more specialized data types were also quite widely collected. 43% of respondents had image data, 15% had audio data, and 14% worked with geospatial data.
76% of respondents used spreadsheets to store at least some of their data, whilst 32% used relational databases. Other forms of database were also quite widely used, with 23% of respondents reporting that they used ‘document’ databases, or other unstructured types of database. With my academic background in the Humanities, I was somewhat surprised to find that a greater proportion of respondents from the Mathematical, Physical and Life Sciences used XML mark-up than their more literary colleagues (16% versus 6%).
Encouragingly, 64% of respondents said they regarded research data management as ‘essential’ to their research (other options were ‘important’ (28%), ‘helpful up to a point’ (8%), and ‘not important’ (just one response)).
The benefits of sharing data were also highlighted by the survey, which revealed that 37% of respondents had been inspired to undertake new or additional research as a result of looking at data that had been shared in the past. Another 22% reported that looking at past shared data had been a factor in their decision to undertake new or additional research.
Despite this, attitudes towards sharing one’s own research data remain complex. 30% of respondents said that they would be happy to share all or most of their data without any restrictions (after an appropriate embargo period), but it was also clear that legal, ethical, and commercial concerns posed a significant obstacle to data sharing for many research projects. 24% of respondents reported that they would be happy to share all or most of their data provided that they were given a written explanation of how the data would be used, whilst 26% said that they would be happy to share all or most of their data only with colleagues or collaborators. In terms of disciplinary differences, those in the Mathematical, Physical and Life Sciences were the most willing and open with regard to data sharing, with 56% saying that they would share all or most of their data. Researchers in the Medical Sciences were, perhaps unsurprisingly, the most reluctant, with only 9% reporting this. This appears to be largely due to the comparatively greater proportion of researchers in the Medical Sciences whose ability to share data is restricted by ethical or privacy concerns.
Awareness of the data management requirements of the major funding bodies was relatively low, with 44% of respondents saying that they did not know whether the major research funder(s) in their field expected them to provide information in funding bids about how they would manage, preserve, and/or share the research data that they would create during the course of their research. This figure may not be entirely representative, however, as some survey respondents were relatively junior and had not been involved in writing the funding bid for their current research (32% of respondents working on externally funded research had not been involved in preparing the bid).
Responses relating to the University of Oxford’s existing data management infrastructure indicated that awareness was generally disappointingly low. This highlights the need for universities to promote their own services to their researchers, many of whom pay more attention to developments in their discipline than to those in their institution. Encouragingly, however, almost 1 in 4 had at least heard something about research data management from the University, despite it being a subject that we have only recently started trying to ‘embed’ within the institution. Also, more than 20% of researchers were aware of the new University Policy on Research Data Management, which has not yet been the target of any concerted promotion or publicity efforts. Hopefully, if we can run the survey again in subsequent years, we will see a healthy increase in awareness.
The anonymized summary results of the survey are now available from http://damaro.oucs.ox.ac.uk/docs/OxfordRDMsurvey2012_public.xlsx. The anonymized raw survey data can be requested by emailing firstname.lastname@example.org.