The Sudamih Project has run its course and the final report is now available for your perusal. Although we’ve been rather quiet on the blog front in recent months, this reflects the amount of work we’ve been doing on the project rather than the lack of it. Since the start of the year we’ve written two full three-hour face-to-face courses and taught them to a varied class of humanities scholars, we’ve added a whole suite of data management guidance and tips to Oxford University’s Research Skills Toolkit, and the Database as a Service has come on in leaps and bounds. We’ve also had a go at enumerating the many benefits of the work we’ve done (which is relatively straightforward), calculating the ongoing costs of the training (again, not too difficult), and putting together a business case considering the future returns on investment (much trickier). All this is detailed in the final report, along with various lessons learnt and conclusions, both for JISC and for anyone thinking of establishing similar infrastructure at other universities. You can find the project outputs, including de-Oxfordized versions of the training materials designed to be re-used at other institutions, at the Sudamih Project Outputs webpage.
Some key findings include:
- The intellectual value of humanities datasets tends not to depreciate over time.
- There is a need in the humanities for very-long-term data sustainability solutions and cost models designed to deal with effectively permanent storage and access.
- Most researchers are willing in principle to share their data with others, but in practice do not regularly do so, for a variety of reasons. In the humanities, issues surrounding the incompleteness of the original data, or the layer of interpretation often required to render it consistent, can lead to reluctance to share, as researchers worry that their ‘processed’ data may be misinterpreted by others.
- Researchers need help to discover the most appropriate software tools to deal with specific research challenges.
- Researchers should be trained in organizational principles and strategies to enable them to better manage their information and sources.
- Researchers do not understand the terminology used by data librarians. Care must be taken to avoid technical jargon and use unambiguous but straightforward terminology when talking about data management.
- There is a significant amount of confusion over the ownership of research data. This is exacerbated by complex situations in which multiple people or organizations may have different claims on the same resource.
- Different academic departments and institutional service providers should work together to understand who should be responsible for implementing, and sustaining, various aspects of data management training.
- Data management training can have a large positive impact in terms of long-term cost savings relative to the near-term costs of running and maintaining courses and learning materials.
Although Sudamih is now at an end, our efforts to develop a research data management infrastructure at Oxford are still very much ongoing. Our next task is to take the pilot Database as a Service and turn it into a full production service, with polished, intuitive interfaces, secure storage, a user manual, and accompanying training materials. We are also transforming the DaaS into a service that can be provided via the cloud, maximizing its cost efficiency. The new project is called VIDaaS (Virtual Infrastructure with Database as a Service), and you can find out more about it here.