As part of the Sudamih Project’s requirements gathering exercise, we asked a group of researchers to complete a questionnaire based on the Data Audit Framework.
The Data Audit Framework (which is in the process of being renamed the Data Asset Framework) was developed by HATII at the University of Glasgow in association with the Digital Curation Centre. It’s intended as a tool to help higher education institutions take an inventory of their research data assets, with a view to ensuring effective preservation and accessibility. Within the confines of the Sudamih Project, a complete data audit was impractical, so our questionnaire was instead based on just the third stage of the DAF methodology, which is designed to provide detailed information about individual data assets.
Although we were working with a small sample, we nevertheless got some interesting results. The questionnaire answers highlighted two features common to many humanities datasets. First, they can frequently be expanded almost indefinitely, and second, the data is rarely of a type that goes out of date.
This means that there is huge potential for the reuse of humanities datasets, both by the researchers who created them and (if the creators are willing to make the data public) by others: a database compiled for one project may often form a useful starting point for another in a similar area. This in turn emphasizes the need for stable long-term curation infrastructure for humanities data.
A second area of interest was the cost of producing datasets. A number of our respondents noted that the chief expense in their projects had been their own time, but they were generally wary of putting a specific price tag on this. The issue is complicated by the fact that a humanities research project may produce both data assets and a book or thesis, and it is often hard to say how the total costs of the project should be apportioned between them. However, the answers gave a general impression that some humanities scholars may be inclined to undervalue their own time, and hence perhaps the data resources they are producing – despite the potential for long-term usefulness of humanities data assets noted above.
These findings will feed into the next stages of the Sudamih Project, as we begin to think in more detail about the provision of a database service and training for researchers.
A full report on the use of the DAF within the Sudamih Project is available from the project outputs page of the Sudamih website.