Researchers deal with data. This could be ancient texts, Excel spreadsheets, Word documents, databases, web pages, statistics, images, or reports from equipment; whatever the form, researchers are usually working with data of some sort. When creating new data of this kind, researchers may need help deciding how to structure it and its relationships. The result may take the form of a database, an XML schema, or a set of relationships between files or other resources. Understanding the requirements for this data and documenting them is data modelling.
In other cases researchers already understand their data and its relationships, but need to change its structure to enable other uses. This may be to import it into a different piece of software, or to update its format to the latest standards without loss of content. This is data migration.
In addition to providing specialist advice on research data management, the Research Support team in Academic IT Services can provide specialist support in both data modelling and the migration and conversion of research data.
We can advise on how to:
- organise your data and its relationships
- design a database
- select open international standards for your data
- choose appropriate metadata vocabularies
- find possible tools for data conversion
- avoid potential pitfalls in data structures
- fix bugs in your existing data transformations
We can help by:
- discussing your data with you
- creating and documenting a data model
- reorganising the structure of your data
- de-duplicating records in large datasets
- cleaning data based on a set of defined rules
- converting your data into different formats
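As a rough illustration of what rule-based cleaning and de-duplication can look like in practice, the sketch below applies a small set of per-field rules and then removes records that share the same key. The field names, the rules, and the sample records are all hypothetical; real projects would agree their own rules and keys with the researchers.

```python
import re

# Hypothetical cleaning rules: each maps a field name to a normalising function.
CLEANING_RULES = {
    "name": lambda v: re.sub(r"\s+", " ", v).strip().title(),
    "email": lambda v: v.strip().lower(),
    "year": lambda v: int(v) if str(v).strip().isdigit() else None,
}


def clean_record(record):
    """Return a copy of the record with each rule applied to its field."""
    cleaned = dict(record)
    for field, rule in CLEANING_RULES.items():
        if field in cleaned and cleaned[field] is not None:
            cleaned[field] = rule(cleaned[field])
    return cleaned


def deduplicate(records, key_fields):
    """Keep the first record seen for each combination of key fields."""
    seen = set()
    unique = []
    for record in records:
        key = tuple(record.get(f) for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Made-up sample data: the first two rows describe the same person.
raw = [
    {"name": "  ada   lovelace ", "email": "ADA@Example.org ", "year": "1843"},
    {"name": "Ada Lovelace", "email": "ada@example.org", "year": "1843"},
    {"name": "charles babbage", "email": "cb@example.org", "year": "unknown"},
]

cleaned = [clean_record(r) for r in raw]
unique = deduplicate(cleaned, key_fields=("name", "email"))
```

Cleaning before de-duplicating matters here: the first two sample records only become recognisable as duplicates once whitespace and capitalisation have been normalised.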
Data Modelling
- LEAP (Livingstone-online Enrichment and Access Project) is funded by an NEH Humanities Collections and Reference Resources grant to support ongoing enrichment of and access to Livingstone Online. We not only helped migrate their resources but also created a TEI P5 ODD customisation for their legacy and future materials, which both documents their use of the TEI Guidelines and constrains the options available to their encoders. http://blogs.it.ox.ac.uk/jamesc/projects/leap/
- The REED project, an international organisation centred at the University of Toronto, transcribes, edits, and publishes Records of Early English Drama. We are helping them understand their existing data model (both print and digital) and creating a TEI ODD customisation file which acts as a meta-schema to document this data model. http://blogs.it.ox.ac.uk/jamesc/projects/digital-reed-schema/
Data Migration and Conversion
- The Text Creation Partnership is an international consortium producing full-text transcriptions of early texts. The Research Support team played a crucial role in transforming these from TCP markup to the agreed TEI P5 XML standard and making them publicly available, both through the Oxford Text Archive’s TCP Collection and as underlying source files on GitHub.
- The DIVA project, run by David Zeitlyn in the School of Anthropology and Museum Ethnography, is looking at academic lineages and wanted to convert the DIVA dataset, in MODS XML format, to a set of interlinked relational database tables. http://blogs.it.ox.ac.uk/jamesc/projects/diva/
- The Domesday Text pilot project took bespoke legacy XML created by David Roffe and converted it to TEI P5 XML (and from there to basic HTML) as part of a demonstration of the possibilities of this project. http://blogs.it.ox.ac.uk/jamesc/projects/domesday-text-pilot/
- Data for the Determinants of International Migration project was collected in separate spreadsheets by country. Research Support helped convert the data into a single normalised relational database and imported it into ORDS, our research database hosting service.
- The Professions in Nineteenth Century Britain project needed to collect two sets of data offline and have them combined into a single online database in ORDS. Research Support advised on the approach for uniquely keying records, and assisted in combining the data into a single database.
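To give a flavour of the spreadsheet-to-database conversions described above, here is a minimal sketch that loads per-country sheets into two normalised tables using Python's built-in sqlite3 module. The table names, column names, and figures are invented for illustration and are not taken from any of the projects listed; ORDS itself is not involved in this sketch.

```python
import sqlite3

# Hypothetical per-country sheets, as they might be read from separate spreadsheets.
sheets = {
    "France": [{"year": 2000, "migrants": 120}, {"year": 2001, "migrants": 135}],
    "Ghana": [{"year": 2000, "migrants": 80}],
}

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE country (
        country_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE observation (
        observation_id INTEGER PRIMARY KEY,
        country_id INTEGER NOT NULL REFERENCES country(country_id),
        year INTEGER NOT NULL,
        migrants INTEGER NOT NULL,
        UNIQUE (country_id, year)  -- one row per country per year
    );
""")

# Each country is stored once; its rows refer to it by key instead of repeating the name.
for name, rows in sheets.items():
    cur = conn.execute("INSERT INTO country (name) VALUES (?)", (name,))
    country_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO observation (country_id, year, migrants) VALUES (?, ?, ?)",
        [(country_id, r["year"], r["migrants"]) for r in rows],
    )
conn.commit()

# Once combined, the data can be queried across countries in a single statement.
totals = conn.execute("""
    SELECT c.name, SUM(o.migrants)
    FROM observation o JOIN country c USING (country_id)
    GROUP BY c.name ORDER BY c.name
""").fetchall()
```

The composite UNIQUE constraint is one possible way to key records so that the same observation cannot be loaded twice. A similar composite key (for example, a source label plus a locally assigned number) is one option when records are collected offline in separate sets and combined later, though the right approach depends on the data.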
If you have problems with your research data and want help structuring, migrating, or converting it, then get in touch. The data modelling and migration specialism in the Academic IT Research Support Team is provided by Dr James Cummings and others. Contact them directly or via firstname.lastname@example.org with questions or to set up an initial meeting.