Thanks to Alex Dutton for helping with this post.
The OXCAP project is about collecting and distributing information about graduate training opportunities at the University of Oxford. This information is stored in the University’s Open Data Service (ODS) and is then used to drive a ‘course booking portal’ and to publicise the excellent training on offer at Oxford. (An earlier blog post describes how graduate training data moves around the University to make this possible.) SharePoint plays a key role in this project.
SharePoint is used in two distinct ways:
- as a data entry system: we have an InfoPath document which is used to capture all the relevant details about a course.
- as a document repository: there is a shared document library that houses XLSX (Excel) files describing courses offered by different training providers.
The Data Entry System
We have an InfoPath document library for courses whose details cannot be exported ‘automatically’ in an electronic format. The form is split into two parts: provider details and presentation details.
We ask for basic details of the unit providing the training, including a University-wide ‘unit identifier’, which is used as a unique key to identify the host department.
The second part of the form collects information about a single instance of the course – this may be a single presentation or a series of individual sessions.
We start by asking for basic information such as Title, Description (HTML) and a URL.
We then ask who can see and book on the course and collect booking URLs, dates and a venue.
We then ask for some sort of categorisation. JACS codes are not really applicable to graduate training, so we have developed our own set of skills, based on Vitae’s Researcher Development Framework. We have cut down the number of categories to make the tagging process more manageable.
Finally, we ask for booking details and information about the timings of individual sessions.
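The two parts of the form can be pictured as a small set of record types. The sketch below is only an illustration of that shape: the class and field names are hypothetical and are not taken from the actual InfoPath form.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List


@dataclass
class Provider:
    # University-wide unit identifier: the unique key for the host department.
    unit_id: str
    name: str


@dataclass
class Session:
    # One individual session within a presentation.
    start: datetime
    end: datetime
    venue: str


@dataclass
class Presentation:
    title: str
    description_html: str
    url: str
    audience: str      # who can see and book on the course
    booking_url: str
    skills: List[str] = field(default_factory=list)       # skills categories
    sessions: List[Session] = field(default_factory=list)


@dataclass
class CourseEntry:
    provider: Provider
    presentation: Presentation


def courses_by_unit(entries: List[CourseEntry]) -> Dict[str, List[CourseEntry]]:
    """Group course entries by the provider's unit identifier."""
    grouped: Dict[str, List[CourseEntry]] = {}
    for entry in entries:
        grouped.setdefault(entry.provider.unit_id, []).append(entry)
    return grouped
```

Keying everything on the unit identifier is what lets entries from the same department be gathered together, however the department name happens to be typed.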
Once data has been entered, we use the SharePoint Lists web service to enumerate the XML files that represent the courses, and use XSLT to turn these into XCRI-CAP, which is then stored as RDF within ODS.
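To give a feel for the transformation step, here is a minimal sketch in Python. It is not our XSLT: it uses the standard library's ElementTree instead, assumes the form XML has already been fetched, and works on a made-up input layout (`course`, `provider`, `unitId` and so on are hypothetical element names). The namespaces are the ones we understand XCRI-CAP 1.2 to use, and should be checked against the specification.

```python
import xml.etree.ElementTree as ET

# Namespaces as understood for XCRI-CAP 1.2 (an assumption; verify against the spec).
XCRI = "http://xcri.org/profiles/1.2/catalog"
DC = "http://purl.org/dc/elements/1.1/"
MLO = "http://purl.org/net/mlo"

for prefix, uri in (("xcri", XCRI), ("dc", DC), ("mlo", MLO)):
    ET.register_namespace(prefix, uri)

# Hypothetical form document; a real InfoPath file looks different.
FORM_XML = """<course>
  <provider>
    <unitId>exampleunit</unitId>
    <name>Example Department</name>
  </provider>
  <presentation>
    <title>Introduction to Research Data</title>
    <description>&lt;p&gt;A half-day workshop.&lt;/p&gt;</description>
    <url>https://example.org/courses/research-data</url>
  </presentation>
</course>"""


def q(namespace: str, tag: str) -> str:
    """Build a namespace-qualified tag name in ElementTree's {uri}tag form."""
    return "{%s}%s" % (namespace, tag)


def form_to_xcri(form_xml: str) -> ET.Element:
    """Map one form document onto a catalog/provider/course skeleton."""
    form = ET.fromstring(form_xml)
    catalog = ET.Element(q(XCRI, "catalog"))

    provider = ET.SubElement(catalog, q(XCRI, "provider"))
    ET.SubElement(provider, q(DC, "identifier")).text = form.findtext("provider/unitId")
    ET.SubElement(provider, q(DC, "title")).text = form.findtext("provider/name")

    course = ET.SubElement(provider, q(XCRI, "course"))
    ET.SubElement(course, q(DC, "title")).text = form.findtext("presentation/title")
    ET.SubElement(course, q(DC, "description")).text = form.findtext("presentation/description")
    ET.SubElement(course, q(MLO, "url")).text = form.findtext("presentation/url")
    return catalog


if __name__ == "__main__":
    print(ET.tostring(form_to_xcri(FORM_XML), encoding="unicode"))
```

The sketch leaves out presentations, dates, venues and categories; the point is only that each form field is copied into its slot in the provider and course hierarchy.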
The Document Repository
Data about other courses are provided to us as Excel (XLSX) spreadsheets, which can be emailed (as attachments) to a shared document library within SharePoint. The format of the XLSX files is outlined in a document that gives comprehensive guidance for training providers.
The XLSX files (one per training provider) are retrieved from SharePoint, transformed into RDF (Resource Description Framework) by an XSLT file and then stored within ODS (see https://github.com/ox-it/tei-spreadsheet). For other projects we’re investigating using SharePoint lists directly, and have written a Python library and command-line tool for extracting data from SharePoint (see https://github.com/ox-it/python-sharepoint).
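An XSLT-based approach is possible because an XLSX file is a ZIP archive of XML documents. The sketch below shows the idea in Python using only the standard library. It is not the tei-spreadsheet code and does not use python-sharepoint. It assumes a single worksheet whose first row is a header row containing an `id` column, and the base URI and the `terms/` path are invented for illustration.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"m": MAIN}


def read_rows(xlsx_bytes, sheet_path="xl/worksheets/sheet1.xml"):
    """Return the rows of one worksheet as lists of cell strings.

    Simplifications: empty cells omitted from the XML are not padded,
    and inline strings and dates are not handled.
    """
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as archive:
        shared = []
        if "xl/sharedStrings.xml" in archive.namelist():
            sst = ET.fromstring(archive.read("xl/sharedStrings.xml"))
            # Each <si> is one shared string; join its text runs.
            shared = [
                "".join(t.text or "" for t in si.iter("{%s}t" % MAIN))
                for si in sst.findall("m:si", NS)
            ]
        sheet = ET.fromstring(archive.read(sheet_path))

    rows = []
    for row in sheet.findall("m:sheetData/m:row", NS):
        cells = []
        for cell in row.findall("m:c", NS):
            value = cell.findtext("m:v", default="", namespaces=NS)
            if cell.get("t") == "s":
                # Shared-string cells store an index into the string table.
                value = shared[int(value)]
            cells.append(value)
        rows.append(cells)
    return rows


def rows_to_triples(rows, base_uri):
    """Turn header-plus-records rows into (subject, predicate, object) tuples."""
    header, *records = rows
    triples = []
    for record in records:
        fields = dict(zip(header, record))
        subject = base_uri + fields["id"]
        for name, value in fields.items():
            if name != "id" and value:
                triples.append((subject, base_uri + "terms/" + name, value))
    return triples
```

Given the bytes of a provider's spreadsheet, `rows_to_triples(read_rows(data), "https://example.org/course/")` yields one triple per non-empty cell, which is roughly the shape of data that ends up in a triple store.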
Once the data is stored within ODS, it can be transformed into XCRI-CAP using the same XSLT as for all other course data.
We feel that SharePoint, as part of our existing infrastructure, provides a good way to maintain our data where it is not already held in existing systems.