Last week I attended a face-to-face meeting of almost all the staff now working directly on the ADONIS project, a fairly unusual event (I think this was the first one in the current development cycle) since the project now has many people working in different locations around France. Probably as a punishment for typing too noisily during the meeting, I was asked to draft a brief report on the meeting for the ADONIS website, which seems a good pretext for me to write up a blog posting here.
ADONIS is a TGE (Très Grand Equipement) directly financed by the Ministry of Higher Education (MESR) and reporting to the CNRS, the National Research Council. (Inevitably these two bodies occasionally have different points of view and priorities, which makes life, shall we say, interesting). ADONIS is so far the only TGE to have been specifically charged with responsibility for infrastructural support of the humanities and social sciences (SHS) and has defined an ambitious programme of work, currently underway. The TGE is co-ordinated by a small team based in Paris but is highly distributed, with most of its key activities and services being run by people attached to other labs and service organizations around the country, some of whom had never met before this meeting, though they feature on the official staff list.
After a set of mutual introductions around the table, Yannick Maignien (Director) and Richard Walter (Assistant Director) began by sketching out the TGE’s overall structure and objectives, placing them in the international and national contexts respectively. The mission of ADONIS is to improve the quality and efficiency of French research in SHS by facilitating better access to shared resources, by promoting best practice, and by defining and implementing the key infrastructural resources and services needed for the humanities research environment. These include, for example, provision of archival resources; development of an intelligent search engine for existing digital resources; training on the use of specific relevant technologies; and promotion of open access and other digital publishing methods.
Administratively speaking, ADONIS is a “Unité Propre de Service” (Specific Service Unit) attached to the CNRS, and funded, like other such units, on a four-year rolling programme. It has a Steering Committee (comité de pilotage), chaired by Michel Spiro, with representatives of the Ministry and the CNRS Institute for Human and Social Sciences, and other interested parties. It also has an Advisory Board (comité scientifique) with a dozen or so distinguished members (e.g. Simon Hodson from JISC, Stephan Gradman from Humboldt University, Francoise Genova from the Strasbourg Observatory).
Since 2005, the CNRS has also funded a number of centres de ressources (National Resource Centres). These are subject-specific centres attached to one or more existing labos (research units) and charged with the task of sharing their expertise and their services with other research units. In the hard sciences, typically, research is carried out at one of a small number of large labos; in the Humanities, by contrast, there is a very large number of small units. Hence the importance of developing and sharing solutions to common problems, and the important role that ADONIS has with regard to the centres. Just to make life even more interesting, there are other relevant national organizations, notably the CNRS network of professional staff (collectively known in the CNRS as ITA: Ingénieurs, Techniciens, Administrateurs), and the completely different regional network of Maisons des Sciences de l’Homme, which provide support services to the Universities rather than via the CNRS.
Stephane Pouyllau, the third member of the central ADONIS team to speak, gave an overview of the Digital Humanities (sciences numériques) à la française, a set of topics which overlaps significantly but not entirely with the concerns and activities promoted under that badge elsewhere in the world. In France, it is preservation of, and access to, digital resources of all kinds which are the major concerns and which constitute the “digital turn” (le tournant du numérique); the major threats are seen to be such things as loss of data, dispersion of expertise, duplication of effort, solutions which do not scale, and lack of international visibility and recognition. The need for skills in helping non-technical experts gain confidence in technical areas is largely unrecognized by existing professional training for computing support staff, with consequent problems of communication. And of course, in France the word “science” includes the study of the Humanities as well as the study of “les sciences dures”.
ADONIS aims to address these problems in several ways. It will provide a socle de services (core set of services), including such key activities as archival services and cataloguing of resources needed by many labos, and, from the end of this year, it will also be offering a sophisticated search engine called Isidore. Isidore (Integrated Service for Indexing the Data of Research and Education, or something like that) will crawl and index a wide variety of existing data sources, analysing a variety of standard metadata formats and protocols (OAI-PMH, RSS, Sitemap, Z39.50…) to access and merge information into a single RDF system with its own SPARQL endpoint. This will entail working closely with service providers, but the need to involve them in the project should mean that the quality, relevance, and accuracy of data provided will be much higher than is currently available from (e.g.) Google.
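For the technically curious, here is roughly what "harvest metadata and merge it into RDF" can look like at its very smallest. This is only a sketch under my own assumptions, not a description of how Isidore is actually built: it takes an OAI-PMH ListRecords response carrying simple Dublin Core, and turns each field into a subject/predicate/object triple serialised as N-Triples. The sample record, the identifier oai:example.org:1 and the function names are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
}
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical ListRecords response; a real one would be fetched over HTTP.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Dialect survey recordings</dc:title>
          <dc:creator>A. Example</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def record_to_triples(record):
    """Turn one OAI-PMH record into (subject, predicate, object) tuples."""
    subject = record.findtext("oai:header/oai:identifier", namespaces=NS)
    triples = []
    for field in record.iterfind("oai:metadata/oai_dc:dc/*", NS):
        # Keep only Dublin Core elements that actually carry text.
        if field.tag.startswith("{" + DC + "}") and field.text:
            predicate = DC + field.tag.split("}", 1)[1]
            triples.append((subject, predicate, field.text.strip()))
    return triples


def harvest(xml_text):
    """Collect the triples for every record in a ListRecords response."""
    root = ET.fromstring(xml_text)
    triples = []
    for record in root.iterfind(".//oai:record", NS):
        triples.extend(record_to_triples(record))
    return triples


def to_ntriples(triples):
    """Serialise triples as N-Triples, treating every object as a literal."""
    lines = []
    for s, p, o in triples:
        literal = o.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        lines.append(f'<{s}> <{p}> "{literal}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_ntriples(harvest(SAMPLE)))
```

The point of going through triples is that records from quite different sources end up in the same shape, which is what makes a single SPARQL endpoint over all of them plausible. A real harvester would also need to deal with resumption tokens, deleted records and the other formats on the list.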
Returning to the topic of the National Resource Centres, Richard gave a brief overview of the activities and responsibilities of each, as currently configured, noting in passing that the ANR (Agence Nationale de la Recherche; the main French research funding agency) had funded dozens of digital data creation projects in the past but had no policies in place to ensure their preservation or their continued access. The skill sets available within the five existing centres include archaeology and 3-D modelling, modelling in social science, iconographic and visual indexing, and linguistic and terminological analysis, as well as appropriate technologies for the digitization and encoding of manuscript or spoken materials.
Stephane then presented the recently completed pilot project on long term archiving, for which the CRDO (Centre de ressources pour la description de l’oral: the National Resource Centre concerned with spoken data) had served as guinea pig. Financed by ADONIS, this project had involved the CRDO as source of the original data and manager of access to its archived form, and also the CINES (National Computing Centre for Higher Education), which had managed the whole archival process, conformant to Open Archival Information System norms, and the CC-IN2P3 (Computing Centre of the National Institute for Nuclear and Particle Physics), which had provided the computing resources (a Fedora database) for the resulting system. The project had thus demonstrated the viability of ADONIS’ distributed approach to the resourcing of such services, and shown the value of an “e-Research” mode of operation: the storage facilities of CC-IN2P3 are able to preserve the results of fieldwork on surviving old French dialects just as well as the experimental data resulting from nuclear reactions, while the CINES’ expertise in the international standard methodology for creating and managing long term metadata is equally applicable to either.
Thus encouraged, we broke for a pizza lunch (see photo) at Casa Valentino, rue St Jacques.
After lunch, we heard more technical detail about the archival experiment from Pierre-Yves Jalud, who is responsible for managing the project at CC-IN2P3. Pierre-Yves had of necessity become expert in the use of Fedora Commons as a repository management system; there was some discussion as to the merits of this open source solution as opposed to DSpace, which seems to be its closest rival. Pierre-Yves noted that the latter did not support iRODS, the de facto software platform for grid applications. He also cited Carl Lagoze’s article from 2006 as a foundational text for the definition of what a digital library should be.
His colleague Huân Thebault spoke in more detail about the authentication and authorisation solutions adopted for the project. RENATER (the French equivalent of JANET) already provides a Shibboleth-based national network of trust, so that the credentials from any RENATER site can be used to log in to any of the others, including the CC-IN2P3. However, this clearly needs to be complemented by something else for users coming from outside RENATER. The solution adopted is a hybrid architecture combining Shibboleth with OpenSSO, which is used to authenticate “foreign” users. The drawback, from some points of view, is that they have to maintain their own LDAP directory to hold authorisation data, but they would have to do this in any case.
Jean-Baptiste Génicot described some of the technical problems behind implementing efficient Z39.50-based access where the various repositories concerned have wildly varying notions both of what data items should be made available and of what technical infrastructure should be used to deliver them. His solutions used the existing SOAP library for PHP to handle data delivered via WSDL, derived from legacy formats such as BIBLIOML and MARCXML and expressed in MODS 3.3.
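To give a flavour of what deriving MODS from a legacy format involves, here is a very small sketch of my own (in Python rather than the PHP and SOAP actually used, and in no way Jean-Baptiste's code). It assumes a MARCXML record and copies just two fields across, following the usual correspondence of MARC 245$a to a MODS title and 100$a to a personal name. The sample record and the function names are made up for the purpose.

```python
import xml.etree.ElementTree as ET

# Namespaces published by the Library of Congress for MARCXML and MODS.
MARC = "http://www.loc.gov/MARC21/slim"
MODS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS)

# Hypothetical MARCXML record, invented for this example.
SAMPLE_MARC = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Example, Anne</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Atlas of an imaginary dialect</subfield>
  </datafield>
</record>"""


def subfield(record, tag, code):
    """Return the text of the first matching MARC subfield, or None."""
    path = f"m:datafield[@tag='{tag}']/m:subfield[@code='{code}']"
    return record.findtext(path, namespaces={"m": MARC})


def marc_to_mods(marc_xml):
    """Build a minimal MODS element from one MARCXML record."""
    record = ET.fromstring(marc_xml)
    mods = ET.Element(f"{{{MODS}}}mods", {"version": "3.3"})

    title = subfield(record, "245", "a")
    if title:
        info = ET.SubElement(mods, f"{{{MODS}}}titleInfo")
        ET.SubElement(info, f"{{{MODS}}}title").text = title.strip()

    author = subfield(record, "100", "a")
    if author:
        name = ET.SubElement(mods, f"{{{MODS}}}name", {"type": "personal"})
        ET.SubElement(name, f"{{{MODS}}}namePart").text = author.strip()

    return mods


if __name__ == "__main__":
    print(ET.tostring(marc_to_mods(SAMPLE_MARC), encoding="unicode"))
```

Two fields are the easy part, of course; the wildly varying notions mentioned above show up as soon as repositories disagree about which fields they fill in at all.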
The CCSD (Centre for Direct Scientific Communication) is one of the key bibliographic service providers in France, responsible amongst many other things for HAL: HyperArticles Online, the major open access archive of French research papers and resources. It also hosts another ADONIS-supported project, the Open Archive for Photographs and Scientific Images. Philippe Correia and Loïc Comparet reported on a new service under development there called Scienceconfs which (when it opens) will provide a full range of conference management services, from announcements (already well served by e.g. calenda.org) through reviewing and programme planning to proceedings manufacture. The project is still under development and there is good scope for consultation and review.
The CLEO (Centre for open electronic publishing), also supported by ADONIS, has established itself as a key academic publisher in France and beyond, with several hundred journal titles published through its portal revues.org and a raft of other complementary services. It was represented at our meeting by Andréa Pirastru, a new recruit, who talked about his experience in working with Drupal, the content management system on which revues.org depends.
As I noted above, ADONIS also has an important role outside the hexagon, as the sole French contributor to the European Research Roadmap for infrastructural support in the Humanities. Britta Moehring described some current activities in this connexion: notably the elaboration of a business plan and organizational structure for DARIAH, the EU-funded project which is supposed to be defining a Digital Research Infrastructure for the Arts and Humanities at a European level. The model that is emerging is, appropriately, a highly distributed one, in which a number of essential “competences” are identified, and then provided by possibly many partners acting in collaboration. This is being worked out with other DARIAH partners, notably the Max Planck Institute, the University of Göttingen, King’s College London, and DANS (Data Archiving and Networked Services), the Dutch project leaders. Further collaboration is to be anticipated with other key infrastructural initiatives, such as CLARIN (in which OUCS is also represented, by the way) and CESSDA.
ADONIS is well placed to do this work, since it faces exactly the same issues at the local (i.e. national) level. No single institution can provide all the components of an infrastructure of the kind needed, whether for financial or skill set limitations; organization and maintenance of productive partnerships and distribution of specialist services seem the only way forward.
My own brief intervention came at the end of a long day, so I kept it short. I commented on “internationalisation” aspects and activities of ADONIS, some of which I have already touched on in this report. As well as DARIAH, and knowledge transfer with the non-francophone Digital Humanities community, I suggested that work with the Text Encoding Initiative was also an important component of the project (well, I would, wouldn’t I). I described briefly the TEI Demonstrator project in which we are collaborating closely with the Max Planck Digital Library and noted its synergy with the development of the Isidore platform. But mostly I showed the following nice picture of a Virtual Research Environment, as envisaged in France at the start of the 20th century. ADONIS is the box on the right, and its team is the hard working boy turning the handle.