In a recent informal meeting involving various members of the CLARIN and other infrastructure initiatives, we had an open, frank and “off the record” discussion about successes and failures so far, and plans for the future. In preparation for the meeting, and to get the discussions going, we were asked to think of five points in response to each of three questions. I’m happy to go “on the record” with mine here!
What were your original impulses and dreams [when CLARIN planning started around 2006]?
1. To build an Arts and Humanities Data Service for Europe, on the model of the AHDS in the UK, to support digital work in the literary and linguistic subject areas, and to link with similar initiatives then emerging, e.g. at the CNRS in France.
2. To promote and integrate Central and East European researchers, resources and languages, continuing the work of the TELRI project in the previous period.
3. To build new European networks, built on transparency, openness and a real desire to engage with, support and improve research, to replace failed European initiatives which were sometimes built on careerism, cronyism and corruption.
4. To move the focus of language resource & tool creators (especially computational linguists) towards the requirements of Humanities researchers, making it easier for users with little technical support to do simple yet powerful things with key resources.
5. To facilitate the participation of literary and linguistic disciplines in the emerging e-Science agenda.
What are the most important successes and failures so far?
1. Success: the initiative is almost pan-European, although some key countries are not involved or not fully integrated (UK, Italy), and a very few are not involved at all (Ireland, Switzerland); the integration of former TELRI partners from central and eastern Europe was successfully achieved.
2. Success: we have succeeded in getting enough funding from national funders to make CLARIN happen!
3. Partial failure: we’ve only had fairly small-scale engagement so far of scholars to elicit detailed requirements and to develop use cases.
4. Partial failure: we haven’t made the total shift of focus of the CLARIN community away from traditional concerns (own tools and research) to production infrastructure services for the humanities and social sciences.
5. Partial failure: we have not yet created a standards-oriented ecosystem for resource and tool creators to enable them to contribute to sustainable production services. To put it another way, there is still no simple answer to the question “How do I make CLARIN-conformant resources?” I hope that the forthcoming Reference Manual will at least partially solve this problem.
What are the top priorities for future work?
1. We need to work out ways to lobby for and secure funding, in a situation where, in the Humanities, there is a lack of a critical mass of researchers (in any given discipline) who want research computing infrastructure, or who see it as a top priority. This means that there is a lack of an effective lobby group of influential scholars in most forums. This is one of the disadvantages of the cross-disciplinary nature of linguistics and the language resources and tools field.
2. We need to deliver something urgently to show the relevant communities that we can do it, and to give them a clearer idea of what we intend to do. Access and authentication infrastructure (AAI) is the key to delivering any kind of production service which can show an end-to-end use case, so we should make solutions in this area a logical priority.
3. Where is the data processing going to take place, who is going to pay for it, and how will we do the accounting? We urgently need to make progress towards solutions here as well if we are to create production-quality services.
4. Humanities and social sciences research has global connections. How will we accommodate users and service providers outside of our AAI domain? As CLARIN starts to rely on national funding, there is an increased danger of two-speed progress, with some countries and communities who are currently engaged being pushed out.
5. What will the platforms for users be, and who is going to make the user interfaces? Are we going to be able to overcome fragmentation and ‘silo-building’ – can we offer a good user experience while still allowing flexibility and connectivity? If so, how, and when?