The ‘Data’ series of lunchtime talks was held in Michaelmas Term 2015 at IT Services, aimed at researchers from all disciplines who generate, gather, rearrange, or re-purpose data. The talks dealt with software, services, and techniques to help you make the most of that data. Covering data visualisation tools and projects, and aspects of data management from planning to re-use, these talks are intended to inspire, whilst also considering the practical requirements of research funders and issues surrounding data sharing. Join us to hear how others are doing things with data, both in Oxford and outside, and come along to tell us about what you are doing.
All talks took place at lunchtimes at IT Services, 7-19 Banbury Road, and all members of the University were welcome. Please follow the links below to find slides and (in some cases) videos for the sessions.
Data: Trunk to tail – linking ElEPHãTs through the Semantic Web
Kevin Page, e-Research Centre
HathiTrust offers millions of digitized titles; EEBO-TCP is a smaller well-defined collection of texts from 1473 to 1700. The Early English Print in HathiTrust (ElEPHãT) project bridges them using Linked Data as a basis for exploratory research utilizing the respective strengths of the two corpora.
‘Trunk to tail’ slides (link to PDF with Oxford single sign-on)
Video of the ElEPHãT project in action (link to a video file on Google Drive)
Data: The Yin and Yang of Data Management
Alex Hacker, Nuffield Department of Public Health
3rd week, Thursday 29 October, 12:30-13:30
Anyone who manages data will face individually reasonable requirements which nevertheless seem to conflict with each other. Alex Hacker presents four such intractable contradictions and their harmonious resolutions, illustrated with examples from his work on the China Kadoorie Biobank.
Data: Challenges in Dealing with Very Large Collections of Speech
John Coleman, Phonetics Laboratory
4th week, Thursday 5 November, 12:30-13:30
John Coleman will talk about how human languages are extraordinarily large, complex and variable and how until recently, our ability to analyse or model these has been profoundly challenged by a problem of scale. For it to become feasible to analyse its many dimensions of variation, far bigger data sources will be needed. Aggregation of existing speech corpora could be an initial step towards generating a resource that begins to be large enough to start to get to grips with some of the main sources of variation.
Data: Reproducible Research -Statistical Analysis with R using RStudio, GitHub and Shiny
Maja Založnik, Oxford Institute of Population Ageing
6th week, Thursday 19 November, 12:30-13:30
This talk will present practical, open source and integrated solutions to designing and streamlining your statistical analysis workflow using R within the RStudio environment, on the fly publishing of reports using knitr, interactive web graphics with Shiny as well as backup, collaboration and version control with GitHub.
Data: NATO Maritime Situation Awareness – Visual Analytics
Margaret Varga, Department of Oncology
9th week, Thursday 10 December, 12:30-13:30
The presentation will discuss the application of visual analytics to support maritime situation analysis: to provide analysts with increased situation awareness by helping overcome data and cognitive overload in the context of complex situations. The objective here is to detect anomalies such as illegal migrants and drug smuggling.