SPINDLE: Increasing OER discoverability by improved keyword metadata via automatic speech to text transcription.
A summary of the project, using the words of the voice-over from the SPINDLE overview video.
1. Aim – Generate keywords automatically from recorded lectures
2. Spindle was funded by JISC through the “Open Educational Resources – Rapid Innovation” strand. – http://www.jisc.ac.uk/whatwedo/programmes/ukoer3/rapidinnovation.aspx
3. Spindle was a technical project whose key objective was to explore generating cataloguing keywords from recorded lectures.
4. Spindle reviewed the accuracy of “speech to text” tools available to media producers for automatically generating a text transcript from a recording file.
5. Spindle created a program that automatically filters the uncorrected transcript to a set of statistically interesting keywords. The program analyses the lecturer’s words and compares them with the British National Corpus of Spoken Words.
Better keywords improve the discoverability of open content!
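The comparison in point 5 can be sketched roughly as follows. This is a minimal illustration, not the Spindle code: it assumes a log-likelihood "keyness" score, which is one common way to find words that are unusually frequent relative to a reference corpus, whereas the summary does not say which statistic Spindle used. The function names and the tiny reference counts are hypothetical stand-ins for real British National Corpus frequencies.

```python
import math
from collections import Counter


def log_likelihood(a, b, total_a, total_b):
    """Log-likelihood keyness for a word seen `a` times in the transcript
    (of `total_a` words) and `b` times in the reference (of `total_b` words)."""
    expected_a = total_a * (a + b) / (total_a + total_b)
    expected_b = total_b * (a + b) / (total_a + total_b)
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected_a)
    if b > 0:
        score += b * math.log(b / expected_b)
    return 2.0 * score


def extract_keywords(transcript_words, reference_counts, reference_total, top_n=5):
    """Return the `top_n` words most over-represented in the transcript.

    `reference_counts` is a hypothetical word -> count mapping standing in
    for a real reference corpus frequency list.
    """
    counts = Counter(word.lower() for word in transcript_words)
    total = sum(counts.values())
    scored = []
    for word, a in counts.items():
        b = reference_counts.get(word, 0)
        # Keep only words that are relatively more frequent in the lecture.
        if a / total > b / reference_total:
            scored.append((log_likelihood(a, b, total, reference_total), word))
    scored.sort(reverse=True)
    return [word for _, word in scored[:top_n]]


# Toy data: invented counts, for illustration only.
transcript = "the quantum state of the quantum system is the state".split()
reference = {"the": 60000, "of": 30000, "is": 10000, "state": 50, "system": 100}
keywords = extract_keywords(transcript, reference, 1_000_000, top_n=3)
```

Because the score depends only on word counts, it tolerates an uncorrected transcript reasonably well: occasional recognition errors rarely repeat often enough to outrank the lecture's real subject terms.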
6. Spindle went much further than the initial plan and created a “captioning” toolset to help media producers catalogue their media
With this toolkit, a media service can now:
– batch process recordings to create transcripts automatically (using the free toolset CMU Sphinx)
– generate keywords
– correct any transcript errors while listening to the media
– and export into time-coded captioning and archive formats
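As a rough illustration of the export step, the sketch below writes time-coded transcript segments as SubRip (SRT) captions. SRT is chosen here only because it is a common time-coded format; the summary does not say which formats the toolset exports. The `(start, end, text)` segment tuples and both function names are assumptions made for this example.

```python
def format_timestamp(seconds):
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments):
    """Render (start_seconds, end_seconds, text) tuples as SRT caption text."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text}\n"
        )
    # A blank line separates consecutive caption blocks.
    return "\n".join(blocks)


# Hypothetical corrected transcript segments.
segments = [
    (0.0, 2.5, "Welcome to the lecture."),
    (2.5, 5.0, "Today we discuss keywords."),
]
srt_text = to_srt(segments)
```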
7. The Spindle captioning toolset was written in Python using the Django framework
8. The Spindle code is publicly available for re-use in an online repository under an open source licence – [ GitHub code repository – https://github.com/ox-it/spindle-code hashtag #spindle #OERRI ]
9. All reports and further information are available through the Spindle blog – http://blogs.it.ox.ac.uk/openspires/category/spindle – hashtag #spindle
Watch the SPINDLE 2 minute overview video using the above text as the voice-over at: