Many OER sites exist, and some of them offer services that can be used to generate aggregations. No OER site yet maintains a collection of OER substantial enough to render other collections redundant.
The primary issues in creating aggregations from “aggregators” are as follows.
Aggregation formats supported
No standard API exists across repositories that would give an aggregation creator a single approach to aggregation. The best available approach (and the only one viable across more than one aggregator) is to use RSS feeds.
|Site||API available||Full text search available||RSS Feed (of search results) available||All content licensed for reuse|
As such, the only aggregators that proved suitable for searching (for this project) are OER Commons, MERLOT and Xpert. Using Jorum’s keyword API would mean matching its keyword list to our collection and then building a proprietary system just to parse that content. In terms of flexibility and rapid development, using RSS feeds across MERLOT, Xpert and OER Commons is the quickest and simplest approach.
More often than not, OER repositories and contributor sites have supplied their resources to multiple aggregators. When creating aggregations across these aggregators, we are therefore likely to encounter the same resource served from different locations. OER Commons and MERLOT both use placeholder URLs, whereas Xpert provides a direct link to the resource. Placeholders (e.g. http://www.oercommons.org/courses/structural-materials-in-cells/view) allow the site to provide other services, but in the RSS feed these URLs can cause problems: the same resource appears under a different URL in each aggregator, so duplicates cannot be detected by comparing links alone.
The duplicate URL problem can only be resolved by comparing other elements of the metadata that are provided in the RSS feed. Below are examples of the metadata from each feed (please note that these are not for the same OER; they are included to demonstrate the differences in metadata).
OER Commons returns (per item)
<item rdf:about="http://www.oercommons.org/courses/cut-down-paper-waste-not-trees">
  <title>Cut Down Paper Waste, Not Trees</title>
  <link>http://www.oercommons.org/courses/cut-down-paper-waste-not-trees</link>
  <description>Students characterize the volume of paper that the class throws away. They will decide how to reduce their paper waste, then implement their plan. Students will discover that reducing waste is the first and most important step in solving the solid waste problem.</description>
  <dc:creator>King County</dc:creator>
  <dc:creator>Solid Waste Division</dc:creator>
  <dc:subject>Science and Technology</dc:subject>
  <dc:subject>Social Sciences</dc:subject>
  <dc:date>2011-03-18T16:03:01</dc:date>
  <dc:type>Course Related Materials</dc:type>
</item>
MERLOT returns (per item)
<item>
  <title>Tree of Life</title>
  <link>http://www.merlot.org/merlot/viewMaterial.htm?id=90953</link>
  <description>As descibed at the site, "The Tree of Life is a project containing information about the diversity of organisms on Earth, their history, and characteristics. The information is linked together in the form of the evolutionary tree that connects all organisms to each other."</description>
</item>
Xpert returns (per item)
<item>
  <title><![CDATA[ Studying mammals: Life in the trees ]]></title>
  <link><![CDATA[http://openlearn.open.ac.uk/course/view.php?name=S182_8]]></link>
  <guid><![CDATA[http://openlearn.open.ac.uk/course/view.php?name=S182_8]]></guid>
  <description><![CDATA[David Attenborough looks at ‘life in the trees’: examining how species have evolved to cope with arboreal living. You will learn how lemurs, anteaters, bears and many others have developed different methods to help movement and survival.]]></description>
</item>
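The three item formats above share only the title, link and description nodes. As a minimal sketch (not the project's actual code), the following Python reduces any of these feeds to those common fields using only the standard library. The function names `local_name` and `extract_items` and the `COMMON_FIELDS` tuple are illustrative assumptions, and the sketch assumes the feed has already been downloaded as a string of well-formed XML with its namespaces declared.

```python
import xml.etree.ElementTree as ET

# Fields present in all three sample feeds (assumption drawn from the examples).
COMMON_FIELDS = ("title", "link", "description")


def local_name(tag):
    """Strip any '{namespace}' prefix that ElementTree adds to a tag name."""
    return tag.rsplit("}", 1)[-1]


def extract_items(feed_xml):
    """Return one dict per <item>, holding only the common fields."""
    root = ET.fromstring(feed_xml)
    records = []
    for element in root.iter():
        # Skip anything that is not a plain <item> element, in any namespace.
        if not isinstance(element.tag, str) or local_name(element.tag) != "item":
            continue
        record = {field: "" for field in COMMON_FIELDS}
        for child in element:
            name = local_name(child.tag)
            if name in record:
                # Collapse whitespace left by pretty-printing or CDATA padding.
                record[name] = " ".join((child.text or "").split())
        records.append(record)
    return records
```

Matching on the local tag name rather than a fixed namespace is a deliberate simplification: it lets the same function read both the RDF-style items from OER Commons and the plain RSS items from MERLOT and Xpert, at the cost of ignoring the extra Dublin Core nodes that only one feed supplies.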
Each feed therefore returns a different set of fields, and we have to work with whatever the API or RSS feed provides. We can compare titles and descriptions, but these often differ (having changed between submissions), so some duplication is likely in these aggregations. This is an innate problem of the source data rather than a fault in the aggregation itself. Because date nodes are not present in every feed, content cannot be sorted by date, and because subject nodes are likewise missing from some feeds, searching is limited to the common fields (title and description).
Licensing and attribution information is also often missing from these feeds (although it may not have been supplied to the aggregator in the first place). Without overt licensing information in the feeds, people are less able to reuse the material. Work on the impact of OERs has shown the importance of licensing material clearly so as to encourage reuse.
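One way to reduce (though not eliminate) duplicates is to compare titles approximately rather than exactly. The sketch below is an illustration under stated assumptions, not the method used in this project: it treats two records as duplicates when their links are identical or when their normalised titles are at least 90% similar according to Python's `difflib.SequenceMatcher`. The record shape (dicts with `title` and `link` keys), the function names and the 0.9 threshold are all hypothetical choices.

```python
import re
from difflib import SequenceMatcher


def normalise(text):
    """Lower-case the text and keep only space-separated alphanumeric words."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def is_duplicate(a, b, threshold=0.9):
    """Guess whether two records describe the same resource."""
    # Identical non-empty links are always treated as the same resource.
    if a.get("link") and a.get("link") == b.get("link"):
        return True
    title_a = normalise(a.get("title", ""))
    title_b = normalise(b.get("title", ""))
    # Empty titles carry no evidence, so never match on them.
    if not title_a or not title_b:
        return False
    return SequenceMatcher(None, title_a, title_b).ratio() >= threshold


def deduplicate(records, threshold=0.9):
    """Keep the first record of each group of likely duplicates."""
    kept = []
    for record in records:
        if not any(is_duplicate(record, seen, threshold) for seen in kept):
            kept.append(record)
    return kept
```

Each record is compared against every record already kept, which is quadratic in the number of results but adequate for a single page of search results. A lower threshold removes more duplicates at the risk of merging distinct resources with similar titles, and descriptions could be compared in the same way where titles alone are inconclusive.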