Set up added value services that query across platforms
Intended audience: Information professionals, technical developers
Version 0.1 October 2009
A single information service usually cannot collect all the relevant materials to serve a community that is interested in a specific subject. This pathway explains how you can create services that search across more than one data collection and produce combined results. It outlines basic steps for giving users access to a range of collections and presents some issues that you may have to address.
What do you need to do?
The steps to take depend on the kind of collections that you want to give access to.
1. If the collections consist of websites and documents that are accessible on the Internet and likely to be indexed by search engines like Google, the community can create a specialized search engine. Google, for example, provides a free option for this (Google Custom Search Engine). An example of a community that has created a custom search engine relevant to development is Focuss (http://www.focuss.info/).
2. If the information is held in different databases, there are two basic options:
a) Create a joint database of the metadata from these different services. This is often done through harvesting. An example of such a service in the development sector is AIDA (http://aida.developmentgateway.org/index.do), which brings together project registries from different development agencies. Agrifeeds (http://www.agrifeeds.org/) harvests and combines newsfeeds from different services in the agricultural sector. GFIS (http://www.gfis.net/gfis/home.faces) provides a more specialized service for the forestry sector with specific news, events, job announcements, and so on. Document-oriented services like AGRIS can use a dedicated protocol (OAI-PMH) and free tools for harvesting document metadata. Integrated services for other types of information often require significant technical development.
b) Ideally, one may want to leave the metadata where it is (instead of harvesting it to a central database), issue search requests to each service, and integrate the results so that they are presented to the end-user as if they came from one database. The library world has been at the forefront of developments of this sort, which enable end-users to integrate access to different services (both paid and free-of-charge). The CG Virtual Library is a service that uses one of these products, Metalib (http://www.exlibris.co.il/category/MetaLibOverview), to integrate access to many services in different subject areas related to agricultural research for development. The advantage of using proprietary products like Metalib (there are others on the market, like Zportal or Encompass) is that they already have knowledge of the different services that they integrate.
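The idea behind option b) can be illustrated with a minimal sketch. It is not how Metalib or any other product works internally; it only shows the general pattern of sending one query to several sources and merging the answers. The source names, records, and example.org URLs are hypothetical, and each "source" searches a small in-memory list instead of calling a remote service.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Record:
    title: str
    url: str
    source: str


def make_source(name: str, items: List[Dict[str, str]]) -> Callable[[str], List[Record]]:
    """Build a hypothetical source: a function that answers a query.

    A real source would call a remote search interface; this one does a
    simple substring match on titles so the sketch runs without a network.
    """
    def search(query: str) -> List[Record]:
        q = query.lower()
        return [
            Record(item["title"], item["url"], name)
            for item in items
            if q in item["title"].lower()
        ]
    return search


def federated_search(query: str, sources: List[Callable[[str], List[Record]]]) -> List[Record]:
    """Send the query to every source and merge results, dropping duplicate URLs."""
    merged: List[Record] = []
    seen_urls = set()
    for search in sources:
        try:
            results = search(query)
        except Exception:
            continue  # one failing source should not break the whole search
        for record in results:
            if record.url not in seen_urls:
                seen_urls.add(record.url)
                merged.append(record)
    return merged


# Two made-up collections that partly overlap.
library_a = make_source("library_a", [
    {"title": "Rice irrigation in Asia", "url": "http://example.org/doc/1"},
    {"title": "Forest policy review", "url": "http://example.org/doc/2"},
])
library_b = make_source("library_b", [
    {"title": "Irrigation and rice yields", "url": "http://example.org/doc/3"},
    {"title": "Rice irrigation in Asia", "url": "http://example.org/doc/1"},
])

results = federated_search("rice", [library_a, library_b])
for record in results:
    print(record.source, record.title, record.url)
```

Deduplicating by URL is a simplification chosen for the sketch. Real collections rarely share a reliable common identifier, which is one reason products that already know the services they integrate save effort.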
There are some important issues to be considered if one wants to create a cross-platform service:
1. If different services use different vocabularies (keywords or classifications) for the same things, the results of a cross-service search will be less precise than the results of separate searches of each service.
2. There have been quite a few attempts to develop cross-service platforms and often the results do not meet expectations. Much effort has been invested in solving general automation problems instead of the specific problems of the development sector.
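The vocabulary problem in the first issue above can be made concrete with a small sketch. It assumes a hand-made crosswalk that maps each service's local keywords to one shared term; the terms and the crosswalk itself are hypothetical and only illustrate the mapping step, not any particular thesaurus.

```python
# Hypothetical crosswalk: local keyword (lowercase) -> shared term.
CROSSWALK = {
    "maize": "maize",
    "corn": "maize",
    "zea mays": "maize",
    "rice": "rice",
    "paddy": "rice",
}


def to_shared_term(keyword):
    """Return the shared term for a keyword, or the lowercased keyword if unmapped."""
    key = keyword.strip().lower()
    return CROSSWALK.get(key, key)


def normalize_keywords(keywords):
    """Map a record's keywords onto the shared vocabulary."""
    return {to_shared_term(keyword) for keyword in keywords}


def matches(query_term, record_keywords):
    """True if the query and the record agree once both use shared terms."""
    return to_shared_term(query_term) in normalize_keywords(record_keywords)


# Records from three made-up services that index the same crop differently.
record_a = ["Corn", "Fertilizer"]
record_b = ["Zea mays", "Irrigation"]
record_c = ["Paddy"]

print(matches("maize", record_a))  # found only because of the crosswalk
print(matches("maize", record_b))
print(matches("maize", record_c))
```

Without the crosswalk, a search for "maize" would miss both records that describe the same crop under another name. Building and maintaining such mappings across real vocabularies is where most of the effort lies.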
If no proven technology solution (such as harvesting through OAI-PMH, or a proprietary digital library product) is available for what you want to do, you should consider carefully whether to proceed with a development of this sort.
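To give a feel for why OAI-PMH harvesting counts as a proven approach, here is a minimal sketch of the harvester side. It builds a ListRecords request URL and parses a response in the standard oai_dc (Dublin Core) format. The base URL, record identifier, and sample response are made up for illustration, and the response is embedded as a string so the sketch runs without contacting any repository.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up response, shaped like a ListRecords reply from a repository.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2009-10-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Rice irrigation in Asia</dc:title>
          <dc:subject>rice</dc:subject>
          <dc:subject>irrigation</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>token-page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>
"""


def build_request_url(base_url, resumption_token=None):
    """Build a ListRecords URL; a resumption token replaces the other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("oai:ListRecords/oai:record", NS):
        dc = rec.find("oai:metadata/oai_dc:dc", NS)
        if dc is None:
            continue  # deleted records carry a header but no metadata
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", default="", namespaces=NS),
            "title": dc.findtext("dc:title", default="", namespaces=NS),
            "subjects": [s.text for s in dc.findall("dc:subject", NS)],
        })
    token = root.findtext("oai:ListRecords/oai:resumptionToken", default="", namespaces=NS)
    return records, (token or None)


records, token = parse_list_records(SAMPLE_RESPONSE)
print(records)
print(build_request_url("http://example.org/oai", token))
```

A full harvester would fetch each URL, repeat the request with the returned token until none is left, and store the records in the joint database. Existing free tools already handle those steps, which is the practical argument for reusing them.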
Examples of implementations
Google Custom Search Engine
USA: Land Grant universities http://www.google.com/coop/cse?cx=018217748441857184963:boqsgyytwwe
Global: GFAR-Agricultural Research and Development Search engine: http://www.google.com/coop/cse?cx=011948126938199812372%3Aynkp0sbgkcy
Global: FAO cross website search. http://www.fao.org/
Japan: Shizuoka University Library for OPAC. http://www.infosta.or.jp/journal/200805e.html
Existing cross-platform searches
Global: Global Forest Information Service (GFIS). http://www.gfis.net
Global: Agrifeeds – news and events. http://www.agrifeeds.org
CGIAR: CG Virtual Library. http://vlibrary.cgiar.org/V?RN=208389055
Experience in developing a service
Google Custom Search Engine (www.google.com/cse)