Open Archives Initiative Protocol for Metadata Harvesting
The Open Archives InitiativeProtocol for Metadata Harvesting is a protocol developed for harvesting metadata descriptions of records in an archive so that services can be built using metadata from many archives. An implementation of OAI-PMH must support representing metadata in Dublin Core, but may also support additional representations. The protocol is usually just referred to as the OAI Protocol. OAI-PMH uses XML over HTTP. Version 2.0 of the protocol was released in 2002; the document was last updated in 2015. It has a Creative Commons license BY-SA.
History
In the late 1990s, Herbert Van de Sompel was working with researchers and librarians at Los Alamos National Laboratory and called a meeting to address difficulties related to interoperability issues of e-print servers and digital repositories. The meeting was held in Santa Fe, New Mexico, in October 1999. A key development from the meeting was the definition of an interface that permitted e-print servers to expose metadata for the papers it held in a structured fashion so other repositories could identify and copy papers of interest with each other. This interface/protocol was named the "Santa Fe Convention". Several workshops were held in 2000 at the ACM Digital Libraries conference and elsewhere to share the ideas from the Santa Fe Convention. It was discovered at the workshops that the problems faced by the e-print community were also shared by libraries, museums, journal publishers, and others who needed to share distributed resources. To address these needs, the Coalition for Networked Information and the Digital Library Federation provided funding to establish an Open Archives Initiative secretariat managed by Herbert Van de Sompel and Carl Lagoze. The OAI held a meeting at Cornell University in September 2000 to improve the interface developed at the Santa Fe Convention. The specifications were refined over e-mail. OAI-PMH version 1.0 was introduced to the public in January 2001 at a workshop in Washington D.C., and another in February in Berlin, Germany. Subsequent modifications to the XML standard by the W3C required making minor modifications to OAI-PMH resulting in version 1.1. The current version, 2.0, was released in June 2002. It contained several technical changes and enhancements and is not backward compatible.
Uses
Some commercial search engines use OAI-PMH to acquire more resources. Google initially included support for OAI-PMH when launching sitemaps, however decided to support only the standard XML Sitemaps format in May 2008. In 2004, Yahoo! acquired content from OAIster that was obtained through metadata harvesting with OAI-PMH. Wikimedia uses an OAI-PMH repository to provide feeds of Wikipedia and related site updates for search engines and other bulk analysis/republishing endeavors. Especially when dealing with thousands of files being harvested every day, OAI-PMH can help in reducing the network traffic and other resource usage by doing incremental harvesting. NASA's Mercury metadata search system uses OAI-PMH to index thousands of metadata records from Global Change Master Directory every day. The mod_oai project is using OAI-PMH to expose content to web crawlers that is accessible from Apache Web servers.