info URIs & OpenURL Applications for Identifier Resolution
Approach towards Identifiers & Resolution in the LANL adore repository
Research Library, Los Alamos National Laboratory, USA
info URI & OpenURL
  info URI I-D (http://www.ietf.org/internet-drafts/draft-vandesompel-info-uri-04.txt):
    No resolution mechanisms can be assumed for the "info" URI scheme, though for any particular namespace there MAY exist mechanisms for resolving identifiers to network services. The definition of such services falls outside the scope of the "info" URI scheme.
  ANSI/NISO Z39.88-2004 OpenURL Framework Standard (http://www.niso.org/standards/resources/z39_88_2004.pdf&std_id=783):
    An OpenURL Application is a networked service environment in which packages of information are transported over the network. These packages have a description of a referenced resource at their core, and they are transported with the intent of obtaining context-sensitive services pertaining to the referenced resource.
  Combine info URI & OpenURL Application to create an identifier resolution environment
  Explain in the context of the LANL adore repository
the LANL adore repository
  LANL Research Library aggregates content from primary and secondary scholarly publishers, and is looking into institutional content.
  Launched the adore repository effort to:
    Create an infrastructure to store that content
    Store content in a manner that:
      o scales to hundreds of millions of (relatively static) objects
      o disconnects content storage from content services
      o keeps digital preservation issues in mind
  adore is a modular repository architecture:
    Standards-based
    Highly distributed architecture (for example: contains ~ 500 autonomous repositories today)
    Protocol-based interactions between modules
the LANL adore repository
  [Diagram labels: Standards; Distributed architecture; Protocol-based communication]
  adore is a simulation of the Web-based, distributed repository environment
sample Digital Object in adore
                            Type             MIME             identifier
  Digital Object            scholarly paper  N/A              DOI
  Constituent Datastream 1  metadata record  application/xml  PMID
  Constituent Datastream 2  fulltext file    application/pdf
sample Digital Object packaged as MPEG-21 DIDL XML document
  OAIS PACKAGE PERSPECTIVE                   OAIS CONTENT PERSPECTIVE
  <DIDL> DIDid="info:lanl-repo/i/58f202ac"   Package
  <Item> ID="uuid-00005e90"                  Digital Object: item info:doi/10.123/44455
  <Item> ID="uuid-888b135e"                  item info:pmid/2225887
  <Component> ID="uuid-0000a01c"             component
  <Component> ID="uuid-00004a42"             component
  new version => new package
sample Digital Object packaged as MPEG-21 DIDL XML document
  OAIS Package Identifier:    DIDid="info:lanl-repo/i/58f202ac" (on <DIDL>)
  OAIS Content Identifiers:   info:doi/10.123/44455 (item), info:pmid/2225887 (item)
  OAIS Fragment Identifiers:  ID="uuid-00005e90", ID="uuid-888b135e" (on <Item>); ID="uuid-0000a01c", ID="uuid-00004a42" (on <Component>)
Identifiers in the LANL adore repository
  Package identifiers:
    Minted at ingestion time
    info:lanl-repo/i/uuid
  Content identifiers:
    Typically inherited from the publisher; if not, then assigned.
    If identifiers are URIs: use them (e.g. the http address in the case of Web crawling)
    If identifiers are not URIs: express them as URIs via the info URI scheme
      o if the namespace is registered under info URI, use it: info:doi/.
      o if the namespace is not registered under info URI, bring it under info: info:lanl-repo/biosis/
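The content-identifier rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the set of URI schemes it recognizes, and the list of registered namespaces are hypothetical and are not taken from the adore code base. No percent-encoding or normalization of the identifier is attempted.

```python
from urllib.parse import urlparse

# Assumption for this sketch: namespaces treated as registered under the info URI scheme.
REGISTERED_INFO_NAMESPACES = {"doi", "pmid"}

# Assumption for this sketch: schemes that mark an identifier as "already a URI".
URI_SCHEMES = {"http", "https", "info", "urn"}


def to_content_uri(identifier, namespace=None):
    """Express a content identifier as a URI, following the three rules on the slide."""
    # Rule 1: identifiers that are already URIs are used unchanged.
    if urlparse(identifier).scheme in URI_SCHEMES:
        return identifier
    if namespace is None:
        raise ValueError("a non-URI identifier needs a namespace")
    # Rule 2: the namespace is registered under info URI, so use it directly.
    if namespace in REGISTERED_INFO_NAMESPACES:
        return f"info:{namespace}/{identifier}"
    # Rule 3: otherwise bring the namespace under the repository's own info namespace.
    return f"info:lanl-repo/{namespace}/{identifier}"
```

For example, `to_content_uri("10.123/44455", "doi")` yields the `info:doi/10.123/44455` identifier used in the sample Digital Object, while an identifier from an unregistered namespace such as `biosis` lands under `info:lanl-repo/biosis/`.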
Identifiers in the LANL adore repository
  Scale:
    Currently ~ 150,000,000 identifiers
    adore components are designed to scale to ~ 600,000,000 identifiers by end of 2008
  Infrastructure:
    Identifier Locator stores:
      o Mapping between Content Identifiers and Package Identifiers
      o Location of the identified package in the repository
    Identifier Locator is currently implemented using in-house technology
    Identifier Locator will be implemented using handle technology (not using handles)
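The two mappings kept by the Identifier Locator can be pictured with a small in-memory sketch. The class, its method names, and the location string are hypothetical; the actual component is described only as in-house technology, and nothing here reflects its real design or scale. The sketch assumes that packages are registered in ingestion order, so the last one registered for a Content Identifier is the most recent.

```python
from collections import defaultdict


class IdentifierLocator:
    """Toy in-memory stand-in for the two mappings listed on the slide."""

    def __init__(self):
        # Content Identifier -> Package Identifiers, oldest first (assumed ingestion order).
        self._packages_by_content = defaultdict(list)
        # Package Identifier -> location of the package in the repository.
        self._location_by_package = {}

    def register(self, package_id, location, content_ids):
        """Record a newly ingested package and the Content Identifiers it carries."""
        self._location_by_package[package_id] = location
        for content_id in content_ids:
            self._packages_by_content[content_id].append(package_id)

    def packages_for(self, content_id):
        """Return all Package Identifiers known for a Content Identifier."""
        return list(self._packages_by_content.get(content_id, []))

    def locate(self, identifier):
        """Return a package location for a Package Identifier or a Content Identifier."""
        if identifier in self._location_by_package:
            return self._location_by_package[identifier]
        packages = self._packages_by_content.get(identifier)
        if not packages:
            raise KeyError(identifier)
        # For a Content Identifier, fall back to the most recently registered package.
        return self._location_by_package[packages[-1]]
```

A new version of a Digital Object would be registered as a new package under the same Content Identifiers, so `locate` on a Content Identifier then returns the newer package's location while the older package stays reachable through its own Package Identifier.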
Identifiers in the LANL adore repository
  Properties:
    URIs
    Not actionable (even identifiers that are natively actionable are treated as non-actionable once in the repository)
    No built-in resolution mechanism
    Become serviceable through OpenURL Applications and via the OpenURL Resolver front-end to the adore repository
OpenURL Framework
  [Diagram] A reference about a Referent, found on a networked resource, is expressed as a ContextObject (description of Referent & context) and sent over a Transport to a Resolver, which returns services pertaining to the Referent.
OpenURL Framework
  [Diagram labels]
  Descriptors: Namespaces of Identifiers, Metadata Formats
  ContextObject entities: Referent, Resolver, Requester, ReferringEntity, Referrer, ServiceType
  ContextObject Format: ContextObject representation
  Flow: reference -> Transport -> Resolver -> services
OpenURL Resolver in the LANL adore repository
  Two types of OpenURL Applications:
    Services at the level of OAIS Packages: request an OAIS DIP (Dissemination Information Package) in various formats, e.g.
      o DIP in MPEG-21 DIDL
      o DIP in METS
      o DIP in IMS/CP
    Services at the level of the OAIS Content: request disseminations of stored datastreams and transformations thereof, e.g.
      o Disseminate descriptive metadata as stored (MARCXML)
      o Disseminate descriptive metadata in transformed format (DC, MODS, publisher native, ...)
      o Disseminate Word doc as stored
      o Disseminate Word doc transformed to pdf
  Both work with Package Identifiers and Content Identifiers
adore OpenURL Resolver: OAIS Package level
  OpenURL ContextObject sent as OpenURL to baseurl_adore
  Referent:
    Identifier Descriptor: Content Identifier
    Metadata Descriptor: optional Package Identifier (version indicator; if not provided, use most recent package)
  ServiceType:
    Identifier Descriptor: identifier indicating a request for the list of services to request various DIP formats for the Referent, e.g. info:pathways/svc/dip
adore OpenURL Resolver: OAIS Package level
  OpenURL request to baseurl_adore:
    baseurl_adore?url_ver=z39.88-2004&rft_id=contentidentifier&svc_id=info:pathways/svc/dip
  Response from baseurl_adore (list of OpenURL requests in KEV; Container of ContextObjects in XML):
    baseurl_adore?url_ver=z39.88-2004&rft_id=contentidentifier&svc_id=info:pathways/svc/dip.didl
    baseurl_adore?url_ver=z39.88-2004&rft_id=contentidentifier&svc_id=info:pathways/svc/dip.mets
    baseurl_adore?url_ver=z39.88-2004&rft_id=contentidentifier&svc_id=info:pathways/svc/dip.ims
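The package-level requests above can be assembled programmatically. The following is a minimal sketch, assuming a placeholder resolver address (the slides only name it `baseurl_adore`) and a hypothetical helper name. The keys and values mirror the slides as written; the standard library percent-encodes the identifier values, which the slides show unencoded for readability.

```python
from urllib.parse import urlencode

# Placeholder only: the slides do not give the real resolver address.
BASEURL_ADORE = "http://adore.example.org/resolver"


def package_level_openurl(content_id, dip_format=None, baseurl=BASEURL_ADORE):
    """Build a KEV OpenURL asking for the DIP service list, or for one DIP format."""
    svc_id = "info:pathways/svc/dip"
    if dip_format:
        # e.g. "didl", "mets" or "ims", as listed on the slide.
        svc_id += "." + dip_format
    query = urlencode({
        "url_ver": "z39.88-2004",
        "rft_id": content_id,
        "svc_id": svc_id,
    })
    return f"{baseurl}?{query}"
```

Calling `package_level_openurl("info:doi/10.123/44455")` corresponds to the first request on the slide, and passing `"didl"`, `"mets"` or `"ims"` as the second argument corresponds to the three requests listed in the response.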
adore OpenURL Resolver: OAIS Package level
  OpenURL request to baseurl_adore:
    baseurl_adore?url_ver=z39.88-2004&rft_id=contentidentifier&svc_id=info:pathways/svc/dip.didl
  Response from baseurl_adore: DIDL XML document
    OAIS PACKAGE PERSPECTIVE                   OAIS CONTENT PERSPECTIVE
    <DIDL> DIDid="info:lanl-repo/i/58f202ac"
    <Item> ID="uuid-00005e90"                  item info:doi/10.123/44455
    <Item> ID="uuid-888b135e"                  item info:pmid/2225887
    <Component> ID="uuid-0000a01c"             component
    <Component> ID="uuid-00004a42"             component
adore OpenURL Resolver: OAIS Content level
  OpenURL ContextObject sent as OpenURL to baseurl_adore
  Referent:
    Identifier Descriptor: Content Identifier
    Metadata Descriptor: optional Package Identifier (version indicator; if not provided, use most recent package); Fragment Identifier (to identify bitstream)
  ServiceType:
    Identifier Descriptor: identifier indicating a request for the list of services pertaining to the Referent content, e.g. info:pathways/svc/bootstrap
adore OpenURL Resolver: OAIS Content level
  OpenURL request to baseurl_adore:
    baseurl_adore?url_ver=z39.88-2004&rft_id=contentidentifier&svc_id=info:pathways/svc/bootstrap
  Response from baseurl_adore (list of OpenURL requests in KEV; Container of ContextObjects in XML):
    baseurl_adore?url_ver=z39.88-2004&rft_id=contentidentifier&rft.arg=fragmentidentifier&svc_id=info:lanl-repo/svc/marc2mods
    baseurl_adore?url_ver=z39.88-2004&rft_id=contentidentifier&rft.arg=fragmentidentifier&svc_id=info:lanl-repo/svc/word2pdf
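The two-step pattern above (ask for the service list first, then follow one of the returned requests) can be sketched from the resolver's side. Everything below is illustrative: the resolver address is a placeholder, the function names are hypothetical, and the table saying which services apply to which fragment is an assumed input, since the slides do not say how the resolver determines that.

```python
from urllib.parse import urlencode

# Placeholder only: the slides do not give the real resolver address.
BASEURL_ADORE = "http://adore.example.org/resolver"


def kev_openurl(rft_id, svc_id, fragment_id=None, baseurl=BASEURL_ADORE):
    """Build one content-level KEV OpenURL, with rft.arg carrying the Fragment Identifier."""
    params = {"url_ver": "z39.88-2004", "rft_id": rft_id}
    if fragment_id is not None:
        params["rft.arg"] = fragment_id
    params["svc_id"] = svc_id
    return baseurl + "?" + urlencode(params)


def bootstrap_response(rft_id, services_by_fragment, baseurl=BASEURL_ADORE):
    """List the requests a resolver might return in answer to a bootstrap request.

    services_by_fragment maps a Fragment Identifier to the service identifiers
    assumed to apply to that bitstream (hypothetical input for this sketch).
    """
    return [
        kev_openurl(rft_id, svc_id, fragment_id, baseurl)
        for fragment_id, svc_ids in services_by_fragment.items()
        for svc_id in svc_ids
    ]
```

A client would first send `kev_openurl(content_id, "info:pathways/svc/bootstrap")`, then pick one entry from the returned list and dereference it unchanged, so it never has to construct service-specific requests itself.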
adore OpenURL Resolver: OAIS Content level
  OpenURL request to baseurl_adore:
    baseurl_adore?url_ver=z39.88-2004&rft_id=contentidentifier&rft.arg=fragmentidentifier&svc_id=info:lanl-repo/svc/word2pdf
Lesson Learned
  NISO OpenURL Standard provides a framework for the definition of OpenURL Applications in the realm of identifier resolution:
    o Identifier resolution ~ delivery of services pertaining to the identified object
    o OpenURL Standard is based on abstract definitions: the same OpenURL Application can be instantiated using different technologies as they evolve (KEV ContextObject & HTTP Transport; XML ContextObject & SOAP Transport; ...) => Persistent resolution environment
OpenURL Abstractions for Identifier Resolution
  Step 1: Introspection
    Abstractions                                                     OpenURL
    object identifier, service identifier (request list of services)  ContextObject
    Resolution Interface                                             Transport
    list of services                                                 Container of OpenURL ContextObjects
Lesson Learned
  NISO OpenURL Standard provides a framework for the definition of OpenURL Applications in the realm of identifier resolution:
    o Identifier resolution ~ delivery of services pertaining to the identified object
    o OpenURL Standard is based on abstract definitions: the same OpenURL Application can be instantiated using different technologies as they evolve (KEV ContextObject & HTTP Transport; XML ContextObject & SOAP Transport; ...) => Persistent resolution environment
    o OpenURL Standard allows for inclusion of other entities in the ContextObject (Requester, Referrer, ...) => potential for a context-sensitive identifier resolution environment
    o The resulting resolution mechanism is independent of the nature of the identifier (any URI can be the Referent Identifier) => Put an http URI as Referent Identifier on an OpenURL Application service request in 2020 ;-)
  Use of OpenURL Application as standard-based Obtain interface for repositories: thoroughly researched in Jeroen Bekaert's PhD thesis
  Use of OpenURL Application as the basis of long-term resolution abstraction: under investigation