EPrints extended API

The EPrints software package from the University of Southampton provides a rich internal Perl API along with a RESTful web API. The latter has been used extensively by Caltech Library to facilitate content reuse across campus for our various EPrints repositories. The challenge now is to move beyond the present limitations. (See priorities two and three of Caltech Library’s AY 22 strategic plan.)

Extending EPrints directly is error prone and cumbersome. Implementing features safely in Perl is only the start of the trouble if we modify EPrints directly. In contrast, EPrints’ MySQL database structure has proven to be durable and predictable. MySQL can be leveraged directly, and that is how the extended API seeks to move beyond our current constraints.

What should an extended web API look like?

Design considerations

An extended API should provide a limited web service that maps URL end points to simple MySQL queries run against the various EPrints databases. The service should be easy to implement and require minimal resources, e.g. one prepared SQL statement per end point.
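The "one prepared SQL statement per end point" idea can be sketched as a small routing table. This is a minimal illustration, not the service's actual implementation: the route names, the `eprint` table layout, and the column names are assumptions, and Python's built-in `sqlite3` stands in for MySQL so the example runs without a database server.

```python
import sqlite3

# Hypothetical routing table: each end point name maps to exactly one
# parameterized SQL statement. Table and column names are assumptions.
ROUTES = {
    "doi": "SELECT eprintid FROM eprint WHERE doi = ? ORDER BY eprintid",
    "issn": "SELECT eprintid FROM eprint WHERE issn = ? ORDER BY eprintid",
}


def handle(conn, path):
    """Resolve a path such as '/doi/10.1000/xyz' to (status, [EPrint IDs])."""
    # Split only on the first slash so values containing '/' (e.g. DOIs) survive.
    parts = path.strip("/").split("/", 1)
    if len(parts) != 2 or parts[0] not in ROUTES:
        return 404, []
    name, value = parts
    # The value is always bound as a parameter, never interpolated into SQL.
    rows = conn.execute(ROUTES[name], (value,)).fetchall()
    return 200, [row[0] for row in rows]


def demo_connection():
    """Build an in-memory stand-in database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE eprint (eprintid INTEGER PRIMARY KEY, doi TEXT, issn TEXT)"
    )
    conn.executemany(
        "INSERT INTO eprint VALUES (?, ?, ?)",
        [
            (1, "10.1000/abc", "1234-5678"),
            (2, "10.1000/xyz", "1234-5678"),
            (3, None, "8765-4321"),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(handle(conn, "/doi/10.1000/xyz"))
    print(handle(conn, "/issn/1234-5678"))
    print(handle(conn, "/unknown/1"))
```

Because each route is a fixed statement with a bound parameter, adding an end point means adding one entry to the table, and the handler only ever returns integer EPrint IDs.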

Security and privacy should be front and center when implementing any web service. By returning only EPrint IDs we limit the risk of exposing inappropriate metadata (e.g. author information). The EPrint ID is an integer without specific meaning. On its own it does not give you access to sensitive information.

Unique IDs to EPrint IDs

The following URL end points are intended to take one unique identifier and map that to one or more EPrint IDs. This can be done because each unique ID targeted can be identified by querying a single table in EPrints. In addition, the query can return its complete results: all EPrint IDs are integers, and even the full list of EPrint IDs in any of our repositories is small enough to be returned in a single HTTP response.
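The size argument above can be checked with a rough sketch that encodes a list of integer IDs as a compact JSON array. The function name `ids_to_json` is hypothetical, and the figure of 100,000 IDs is purely illustrative; it is not a count taken from any of our repositories.

```python
import json


def ids_to_json(eprint_ids):
    """Encode integer EPrint IDs as a compact, sorted JSON array (bytes)."""
    return json.dumps(sorted(eprint_ids), separators=(",", ":")).encode("utf-8")


# Illustrative figure only: 100,000 consecutive IDs, not a real repository size.
body = ids_to_json(range(1, 100_001))

if __name__ == "__main__":
    print(len(body), "bytes")
```

Under that assumption the whole response body comes to well under one megabyte, which is why paging is not needed for ID-only results.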

Change Events

The following API end points would facilitate faster updates to our feeds platform as well as allow us to create a separate public view of our EPrint repository content.

Nice to have end points

The following end points would be nice to have, but they would require either customization of our existing EPrints deployments or significant work on the part of our Library staff to populate.

EPrints XML is complex and hard to work with. A simplified data structure could make working with our repository data much easier. If user/role restrictions were enforced in an extended EPrints API, it could provide a clean JSON expression of a more general bibliographic record. Additionally, it could provide JSON documents suitable for direct ingest into Solr/Lunr search engines. At that stage it might also be desirable to allow updates to existing EPrints records via the extended API.