EPrints Technical Mailing List Archive

Message: #03359



[EP-tech] Re: Extracting authors


On 06/08/14 15:47, Andrew Beeken wrote:
Hello all!

I’m currently looking at building applications that sit to the side
of Eprints but tap into the data it stores. What I’m hoping to be
able to get, possibly via an OAI scrape, is a list of all of the
authors stored in the system so that I can create a lookup table in
my app – is this straightforward?

As an OAI-PMH set, yes, this is very easy:

$oai->{sets} = [
  ......
  { id=>"creators", allow_null=>0, fields=>"creators_name"},
  ......
];
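With a set like that configured, an outside application should be able to read the creator names from the repository's ListSets response. Here is a minimal Python sketch of that, assuming a standard OAI-PMH 2.0 ListSets response. The sample XML, the setSpec/setName values and the function names are made up for illustration; the real values depend on how your EPrints install renders its sets, and resumptionToken paging is not handled.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Invented sample response; real setSpec/setName values will differ.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set><setSpec>example-spec-1</setSpec><setName>Smith, J.</setName></set>
    <set><setSpec>example-spec-2</setSpec><setName>Jones, A.</setName></set>
  </ListSets>
</OAI-PMH>"""


def fetch_list_sets(base_url):
    """Fetch the raw ListSets response from an OAI-PMH base URL (hypothetical helper)."""
    with urllib.request.urlopen(base_url + "?verb=ListSets") as response:
        return response.read().decode("utf-8")


def parse_list_sets(xml_text):
    """Return a list of (setSpec, setName) pairs from a ListSets response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for set_el in root.findall(".//oai:ListSets/oai:set", OAI_NS):
        spec = set_el.findtext("oai:setSpec", default="", namespaces=OAI_NS)
        name = set_el.findtext("oai:setName", default="", namespaces=OAI_NS)
        pairs.append((spec.strip(), name.strip()))
    return pairs


if __name__ == "__main__":
    for spec, name in parse_list_sets(SAMPLE_RESPONSE):
        print(spec, name)
```

In a real run you would pass the output of fetch_list_sets() to parse_list_sets(), then filter for the sets that belong to the "creators" set definition.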


As a CGI call, or in code, it is slightly more complicated.

My initial thought would be to create a script that goes through the dataset and builds a "list of hashes": each author's details are stored in a hash, and one of the key/value pairs in that hash is a list of eprintids for the records on which they are listed as an author. Store this data object on disk, and your API can read it to do whatever calculations you need.

Depending on the size and activity of your Repository, you can rebuild the "list-of-hashes" hourly, daily, or at whatever interval suits.
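
To make the "list of hashes" idea concrete, here is a rough Python sketch of the build-and-store step. It assumes you already have (eprintid, list of creator names) pairs from somewhere, such as an OAI harvest or an export. The function names, the JSON file format and the field names "name" and "eprintids" are my own choices for illustration, not anything EPrints provides.

```python
import json
from pathlib import Path


def build_author_index(records):
    """Build a list of dicts, one per author, each holding that author's eprintids.

    `records` is an iterable of (eprintid, [creator names]) pairs.
    """
    by_name = {}
    for eprintid, creators in records:
        for name in creators:
            entry = by_name.setdefault(name, {"name": name, "eprintids": []})
            if eprintid not in entry["eprintids"]:
                entry["eprintids"].append(eprintid)
    # Sort by name so the stored file is stable between rebuilds.
    return sorted(by_name.values(), key=lambda entry: entry["name"])


def save_index(index, path):
    """Write the index as JSON via a temporary file, then move it into place."""
    target = Path(path)
    tmp = target.with_suffix(".tmp")
    tmp.write_text(json.dumps(index, indent=2), encoding="utf-8")
    tmp.replace(target)


def load_index(path):
    """Read the stored index back; this is what the side application would call."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

The rebuild itself can then be a scheduled job (cron or similar) that calls build_author_index() and save_index() at whatever interval you settle on. Writing to a temporary file first means a reader should not see a half-written index.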

--

Ian Stuart.
Developer: ORI, RJ-Broker, and OpenDepot.org
Bibliographics and Multimedia Service Delivery team,
EDINA,
The University of Edinburgh.

http://edina.ac.uk/
