[EP-tech] Re: Extracting authors



On 06/08/14 15:47, Andrew Beeken wrote:
> Hello all!
>
> I'm currently looking at building applications that sit to the side
> of Eprints but tap into the data it stores. What I'm hoping to be
> able to get, possibly via an OAI scrape, is a list of all of the
> authors stored in the system so that I can create a lookup table in
> my app - is this straightforward?

As an OAI-PMH set, yes, very easy. Add a "creators" set to the OAI
configuration:

$oai->{sets} = [
   # ... other sets ...
   { id => "creators", allow_null => 0, fields => "creators_name" },
   # ... other sets ...
];
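
On the app side, here is a rough sketch (Python, standard library only)
of reading set names out of an OAI-PMH ListSets response. The namespace
and the set/setSpec/setName elements come from the OAI-PMH 2.0 spec.
The sample XML, the "creators" prefix test, and the function name
parse_list_sets are made up for illustration, so check what your own
repository actually returns for ?verb=ListSets before relying on it.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Invented sample response: the setSpec and setName values are placeholders,
# not what any real repository is known to emit.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set><setSpec>creators-1</setSpec><setName>Smith, J.</setName></set>
    <set><setSpec>creators-2</setSpec><setName>Jones, A.</setName></set>
    <set><setSpec>subjects-1</setSpec><setName>Physics</setName></set>
  </ListSets>
</OAI-PMH>
""".strip()


def parse_list_sets(xml_text):
    """Return a list of (setSpec, setName) pairs from a ListSets response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for node in root.findall("./oai:ListSets/oai:set", OAI_NS):
        spec = node.findtext("oai:setSpec", default="", namespaces=OAI_NS)
        name = node.findtext("oai:setName", default="", namespaces=OAI_NS)
        pairs.append((spec.strip(), name.strip()))
    return pairs


def names_with_prefix(pairs, spec_prefix):
    """Keep the names whose setSpec starts with spec_prefix (assumed layout)."""
    return [name for spec, name in pairs if spec.startswith(spec_prefix)]
```

A large repository may split ListSets across several responses using a
resumptionToken. This sketch only handles a single response.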


As a CGI call, or in code, it is slightly more complicated.

My initial thought would be to create a script that goes through the
dataset and builds a "list of hashes": each author's details are
stored in a hash, and one of the key/value pairs in that hash is the
list of eprintids for the records on which they are listed as an
author. Store this data object on disk, and your API can read it to do
whatever calculations you need.
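
To make the shape of that "list of hashes" concrete, here is a minimal
sketch in Python, as it might look if the records have already been
exported from the repository. The record layout (an "eprintid" plus a
"creators" list with "family" and "given" keys), the function names, and
the JSON file format are all assumptions for illustration, not the
EPrints API.

```python
import json
import os


def build_author_index(records):
    """Build a list of dicts, one per author, each holding their eprintids."""
    index = {}
    for rec in records:
        for creator in rec.get("creators", []):
            family = creator.get("family", "").strip()
            given = creator.get("given", "").strip()
            # Authors are matched on the exact (family, given) pair.
            entry = index.setdefault(
                (family, given),
                {"family": family, "given": given, "eprintids": []},
            )
            if rec["eprintid"] not in entry["eprintids"]:
                entry["eprintids"].append(rec["eprintid"])
    # Sort for a stable, predictable lookup table.
    return sorted(
        index.values(),
        key=lambda e: (e["family"].lower(), e["given"].lower()),
    )


def save_index(authors, path):
    """Write the index to a temporary file, then swap it into place."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump(authors, fh, indent=2)
    # os.replace is atomic on the same filesystem, so a reader never
    # sees a half-written file.
    os.replace(tmp_path, path)


def load_index(path):
    """Read the index back, e.g. from the API side."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

Matching on the exact name string means "Smith, J." and "Smith, John"
end up as two separate authors. If your records carry a creator id, that
would be a better key.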

Depending on the size and activity of your repository, you can rebuild
the "list of hashes" hourly, daily, or at whatever interval suits.

-- 

Ian Stuart.
Developer: ORI, RJ-Broker, and OpenDepot.org
Bibliographics and Multimedia Service Delivery team,
EDINA,
The University of Edinburgh.

http://edina.ac.uk/
