EPrints Technical Mailing List Archive

Message: #06656



Re: [EP-tech] help on configuring OAI-PMH harvester in eprints 3.3.14


Hi Alfredo,

I haven't used the harvester, but from a quick look at the code, the 'http://idei.fr' block renders any OAI identifiers that match it as links to that service.

Unless you are harvesting from idei.fr, this code will not do anything, so you can leave it as it is. You might want to extend it to cover the sources you are harvesting from.

 

Looking at the bin script, you need to set up some configuration before you can run a harvest.

I would copy the commented-out block (starting with $c->{oai_harvester}->{stub} ) at the end of oai_harvester.pl into [EPRINTS_ROOT]/archives/[ARCHIVEID]/cfg/cfg.d/z_oai_harvest_ABC.pl

- where 'ABC' is a name you would associate with the repository you are harvesting.

 

Now un-comment at least the 'url' line - and if necessary, add other values.

For testing, it might be worth adding a known set / time period so you can collect a small number of records from the source - e.g.

$c->{oai_harvester}->{ABC} = {
        url => 'http://ABC.com/oai',          # compulsory
        set => 'driver',                      # optional
        from => '2017-06-29',                 # optional, format is YYYY-MM-DD
#       'until' => '2011-12-31',              # optional
#       metadataPrefix => 'oai_dc',           # optional, should be set by the OAIPMH/* plugin
#       default_values => sub {               # optional, gives a chance to set default values
#               my( $session, $epdata, $header ) = @_;
#
#               $epdata->{userid} = 1234;
#               $epdata->{eprint_status} = 'archive';
#
#               $epdata->{FIELDNAME} = VALUE;
#       },
};
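
If you also want harvested records to pick up default values, the same entry with the default_values callback un-commented might look like the sketch below. This is untested and only mirrors the commented-out sample: 'ABC', the URL, the user ID 1234 and the 'archive' status are placeholders, and I am assuming the harvester calls the sub once per record and expects $epdata to be modified in place.

```perl
# z_oai_harvest_ABC.pl (sketch only; every value here is a placeholder)
# $c is the configuration hash EPrints supplies when it loads cfg.d files.
$c->{oai_harvester}->{ABC} = {
        url  => 'http://ABC.com/oai',    # compulsory
        from => '2017-06-29',            # optional, YYYY-MM-DD; keeps a test harvest small

        # Optional: assumed to be called for each harvested record.
        default_values => sub {
                my( $session, $epdata, $header ) = @_;

                $epdata->{userid} = 1234;                # placeholder user ID
                $epdata->{eprint_status} = 'archive';    # placeholder status
        },
};
```

Replace 1234 with the ID of a real user in your repository before running a harvest.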

 

With this configuration in place, I think you should be able to do:

> bin/harvest ARCHIVE_ID --conf=ABC --plugin=OAIPMH::OAI_DC

 

As I said - I've never used this - and the above is from a quick skim-read of the code, but I hope this gets you started!

 

Cheers,

John

 

PS I'll look at your other email now :o)

 

 

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Alfredo Cosco
Sent: 29 June 2017 10:00
To: eprints-tech@ecs.soton.ac.uk
Subject: [EP-tech] help on configuring OAI-PMH harvester in eprints 3.3.14

 

Hi all,

I have to configure the harvester module in EPrints 3.3.14, but the documentation is quite inconsistent and I need a little help.

 

I downloaded the module from this link: http://files.eprints.org/798/

 

Has anyone managed to install and use this feature?

 

In the cfg/cfg.d/oai_harvester.pl file, before the sample, there is a piece of code that points to http://idei.fr/. Is it a sample too that has to be configured, or should I leave it as it is?

 

Thanks

Alfredo