EPrints Technical Mailing List Archive

Message: #07663



[EP-tech] help on running OAI Harvester 1.04 in eprints 3.3.16 on Ubuntu 17.10


Hi all,

We are trying to develop a method for harvesting a couple of other open archives into the Organic Eprints archive, using the OAI Harvester 1.04 package from http://files.eprints.org/798/ . I am working on a development server running EPrints 3.3.16 on Ubuntu 17.10. As a first test, I am trying to harvest from the live Organic Eprints archive, since it has a working OAI interface (at http://orgprints.org/cgi/oai2 ).

I am now able to run the harvest script, but I get the error: 

Can't locate object method "is_error" via package "HTTP::OAI::Record" at /data/eprints/archives/orgprints/cfg/plugins/EPrints/Plugin/Import/OAIPMH.pm line 232, <DATA> line 717.

I have tried to locate the cause, without luck so far, so it would be great if you could give me some indication of where the problem might lie.

Here is some more detailed information that might be of help:

To avoid a "Can't locate EPrints.pm in @INC" error, I run:  
> export PERL5LIB=/data/eprints/perl_lib
before running the harvest command.

I run the script with the command:
> bin/oai/harvest orgprints --plugin=OAIPMH::OAI_DC --conf=orgprints

I get past the interactive dialogue:
Are you sure you want to make bulk changes to the eprint table in the orgprints repository [yes/no]
? yes

But then I get the error:

Can't locate object method "is_error" via package "HTTP::OAI::Record" at /data/eprints/archives/orgprints/cfg/plugins/EPrints/Plugin/Import/OAIPMH.pm line 232, <DATA> line 717.

Line 232 is in the create_eprint sub, and line 229 contains "is_error", but I cannot tell whether the error has to do with the creation of new eprint records or whether it comes from something earlier in the process.

I have tried inserting print statements in the OAIPMH.pm file. The script reads the relevant configuration options (at least, from and metadataPrefix print correctly), and it seems to retrieve at least one OAI identifier before it fails: printing "$id" around line 113 in OAIPMH.pm gives "oai:orgprints.org:34351" before the error, which is a correct eprint id for the date. I can also see that the script gets past the get_eprint line (by printing some text after line 115 - my $eprint = $self->get_eprint( $id ); ), so I guess it retrieves an EPrint object as well.

The script has some lines that seem to write to an error log, but I cannot find such a log in either of the usual var directories. The Apache error log and the orgprints log do not contain any information relevant to the harvest script.
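A minimal sketch of one way to look for such a log: list the files under the EPrints root that changed within a few minutes of a harvest run. The helper name recent_files is hypothetical; /data/eprints is the install root implied by the paths above.

```shell
# recent_files DIR MINUTES: list regular files under DIR that were modified
# within the last MINUTES minutes. (Hypothetical helper.)
recent_files() {
    find "$1" -type f -mmin "-$2" 2>/dev/null
}

# Example, run right after a harvest attempt:
#   recent_files /data/eprints 10
```

Any log the script wrote during the run should appear in that list, wherever it was placed.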

The configuration so far is very simple. I have copied the relevant "Stub OAI Harvesting" section from oai_harvester.pl to a new configuration file, z_oai_harvester_orgprints.pl, as shown below. I have also tried uncommenting the metadataPrefix and default_values parts to set a userid, and creating a user with this id in the development archive, but this makes no difference to the error.

$c->{oai_harvester}->{orgprints} = {
    url => 'http://orgprints.org/cgi/oai2', # compulsory
    # set => 'test_set', # optional
    from => '2019-01-26', # optional, format is YYYY-MM-DD
    # 'until' => '2019-01-27', # optional
    # metadataPrefix => 'oai_dc', # optional, should be set by the OAIPMH/* plugin
    # default_values => sub { # optional, gives a chance to set default values
    #     my( $session, $epdata, $header ) = @_;
    #
    #     $epdata->{userid} = 'oai_harvester';
    #     $epdata->{eprint_status} = 'buffer';
    #
    #     # $epdata->{FIELDNAME} = VALUE;
    #
    # },
};

I hope that you can help me move on. My perl skills are quite basic, but I can usually find my way around modifying files.

Best regards
Hugo

-- 
Hugo F. Alrøe , PhD, Lead researcher 
Sciper research, consulting and writing
Web:     hugo.alroe.dk