[EP-tech] help on running OAI Harvester 1.04 in eprints 3.3.16 on Ubuntu 17.10

Hi all,

We are trying to develop a method for harvesting a couple of other open
archives into the Organic Eprints archive, using the OAI Harvester 1.04
package from http://files.eprints.org/798/ . I am working on a development
server running EPrints 3.3.16 on Ubuntu 17.10. As a first test, I am trying
to harvest from the live Organic Eprints archive, since it has a working OAI
interface (at http://orgprints.org/cgi/oai2 ).

I am now able to run the harvest script, but I get the error:

Can't locate object method "is_error" via package "HTTP::OAI::Record" at
line 232, <DATA> line 717.

I have tried to locate the cause, but without luck so far, so it would be
great if you could give me some indication of where the error might be.
Here is some more detailed information that might be of help:

To avoid a "Can't locate EPrints.pm in @INC" error, I run:
> export PERL5LIB=/data/eprints/perl_lib
before running the harvest command.
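
To confirm that the export has taken effect, a check along the following
lines could be used. This is only a sketch: the helper name find_in_perl5lib
is made up for illustration, and it only looks for the file in the
directories listed in PERL5LIB; it does not load the module.

```shell
# Hypothetical helper: print the first path under a PERL5LIB directory
# that contains the given module file (e.g. EPrints.pm).
# Returns non-zero if no directory in PERL5LIB contains it.
find_in_perl5lib() {
    module_file=$1
    old_ifs=$IFS
    IFS=:                      # PERL5LIB is a colon-separated list
    for dir in $PERL5LIB; do
        if [ -f "$dir/$module_file" ]; then
            IFS=$old_ifs
            echo "$dir/$module_file"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}
```

With the export above in place, running "find_in_perl5lib EPrints.pm" should
print the path to EPrints.pm if the directory is set correctly.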

I run the script with the command:
> bin/oai/harvest orgprints --plugin=OAIPMH::OAI_DC --conf=orgprints

I get past the interactive dialogue:
Are you sure you want to make bulk changes to the eprint table in the
orgprints repository [yes/no]
? yes

But then I get the error:

Can't locate object method "is_error" via package "HTTP::OAI::Record" at
line 232, <DATA> line 717.

Line 232 is in the sub create_eprint section, and line 229 contains "is
error", but I am not able to see whether the error has to do with the
creation of new eprints records or whether it comes from something earlier
in the process.
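
One way to see which of the installed modules actually define is_error
could be a search like the sketch below. The helper name find_sub is
hypothetical, and the directory to search (for example the HTTP/OAI
directory under /data/eprints/perl_lib) is an assumption about where the
HTTP::OAI modules live on this server. It only finds literal "sub"
definitions, so a method that is inherited would show up under the parent
class's file, not under Record.

```shell
# Hypothetical helper: list the Perl module files under a directory that
# contain a literal definition of the given subroutine name.
find_sub() {
    sub_name=$1
    search_dir=$2
    # Match lines such as "sub is_error {" but not "sub is_error_code".
    find "$search_dir" -name '*.pm' \
        -exec grep -lE "^[[:space:]]*sub[[:space:]]+${sub_name}([^[:alnum:]_]|\$)" {} + \
        | sort
}
```

For example: find_sub is_error /data/eprints/perl_lib/HTTP/OAI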

I have tried inserting some statements to print variables in the OAIPMH.pm
file, and the script reads the relevant configuration options (at least I
have tried printing from and metadataPrefix), and it seems to retrieve at
least an OAI identifier before it fails (I can print "$id" around line 113
in OAIPMH.pm and get "oai:orgprints.org:34351" before the error, which is a
correct eprint id for the date). I can also see that the script gets past
the get_eprint line (by printing some text after line 115 - my $eprint =
$self->get_eprint( $id ); ), so I guess it retrieves an EPrint object as

The script has some lines that seem to write to an error log, but I cannot
find such a log in either of the usual var directories. The apache error
log and the orgprints log do not have any information relevant to the
harvest script.

The configuration so far is very simple. I have copied the relevant "Stub
OAI Harvesting" section from oai_harvester.pl to a new configuration file
z_oai_harvester_orgprints.pl as shown below. I have also tried uncommenting
the metadataPrefix and default_values parts to set a userid, and creating
a user with this id in the development archive, but this makes no
difference to the error.

$c->{oai_harvester}->{orgprints} = {
    url => 'http://orgprints.org/cgi/oai2', # compulsory
    # set => 'test_set', # optional
    from => '2019-01-26', # optional, format is YYYY-MM-DD
    # 'until' => '2019-01-27', # optional
    # metadataPrefix => 'oai_dc', # optional, should be set by the OAIPMH/*
    # default_values => sub { # optional, gives a chance to set default values
    #     my( $session, $epdata, $header ) = @_;
    #     $epdata->{userid} = 'oai_harvester';
    #     $epdata->{eprint_status} = 'buffer';
    #     # $epdata->{FIELDNAME} = VALUE;
    # },
};
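
For reference, the url and from settings above, together with the oai_dc
prefix, correspond to an OAI-PMH request roughly like the one built by the
sketch below. The helper name oai_list_url is made up, and the choice of the
ListIdentifiers verb is an assumption; I do not know which verb the
harvester itself sends.

```shell
# Hypothetical helper: build an OAI-PMH ListIdentifiers request URL from
# a base URL, a metadata prefix and a "from" date (YYYY-MM-DD).
oai_list_url() {
    base_url=$1
    prefix=$2
    from=$3
    echo "${base_url}?verb=ListIdentifiers&metadataPrefix=${prefix}&from=${from}"
}
```

The resulting URL can be fetched outside the harvester to check what the
remote archive returns for the date, for example with:
curl "$(oai_list_url http://orgprints.org/cgi/oai2 oai_dc 2019-01-26)"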

I hope that you can help me move on. My perl skills are quite basic, but I
can usually find my way around modifying files.

Best regards

Hugo F. Alrøe, PhD, Lead researcher
Sciper research, consulting and writing
Email:    hugo.f.alroe at gmail.com
Web:     hugo.alroe.dk