EPrints Technical Mailing List Archive

Message: #06253



Re: [EP-tech] Import problems!


Further down the rabbit hole…

 

I’ve tried a full archive import (without files! I don’t have an infinite amount of space on my dev environment!) and it’s imported around 6500 records from about 14500 – so less than half!

 

I’ve also done some searching on the tech list and found this post: http://www.eprints.org/tech.php/13616.html which suggests that the errors are coming from non-ASCII characters in the filenames etc.

 

Going back to my text export: https://drive.google.com/open?id=0B67FaE28LeB-c21LZ0Y5YmNHRTQ I’ve taken a look at the three records that won’t import. Now, on the basis of that post I removed the document details from one and – the record imported. So the issue lies in the documents but not ALL of the documents. The document from that test batch that imported had no spaces in the filename so – perhaps that’s something?
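One way to test the filename hypothesis is to list which document filenames contain spaces or non-ASCII characters and compare them against the records that refuse to import. The following is a minimal Python sketch, assuming the filenames have already been pulled out of the export into a plain list; the function name and the sample filenames are hypothetical and not taken from the real export.

```python
def suspicious_filenames(filenames):
    """Return (filename, reasons) pairs for names containing spaces or non-ASCII characters."""
    flagged = []
    for name in filenames:
        reasons = []
        if " " in name:
            reasons.append("space")
        if not name.isascii():  # str.isascii() needs Python 3.7+
            reasons.append("non-ascii")
        if reasons:
            flagged.append((name, reasons))
    return flagged


if __name__ == "__main__":
    # Hypothetical sample names, for illustration only.
    sample = ["paper_final.pdf", "draft version 2.docx", "résumé.pdf"]
    for name, reasons in suspicious_filenames(sample):
        print(f"{name}: {', '.join(reasons)}")
```

If every failing record shows up in the flagged list and none of the successful ones do, that would support the idea; if not, the cause is probably elsewhere.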

 

The one annoying factor in this is that even with web imports off the import script still fails at these records – I would have expected it to have created the metadata record but ignored the file, but that’s not the case.

 

Has anyone encountered this particular snafu? I feel like I’m getting close to figuring this all out, though!

 

Andrew

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Andrew Beeken
Sent: 07 February 2017 14:22
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] Import problems!

 

Ah, cool! Thanks! All sorted on that front and added “Export/Import Subjects…” into my checklist for migration!

 

Now, if I could figure out what’s up with those records I think I’m about there…

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Lizz Jennings
Sent: 07 February 2017 13:54
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] Import problems!

 

Not sure on how you work on the first half of that, but there’s a helpful video on working with subject trees:

 

https://wiki.eprints.org/w/Training_Video:Subject_Trees

 

You’ll probably need to copy your subject tree from the old repository over, and then run import_subjects pointing at that file.

 

Lizz

 

--

Lizz Jennings BA MSc ACLIP MCLIP (Revalidated 2015)

Research Data Librarian (Systems)

The Library 4.10, University of Bath, Bath, BA2 7AY UK

Ext. 3570 (External 01225 383570)

E.Jennings@bath.ac.uk

Research Data Management: http://www.bath.ac.uk/research/data

 

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Andrew Beeken
Sent: 07 February 2017 13:42
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] Import problems!

 

I’m back!

 

Progress has been made. So, I’ve tried this with clearing out the eprints table and starting the import from fresh – success! Of a kind…

 

It’s imported one record; this one – http://eprints.lincoln.ac.uk/25828. This is a complete record with related documents. Hurrah!

 

However there should be a total of four records imported. The import threw up some new and interesting errors:

 

Starting EPrints Repository.

Connecting to DB ... done.

Failed to retrieve http://eprints.lincoln.ac.uk/25934/1/1602_Global%20Goals_French_Art6Clean.docx: 401 Authorization Required

document.52811 failed to create subdataobj on document.files at /usr/share/eprints3/bin/../perl_lib/EPrints/DataSet.pm line 1013.

eprint.25934 failed to create subdataobj on archive.documents at /usr/share/eprints3/bin/../perl_lib/EPrints/DataSet.pm line 1013.

Failed to retrieve http://eprints.lincoln.ac.uk/25845/1/BehavEcol2016.pdf: 401 Authorization Required

document.52360 failed to create subdataobj on document.files at /usr/share/eprints3/bin/../perl_lib/EPrints/DataSet.pm line 1013.

eprint.25845 failed to create subdataobj on archive.documents at /usr/share/eprints3/bin/../perl_lib/EPrints/DataSet.pm line 1013.

Failed to retrieve http://eprints.lincoln.ac.uk/25940/1/25940%20Proof_APPLAN_4392.pdf: 401 Authorization Required

document.52899 failed to create subdataobj on document.files at /usr/share/eprints3/bin/../perl_lib/EPrints/DataSet.pm line 1013.

eprint.25940 failed to create subdataobj on archive.documents at /usr/share/eprints3/bin/../perl_lib/EPrints/DataSet.pm line 1013.

Number of records imported: 1

25828

Ending EPrints Repository.
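With a larger import, a log like the one above gets long quickly, so it can help to summarise it. The following is a rough Python sketch that pulls out the files that could not be retrieved and the eprint IDs that failed. It assumes the log lines look exactly like the ones shown here; the function name and the regular expressions are illustrative and not part of EPrints.

```python
import re
from urllib.parse import unquote

# Patterns based only on the log lines shown above (an assumption about the format).
RETRIEVE_RE = re.compile(r"^Failed to retrieve (\S+): (\d{3}) (.*)$")
EPRINT_RE = re.compile(r"^eprint\.(\d+) failed to create subdataobj")


def summarise_import_log(text):
    """Return (failed_files, failed_eprints) extracted from import log text."""
    failed_files = []    # (decoded URL, HTTP status, reason)
    failed_eprints = []  # eprint IDs as integers
    for line in text.splitlines():
        line = line.strip()
        match = RETRIEVE_RE.match(line)
        if match:
            url, status, reason = match.groups()
            failed_files.append((unquote(url), int(status), reason))
            continue
        match = EPRINT_RE.match(line)
        if match:
            failed_eprints.append(int(match.group(1)))
    return failed_files, failed_eprints
```

Decoding the URL with `unquote` makes filenames with spaces (`%20`) easy to spot next to the HTTP status that caused the failure.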

 

Now, my interpretation of this is that it’s unable to get the documents that are not open access, which is fine, but how can I get it to grab any protected documents? Can I provide a username and password alongside the import? Also, is this likely to be why the records are not being imported correctly, or is there something else here I’m misinterpreting?

 

Finally, looking at the repository I can access the record directly through the URL, however I can’t see anything through the browse by… view. I’ve regenerated views/abstracts and started the indexer. The search gives this error:

 

The Lincoln Repository has encountered an error:

 

The top level subject (id=jacs) for field "subjects" does not exist. The site admin probably has not run import_subjects. See the documentation for more information.

 

As always, any thoughts are appreciated!

 

Andrew

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Andrew Beeken
Sent: 06 February 2017 11:25
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] Import problems!

 

Hi all!

 

So, a week or so later and I’m back onto this. I’ve double verbosed it and exposed a bunch of extra debug stuff. One line in particular stands out for me on all 5 test records I’m trying to import:

 

Database execute debug: INSERT INTO `eprint` (`eprintid`) VALUES (?)

 

So… it looks like it’s not inserting anything into the database?

 

I’ve put the error log as well as the XML snapshot I’m trying to import up onto Google Drive here if anyone has time to take a look: https://drive.google.com/open?id=0B67FaE28LeB-Z3NZVGtRQkFsbVU

 

I am trying to get through this without shouting for help at every turn, I promise! ;P

 

Andrew

 

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Andrew Beeken
Sent: 01 February 2017 14:39
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] Import problems!

 

That’s interesting – I’ll have a check although this action is only being done on a subset of 4 or 5 records, all of which are fairly recent, so I would have hoped they’d have a status! The only other alternative could possibly be missing status definitions etc? I shall take a look at bringing phrases over as well.

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of John Salter
Sent: 01 February 2017 14:17
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] Import problems!

 

'ne' means 'not equal' - when comparing strings ('!=' means not equal when comparing numbers).

 

At a guess, something in your export doesn't have an eprint_status set (all EPrints should have this set). You may be able to find these in the database by trying something like:

mysql> SELECT COUNT(*), eprint_status FROM eprint GROUP BY eprint_status;

This should return the eprint_status values inbox, buffer, archive and deletion, with a count next to each. Any count against 'NULL' shows records where eprint_status is not set, which is where the uninitialized value warning on the import would be coming from.
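To show what that query output looks like when some rows have no status, here is a small self-contained Python sketch. It uses an in-memory SQLite database as a stand-in for the real MySQL one, with a toy `eprint` table holding only the two relevant columns; the sample rows and function names are made up for illustration.

```python
import sqlite3


def status_counts(conn):
    """Run the GROUP BY query and return {eprint_status: count}; unset status appears as None."""
    rows = conn.execute(
        "SELECT COUNT(*), eprint_status FROM eprint GROUP BY eprint_status"
    ).fetchall()
    return {status: count for count, status in rows}


def demo():
    # Toy table and rows, for illustration only (not the real EPrints schema).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE eprint (eprintid INTEGER PRIMARY KEY, eprint_status TEXT)"
    )
    conn.executemany(
        "INSERT INTO eprint VALUES (?, ?)",
        [(1, "archive"), (2, "archive"), (3, "buffer"), (4, None)],
    )
    return status_counts(conn)


if __name__ == "__main__":
    print(demo())
```

Rows with no status are grouped together under NULL (`None` in Python), so a non-zero count there points at the records worth inspecting.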

 

Cheers,

John

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Andrew Beeken
Sent: 01 February 2017 14:03
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] Import problems!

 

Cheers Adam,

 

That seems to have sorted things out. However, I have another new and interesting error!

 

Use of uninitialized value in string ne at /usr/share/eprints3/bin/../perl_lib/EPrints/DataSet.pm line 1090.

 

The line in question is:

 

if( $dataobj->get_value( "eprint_status" ) ne $self->{id} )

 

and the ne in that line is an operator if I’m not mistaken? So is this an issue with Perl running on the server?

 

Andrew

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Adam Field
Sent: 01 February 2017 13:13
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] Import problems!

 

If you’ve added all your metadata fields to the new repository’s configuration, you need to do:

 

<eprints_root>/bin/epadmin update <repositoryid>

 

This will update your database structure.

 

 

Jisc

Adam Field
SHERPA services analyst developer

 

 

From: <eprints-tech-bounces@ecs.soton.ac.uk> on behalf of Andrew Beeken <anbeeken@lincoln.ac.uk>
Reply-To: "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>
Date: Wednesday, 1 February 2017 12:43
To: "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>
Subject: [EP-tech] Import problems!

 

Hello all! Thanks for the pointers yesterday regarding import/export. I’m at a point where I now have exported data and am trying to import it. Initially I was getting errors relating to missing metadata fields which I’d thought I’d corrected, however I am now finding errors relating to missing SQL tables:

 

SQL ERROR (execute): SELECT `eprintid`,`pos`,`subjects_loc` FROM `eprint_subjects_loc` WHERE `eprintid` IN (25940)

SQL ERROR (execute): Table 'lirolem.eprint_subjects_loc' doesn't exist

DBD::mysql::st fetchrow_array failed: fetch() without execute() at /usr/share/eprints3/bin/../perl_lib/EPrints/Database.pm line 2674.

DBD::mysql::st execute failed: Table 'lirolem.eprint_creators_browse_id' doesn't exist at /usr/share/eprints3/bin/../perl_lib/EPrints/Database.pm line 3211.

SQL ERROR (execute): SELECT `eprintid`,`pos`,`creators_browse_id` FROM `eprint_creators_browse_id` WHERE `eprintid` IN (25940)

SQL ERROR (execute): Table 'lirolem.eprint_creators_browse_id' doesn't exist

DBD::mysql::st fetchrow_array failed: fetch() without execute() at /usr/share/eprints3/bin/../perl_lib/EPrints/Database.pm line 2674.

 

Now, my question is, do I need to do something to rebuild the tables relating to the metadata in some way prior to an import? Or is there something else I’m missing here? Maybe not all the field definitions?

 

Andrew

