EPrints Technical Mailing List Archive

Message: #06672

Re: [EP-tech] We have messed up our EPrints


We are working on rebuilding the index, as suggested. The repository is large
(at least, too large to reindex easily), and the process repeatedly exits
with an error, probably caused by a problem in one of the PDF files.

The error message is:
Malformed UTF-8 character (fatal) at /usr/share/eprints3/bin/../perl_lib/EPrints/Utils.pm line 316.
xargs: bin/epadmin: exited with status 255; aborting

The command we use is:
seq 17561 52311 | xargs bin/epadmin --verbose reindex REAL eprint

On average we hit this error about once every 2000 eprints. The last error
was for eprint no. 17560. Each time, we have to restart the indexing from the
following eprint no.

Is there any way to avoid this? Is there a way to tell epadmin not to abort
if it encounters that error? Or to get the indexer to simply skip the
erroneous character?
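
One workaround that might apply, given only as a sketch and not as a fix for
the underlying problem: xargs stops as soon as a command exits with status
255, and it passes many eprint ids to a single epadmin call, so one bad record
ends the whole batch. Calling epadmin once per eprint and recording the ids
that fail would let the run continue. The sketch assumes epadmin accepts a
single id in the same way as in the command above; the function name and the
log file name are hypothetical. It will be slower, because epadmin is started
anew for every eprint, and the malformed UTF-8 itself is not repaired.

```shell
#!/bin/sh
# Sketch: reindex one eprint per epadmin call so that a single bad record
# cannot abort the whole run. Meant to be started from the EPrints root
# directory, like the original command.

EPADMIN="${EPADMIN:-bin/epadmin}"       # path taken from the original command
REPO="${REPO:-REAL}"                    # repository id from the original command
FAILED="${FAILED:-failed_eprints.txt}"  # hypothetical log file name

# reindex_range FIRST LAST
# Reindexes every eprint id from FIRST to LAST (inclusive), one at a time.
reindex_range() {
    first=$1
    last=$2
    id=$first
    while [ "$id" -le "$last" ]; do
        # </dev/null keeps the command from waiting on standard input.
        if ! "$EPADMIN" --verbose reindex "$REPO" eprint "$id" </dev/null; then
            # Remember the failing id and carry on with the next one.
            echo "$id" >> "$FAILED"
        fi
        id=$((id + 1))
    done
}

# Usage, with the range from the command above:
#   reindex_range 17561 52311
```

The ids collected in the log file could then be inspected one by one to find
the documents that carry the malformed characters.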

With best regards,

Andras Holl

Holl András
informatikai főigazgató-helyettes / deputy director (IT)
MTA Könyvtár és Információs Központ / MTA Library and Information Centre