[EP-tech] duplicate detection in EPrints 3.3

I would like to run a script that will go through my repository (3.3.12) and report any likely duplicates based on title (and possibly author).
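
To make "likely duplicates" concrete, here is a rough sketch of the kind of title matching I have in mind. It is plain Python and not EPrints code: the (eprintid, title) pairs and the function names are hypothetical, and it assumes the records have already been exported from the repository. It only compares normalised titles; matching on author would need an extra key.

```python
import re
import unicodedata
from collections import defaultdict


def normalise_title(title):
    """Lower-case, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = re.sub(r"[^a-z0-9]+", " ", no_accents.lower())
    return cleaned.strip()


def likely_duplicates(records):
    """Group (eprintid, title) pairs by normalised title.

    Returns a list of id lists, one per title shared by two or more records.
    """
    groups = defaultdict(list)
    for eprintid, title in records:
        key = normalise_title(title)
        if key:  # skip records whose title is empty after normalisation
            groups[key].append(eprintid)
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical sample data standing in for an export of the repository.
sample = [
    (1, "Duplicate Detection in Repositories"),
    (2, "duplicate detection in repositories."),
    (3, "Détection des doublons"),
    (4, "Detection des doublons"),
    (5, "A unique title"),
]

print(likely_duplicates(sample))
```

Exact matching on a normalised title would miss near-duplicates such as typos or truncated titles, so a real script would probably need some fuzzy comparison as well.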

What is the best way of doing this?

I found the following two plugins in EPrints Files:

- Sebastien Francois' EPrints 2 script: http://files.eprints.org/107/

- Jon Hallett's EPrints 3>3.1 script: http://files.eprints.org/640/

In addition,

- There is a title_duplicates script in /cgi/users/lookup/ (http://wiki.eprints.org/w/Cgi/users/lookup/).

- Page 40 of this file (http://www.eprints.org/software/training/programming/api_techniques.pdf) refers to a duplicate detection script in the bin folder as an example. I couldn't find this script; it is probably just an illustration of what could be done.

Is Jon Hallett's script in EPrints Files the most up-to-date version available?

Has anyone created a Bazaar version for duplicate detection, or is there something more recent that I am missing?


Tomasz Neugebauer
Digital Projects & Systems Development Librarian
Libraries / Bibliothèques
Concordia University / Université Concordia

