EPrints Technical Mailing List Archive

Message: #04341



[EP-tech] Re: Migrating from D-space (with files)


Hi Tim,

If you could, that would be wonderful. Yesterday I spent the whole day trying to understand the EPrints database so that I could write a custom script to exchange records between the two databases (in Python, using an ORM). To check that I had understood the database design correctly, I enabled database logging and imported one record. This showed that inserting a new record touches many tables (more than I had initially imagined), so I decided to give the DSpace import plugin a try instead. My plan was to write a custom plugin that inherits from DSpace.pm, readjust the GRAMMAR to fit my DSpace database, and write additional callbacks where needed. My Perl skills are limited, but I could follow the existing code by reading it, so I imagined it wouldn't be that hard to write a few callbacks that do specific string manipulations in Perl. The only remaining problem with this approach would be the files, for which I thought I'd follow the directions of this document:

https://ejournals.bc.edu/ojs/index.php/ital/article/download/1861/pdf

which would probably need a few changes on my part to fit my needs. The paper describes a two-script importer (code is supplied): the first script imports only the metadata, the files are then copied onto the system "by hand", and the second script updates the imported eprints to "link" them with the appropriate files.
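
As a rough sketch of the bookkeeping between the two scripts (this is not the paper's code), the second step needs to know which copied files belong to which imported eprint. The Python below assumes, hypothetically, that the first script recorded a mapping from DSpace item ID to new eprint ID, and that the files were copied into one directory per DSpace item ID. The function name `plan_file_links` and that directory layout are my own assumptions; the actual linking would still have to be done through EPrints itself.

```python
from pathlib import Path


def plan_file_links(id_map, files_root):
    """Pair each imported eprint with the files copied for its DSpace item.

    id_map:     dict of DSpace item ID -> eprint ID (assumed to be recorded
                by the metadata-only import step).
    files_root: directory containing one subdirectory per DSpace item ID
                (an assumed layout for the files copied "by hand").
    Returns a list of (eprint_id, file_path) tuples in a stable order.
    """
    plan = []
    for dspace_id, eprint_id in sorted(id_map.items()):
        item_dir = Path(files_root) / str(dspace_id)
        if not item_dir.is_dir():
            # No files were copied for this item; nothing to link.
            continue
        for path in sorted(p for p in item_dir.iterdir() if p.is_file()):
            plan.append((eprint_id, path))
    return plan
```

Reviewing the resulting list before running the linking step is one way to catch items whose files are missing or misplaced.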

Since your script generates EPrints XML, it means that imported records will update all necessary tables, so everything should work like a charm! So, if you find it, I'd be obliged if you could send it, and I'll try to make the relevant changes to fit my DSpace database.

Thanks again for the help!

On 19/06/2015 02:57 AM, Timothy Miles-Board wrote:
Hi George,

I've done this a couple of times.

I worked with (CSV) dumps of the DSpace database tables and wrote a script to parse/join them and convert the whole lot to EPrints XML.
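
For illustration, here is a minimal Python sketch of this parse/join/convert approach (it is not the script mentioned above). The CSV column names, the sample rows, the `FIELD_MAP` mapping and the namespace URI are all assumptions, to be checked against your own DSpace dump and against an XML export from your own EPrints repository.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# Assumed namespace for EPrints XML; verify against a real export.
EP_NS = "http://eprints.org/ep2/data/2.0"

# Hypothetical mapping from DSpace (element, qualifier) to an EPrints field.
FIELD_MAP = {
    ("title", ""): "title",
    ("description", "abstract"): "abstract",
}

# Tiny made-up samples standing in for the CSV table dumps.
REGISTRY_CSV = (
    "metadata_field_id,element,qualifier\n"
    "64,title,\n"
    "27,description,abstract\n"
)
VALUES_CSV = (
    "item_id,metadata_field_id,text_value\n"
    "1,64,A first thesis\n"
    "1,27,Short abstract.\n"
    "2,64,A second thesis\n"
)


def load_rows(csv_text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def join_metadata(values, registry):
    """Join metadata values to the field registry, grouped by item_id."""
    fields = {
        r["metadata_field_id"]: (r["element"], r["qualifier"]) for r in registry
    }
    items = defaultdict(list)
    for v in values:
        key = fields.get(v["metadata_field_id"])
        if key in FIELD_MAP:  # skip fields we have no mapping for
            items[v["item_id"]].append((FIELD_MAP[key], v["text_value"]))
    return items


def to_eprints_xml(items):
    """Serialise the grouped fields as one <eprint> element per item."""
    root = ET.Element("eprints", xmlns=EP_NS)
    for item_id in sorted(items, key=int):
        eprint = ET.SubElement(root, "eprint")
        for name, text in items[item_id]:
            ET.SubElement(eprint, name).text = text
    return ET.tostring(root, encoding="unicode")
```

Calling `to_eprints_xml(join_metadata(load_rows(VALUES_CSV), load_rows(REGISTRY_CSV)))` returns a single XML string with one record per DSpace item. Compound fields such as creators, and the documents themselves, would need extra handling that this sketch leaves out.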

If you are still looking for help I could dig the script out and you can see if you can adapt it to your needs.

Regards,

Tim

Timothy Miles-Board
Web & Repositories Development Specialist, University of London Computer Centre
020 7863 1342  |  07742 970 351  | timothy.miles-board@london.ac.uk | @drtjmb
The University of London is an exempt charity in England and Wales

________________________________________
From: eprints-tech-bounces@ecs.soton.ac.uk <eprints-tech-bounces@ecs.soton.ac.uk> on behalf of George Mamalakis <mamalos@eng.auth.gr>
Sent: 26 May 2015 4:48 PM
To: eprints-tech@ecs.soton.ac.uk
Subject: [EP-tech]  Migrating from D-space (with files)

Hi all,

I assumed that this scenario should be very common, but after googling
it I realised that it's quite hard to find a straightforward answer.

So, the question is as follows:

What are the needed steps in order to migrate a D-space system to eprints?

I see that there is this import module
(./perl_lib/EPrints/Plugin/Import/DSpace.pm) in eprints, which (at first
glance) doesn't seem to handle files (maybe I'm wrong). Moreover, as it
is stated in the plugin, before migrating from D-space to eprints, one
should subclass it in order to "refine the grammar used". Of course,
from the admin interface I see that there is a D-space specific import,
which -if I understood correctly- is using the import plugin just mentioned.

Given these facts, for the metadata do I just have to subclass the
DSpace.pm plugin using the correct grammar? And then, what should I do
with the associated files? Is there a way to merge these two steps in order
to avoid mistakes?

Thank you all for your time in advance,

George.

--
George Mamalakis

IT and Security Officer,
Electrical and Computer Engineer (Aristotle Univ. of Thessaloniki),
PhD (Aristotle Univ. of Thessaloniki),
MSc (Imperial College of London)

School of Electrical and Computer Engineering
Aristotle University of Thessaloniki

phone number : +30 (2310) 994379


*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
*** EPrints developers Forum: http://forum.eprints.org/


