
[EP-tech] Re: [SOLVED] Re: Best way to import local (or remote) files through EPrints' XML file



Hi all,

And thanks for the many replies! Before answering them, here is a 
summary of how to import files into EPrints:

I made it work (with John's help). One or more files can be imported 
by placing, as a minimum, XML like the following inside an 
<eprint></eprint> element:

         <documents>
             <document>
                 <files>
                     <file>
                         <filename>sampletextfile.txt</filename>
                         <url>file:///tmp/sampletextfile.txt</url>
                     </file>
                 </files>
                 <format>text</format>
                 <language>en</language>
                 <security>validuser</security>
                 <main>sampletextfile.txt</main>
             </document>
             <document>
                 <files>
                     <file>
                         <filename>sampletextfile2.txt</filename>
                         <url>file:///tmp/sampletextfile2.txt</url>
                     </file>
                 </files>
                 <format>text</format>
                 <language>en</language>
                 <security>validuser</security>
                 <main>sampletextfile2.txt</main>
             </document>
         </documents>

In this example, two documents (each containing one file) are imported: 
sampletextfile.txt and sampletextfile2.txt. EPrints then takes care of 
mime_type and all the rest.
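
A minimal Python sketch, for illustration only, of how such a 
<documents> block could be generated from a list of file paths on the 
EPrints server. The function name build_documents and its default 
arguments are hypothetical, and percent-encoding the path in the 
file:// URL is an assumption about what EPrints accepts, so it is worth 
testing on a scratch repository first:

```python
import posixpath
import xml.etree.ElementTree as ET
from urllib.parse import quote


def build_documents(paths, fmt="text", language="en", security="validuser"):
    """Build a <documents> element with one <document> per file path.

    Hypothetical helper: each path is assumed to be an absolute path that
    is readable on the EPrints server, and is turned into a file:// URL.
    """
    documents = ET.Element("documents")
    for path in paths:
        name = posixpath.basename(path)
        document = ET.SubElement(documents, "document")
        files = ET.SubElement(document, "files")
        file_el = ET.SubElement(files, "file")
        ET.SubElement(file_el, "filename").text = name
        # Assumption: special characters in the path are percent-encoded.
        ET.SubElement(file_el, "url").text = "file://" + quote(path)
        ET.SubElement(document, "format").text = fmt
        ET.SubElement(document, "language").text = language
        ET.SubElement(document, "security").text = security
        # With a single file per document, that file is also the main file.
        ET.SubElement(document, "main").text = name
    return documents


docs = build_documents(["/tmp/sampletextfile.txt", "/tmp/sampletextfile2.txt"])
print(ET.tostring(docs, encoding="unicode"))
```

The resulting element still has to be placed inside an 
<eprint></eprint> element together with the record's other metadata.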

Now, to answer some of your questions:

@John: As far as URLs are concerned, the library staff decided to move 
away from "handle" in order to make my life (and our admins' lives) 
easier. If you have any suggestions, though, I'd be glad to hear them!

@Andy: Thanks for the sample PHP code. As I explained above, I have now 
managed to import one or more custom files using the syntax shown 
earlier. To be honest, I haven't checked the DSpace export-file format 
yet, but I hope it will be fairly simple. If not, I'll definitely look 
at your code and see how you handle things, so thanks again!

@Rory: I am keeping extensive notes on how to migrate from DSpace to 
EPrints. When I'm done with the migration (we have some deadlines, so 
they're my first priority), I'll update the Wiki and upload my Python 
code to GitHub, as you suggested, to make other people's lives easier. 
Before uploading the script I need to document it better so that others 
can use it. Before that, though, I'll definitely reply to your email 
entitled "DSpace to EPrints migration resources" with a few steps that 
need to be followed as a start.

Thanks all again for the help!!




On 23/09/2015 11:23, Rory McNicholl wrote:
> Hello George, John, Adam, Andy and everyone,
>
> It has struck me that, on the subject of migrating a DSpace repository to an EPrints one, there seems to be a distinct lack of published guidance. This is daft given that I know lots of these migrations have been performed (in some cases by my own colleagues).
>
> Googling "DSpace to EPrints migration" returns more results pertaining to the inverse... and that can't be good!
>
> so....
>
>
> Rory McNicholl
> Lead developer
> Digital Archives & Research Technologies
> University of London Computer Centre
> Senate House
> Malet Street
> London
> WC1E 7HU
>
> t: +44 (0)20 7863 1344
> e: rory.mcnicholl at london.ac.uk
> w: http://www.ulcc.ac.uk/
>
> The University of London is an exempt charity in England and Wales.
>
> ________________________________________
> From: eprints-tech-bounces at ecs.soton.ac.uk <eprints-tech-bounces at ecs.soton.ac.uk> on behalf of John Salter <J.Salter at leeds.ac.uk>
> Sent: 22 September 2015 16:21
> To: 'eprints-tech at ecs.soton.ac.uk'
> Subject: [EP-tech] Re: Best way to import local (or remote) files through EPrints' XML file
>
> Hi,
> I hoped you'd look at the example first - and see that the documents/document bit was needed too. Sorry, I should have made that more clear.
>
> I'm not sure what the Metafield/Id is complaining about.
>
> From memory, the $format bit is looking for a mime_type element - which is then split on a slash to get from 'image/jpeg' to
> $major='image',
> $minor = 'jpeg'
>
> There is
>> bin/epadmin redo_mime_types
> that may be useful - but won't stop those warnings.
>
> Hope that helps a bit more!
> Cheers,
> John
>
>
> -----Original Message-----
> From: eprints-tech-bounces at ecs.soton.ac.uk [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of George Mamalakis
> Sent: 22 September 2015 15:52
> To: eprints-tech at ecs.soton.ac.uk
> Subject: [EP-tech] Re: Best way to import local (or remote) files through EPrints' XML file
>
> Hi again, John,
>
> It almost worked. The file was imported OK (I could see that through a
> subsequent export), but it was absent from the deposit menu (view or
> edit). I uploaded a file from the menu, and from the export file the
> difference was that the uploaded file was within
> <documents><document></document></documents> tags. When I added them
> (without any contents), the file was imported as expected, but I
> received some warnings:
>
> Starting EPrints Repository.
> Connecting to DB ... done.
> Use of uninitialized value in subroutine entry at
> /usr/share/eprints3/bin/../perl_lib/EPrints/MetaField/Id.pm line 50.
> Use of uninitialized value in subroutine entry at
> /usr/share/eprints3/bin/../perl_lib/EPrints/MetaField/Id.pm line 50.
> Use of uninitialized value in subroutine entry at
> /usr/share/eprints3/bin/../perl_lib/EPrints/MetaField/Id.pm line 50.
> Use of uninitialized value in subroutine entry at
> /usr/share/eprints3/bin/../perl_lib/EPrints/MetaField/Id.pm line 50.
> Use of uninitialized value in subroutine entry at
> /usr/share/eprints3/bin/../perl_lib/EPrints/MetaField/Id.pm line 50.
> Use of uninitialized value in subroutine entry at
> /usr/share/eprints3/bin/../perl_lib/EPrints/MetaField/Id.pm line 50.
> Use of uninitialized value in subroutine entry at
> /usr/share/eprints3/bin/../perl_lib/EPrints/MetaField/Id.pm line 50.
> Use of uninitialized value in subroutine entry at
> /usr/share/eprints3/bin/../perl_lib/EPrints/MetaField/Id.pm line 50.
> Use of uninitialized value $format in split at
> /usr/share/eprints3/bin/../perl_lib/EPrints/DataObj/Document.pm line 1684.
> Use of uninitialized value $major in concatenation (.) or string at
> /usr/share/eprints3/bin/../perl_lib/EPrints/DataObj/Document.pm line 1689.
> Use of uninitialized value $major in concatenation (.) or string at
> /usr/share/eprints3/bin/../perl_lib/EPrints/DataObj/Document.pm line 1689.
> Use of uninitialized value $major in concatenation (.) or string at
> /usr/share/eprints3/bin/../perl_lib/EPrints/DataObj/Document.pm line 1689.
> Use of uninitialized value $major in concatenation (.) or string at
> /usr/share/eprints3/bin/../perl_lib/EPrints/DataObj/Document.pm line 1689.
> Use of uninitialized value $major in concatenation (.) or string at
> /usr/share/eprints3/bin/../perl_lib/EPrints/DataObj/Document.pm line 1689.
> Use of uninitialized value $major in concatenation (.) or string at
> /usr/share/eprints3/bin/../perl_lib/EPrints/DataObj/Document.pm line 1689.
> Use of uninitialized value $major in concatenation (.) or string at
> /usr/share/eprints3/bin/../perl_lib/EPrints/DataObj/Document.pm line 1689.
> Use of uninitialized value $major in concatenation (.) or string at
> /usr/share/eprints3/bin/../perl_lib/EPrints/DataObj/Document.pm line 1689.
> Use of uninitialized value $major in concatenation (.) or string at
> /usr/share/eprints3/bin/../perl_lib/EPrints/DataObj/Document.pm line 1689.
> Use of uninitialized value $major in concatenation (.) or string at
> /usr/share/eprints3/bin/../perl_lib/EPrints/DataObj/Document.pm line 1689.
> Use of uninitialized value $major in concatenation (.) or string at
> /usr/share/eprints3/bin/../perl_lib/EPrints/DataObj/Document.pm line 1689.
> Use of uninitialized value $major in concatenation (.) or string at
> /usr/share/eprints3/bin/../perl_lib/EPrints/DataObj/Document.pm line 1689.
> Number of records imported: 1
> 267
> Ending EPrints Repository.
>
>
> The first warnings are always shown in my installation; the Document.pm
> part is new. As it seems, some values are not initialised correctly.
> Nonetheless, the import works, and when I visit the edit page, the
> missing fields are indicated. So the next thing I'll do is to infer
> their contents from their DSpace counterparts.
>
> Thanks again, when I manage to import it without a warning, I'll mark
> the thread as [SOLVED]. But I'll continue with my trials tomorrow.
>
> Have a nice evening everyone!
>
> George.
>
> On 22/09/2015 05:06, John Salter wrote:
>> Hi George,
>> I think this: http://wiki.eprints.org/w/Import_From_URL is still valid - if your folder is copied onto the EPrints server, you can use a 'file' url:
>> <files>
>>       <file>
>>             <filename>stuff.txt</filename>
>>             <url>file:///home/data/dspace/123/stuff.txt</url>
>>       </file>
>> </files>
>>
>> I'm not sure exactly which bits of metadata are required - e.g. whether EPrints will calculate the filesize if you don't specify it, and whether it will work out the mime-type.
>>
>> Cheers,
>> John
>>
>>
>> -----Original Message-----
>> From: eprints-tech-bounces at ecs.soton.ac.uk [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of George Mamalakis
>> Sent: 22 September 2015 14:41
>> To: eprints-tech at ecs.soton.ac.uk
>> Subject: [EP-tech] Best way to import local (or remote) files through EPrints' XML file
>>
>> Hi everybody!
>>
>> I'm very close to finishing my EPrints configuration + migration from
>> DSpace. The main thing that remains to be done, is the data migration part.
>>
>> I've written a Python script that generates an EPrints XML file based on
>> a DSpace CSV file, which I'll upload to the EPrints Wiki when it's done.
>>
>> In order to complete it, I need to add the files, and I don't know
>> what syntax I should use. I have a local folder whose subfolders
>> contain all the DSpace files, where each subfolder name is the record id.
>> Therefore, my folder structure is something like this:
>>
>> /home/data/dspace/{record_id}
>>
>> where {record_id} is the DSpace id of the specific record. What are the
>> minimum XML elements that have to be added to my XML file in order for
>> EPrints to import the files? And what would an example XML entry look
>> like, based on our example folder structure?
>>
>> Thanks all in advance!
>>
>
> --
> George Mamalakis
>
> IT and Security Officer,
> Electrical and Computer Engineer (Aristotle Univ. of Thessaloniki),
> PhD (Aristotle Univ. of Thessaloniki),
> MSc (Imperial College of London)
>
> School of Electrical and Computer Engineering
> Aristotle University of Thessaloniki
>
> phone number : +30 (2310) 994379
>
>
>
> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
> *** Archive: http://www.eprints.org/tech.php/
> *** EPrints community wiki: http://wiki.eprints.org/
> *** EPrints developers Forum: http://forum.eprints.org/
>
>


-- 
George Mamalakis

IT and Security Officer,
Electrical and Computer Engineer (Aristotle Univ. of Thessaloniki),
PhD (Aristotle Univ. of Thessaloniki),
MSc (Imperial College of London)

School of Electrical and Computer Engineering
Aristotle University of Thessaloniki

phone number : +30 (2310) 994379