EPrints Technical Mailing List Archive

Message: #04731



[EP-tech] Re: Best way to import local (or remote) files through EPrints' XML file


It should work OK - I've used it in the past :o)
There's a flag on the bin/import script that I think you need to set:
--enable-file-imports
that allows files on the local file system to be retrieved.
There's also
--enable-web-imports
that allows external URLs to be specified.
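
As a rough sketch, a migration script could assemble the import call with those two flags like this. The argument order (repository id, options, dataset, plugin, filename), the install path /opt/eprints3, the repository id "myrepo" and the XML path are all assumptions for illustration, so check them against the usage text of bin/import on your own install:

```python
import shlex


def import_command(repo_id, xml_path, eprints_root="/opt/eprints3",
                   local_files=True, web_files=False):
    """Return the argv list for a bin/import run of an EPrints XML file.

    The argument order used here is an assumption, not taken from this thread.
    """
    cmd = [eprints_root + "/bin/import", repo_id]
    if local_files:
        # Lets file:// URLs in the XML be read from the local file system.
        cmd.append("--enable-file-imports")
    if web_files:
        # Lets external URLs in the XML be fetched.
        cmd.append("--enable-web-imports")
    # Dataset "eprint", import plugin "XML", then the file to import.
    cmd += ["eprint", "XML", xml_path]
    return cmd


if __name__ == "__main__":
    # "myrepo" and the XML path are placeholders; this only prints the command.
    print(shlex.join(import_command("myrepo", "/tmp/dspace_export.xml")))
```

The returned list can be handed to subprocess.run() once the command has been checked by hand.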

There's an example xml chunk below that worked for 3.3.10.

Also, I'm not sure what sort of persistent URLs DSpace uses.
If you're planning to redirect all requests from what was your DSpace server to your EPrints one, you might want to be able to capture DSpace-style URLs and respond in a sensible way.
This could be done with a custom URL trigger, and an EPrint search for a matching 'old_system_identifier' (see XML below).
If this sounds of interest, let me know!

Cheers,
John

<eprints>
  <eprint>
    <title>Ammeter</title>
    <collection>physics_historical_instruments</collection>
    <type>image</type>
    <corp_creators>
      <item>Acme Electrical Works</item>
    </corp_creators>
    <abstract>...</abstract>
    <physical_identifier>Catalogue number : 1020</physical_identifier>
    <publisher>University of Leeds</publisher>
    <copyright_holders>
      <item>Copyright University of Leeds, School of Physics and Astronomy</item>
    </copyright_holders>
    <source>Original held in University of Leeds, School of Physics and Astronomy</source>
    <keywords>Electric</keywords>
    <old_system_identifier>80629</old_system_identifier>
    <documents>
      <document>
        <files>
          <file>
            <url>file:///export/DigitisationLabStore/migration/PHYSICS/FILES/80631/lr_physics1 034.jpg</url>
            <filename>lr_physics1 034.jpg</filename>
            <mime_type>image/jpeg</mime_type>
          </file>
        </files>
        <format>image</format>
        <formatdesc>Ammeter - View 1. Low resolution</formatdesc>
        <main>lr_physics1 034.jpg</main>
      </document>
    </documents>
  </eprint>
</eprints>
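
For a generator script, the <documents> part of a chunk like this could be built along the following lines. This is only a sketch: the function names are made up for illustration, it assumes one <document> per file found in /home/data/dspace/{record_id}, and it leaves the file:// URL unescaped (spaces included) to mirror the example above. Whether EPrints also accepts percent-encoded file URLs, and whether mime_type is required at all, is not something confirmed here.

```python
import mimetypes
import xml.etree.ElementTree as ET
from pathlib import Path


def document_element(path):
    """Build a <document> element for one file at an absolute local path."""
    path = Path(path)
    document = ET.Element("document")
    files = ET.SubElement(document, "files")
    file_el = ET.SubElement(files, "file")
    # Plain file:// URL, deliberately left unescaped to match the example chunk.
    ET.SubElement(file_el, "url").text = "file://" + path.as_posix()
    ET.SubElement(file_el, "filename").text = path.name
    # Add a mime_type only when the standard library can guess one.
    mime_type, _ = mimetypes.guess_type(path.name)
    if mime_type:
        ET.SubElement(file_el, "mime_type").text = mime_type
    ET.SubElement(document, "main").text = path.name
    return document


def documents_for_record(root, record_id):
    """Collect every regular file in root/record_id into a <documents> element."""
    documents = ET.Element("documents")
    for path in sorted(Path(root, record_id).iterdir()):
        if path.is_file():
            documents.append(document_element(path.resolve()))
    return documents


if __name__ == "__main__":
    # The path comes from the folder layout discussed in this thread.
    element = document_element("/home/data/dspace/123/stuff.txt")
    print(ET.tostring(element, encoding="unicode"))
```

The resulting element can be appended to each <eprint> element before the whole tree is written out.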

-----Original Message-----
From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of George Mamalakis
Sent: 22 September 2015 15:22
To: eprints-tech@ecs.soton.ac.uk
Subject: [EP-tech] Re: Best way to import local (or remote) files through EPrints' XML file

Hi John,

Nice idea with the file-url. I'll try it and see if it works!

Thanks!

On 22/09/2015 05:06 PM, John Salter wrote:
> Hi George,
> I think this: http://wiki.eprints.org/w/Import_From_URL is still valid - if your folder is copied onto the EPrints server, you can use a 'file' url:
> <files>
>            <file>
>                  <filename>stuff.txt</filename>
>                  <url>file:///home/data/dspace/123/stuff.txt</url>
>            </file>
> </files>
>
> I'm not sure exactly which bits of metadata are required - e.g. whether EPrints will calculate the filesize if you don’t specify it, and whether it will work out the mime-type.
>
> Cheers,
> John
>
>
> -----Original Message-----
> From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of George Mamalakis
> Sent: 22 September 2015 14:41
> To: eprints-tech@ecs.soton.ac.uk
> Subject: [EP-tech] Best way to import local (or remote) files through EPrints' XML file
>
> Hi everybody!
>
> I'm very close to finishing my EPrints configuration + migration from
> DSpace. The main thing that remains to be done is the data migration part.
>
> I've written a Python script that generates an EPrints XML file based on
> a DSpace CSV file, which I'll upload to the EPrints Wiki when it's done.
>
> In order to complete it, I need to add the files, and I don't know
> what syntax I should use. I have a local folder whose subfolders
> contain all the DSpace files, where each subfolder name is the record id.
> Therefore, my folder structure is somewhat like this:
>
> /home/data/dspace/{record_id}
>
> where {record_id} is the DSpace id of the specific record. What are the
> minimum XML attributes that have to be added to my XML file in order for
> EPrints to import the files? And what would an example XML entry look
> like, based on our example folder structure?
>
> Thanks all in advance!
>


-- 
George Mamalakis

IT and Security Officer,
Electrical and Computer Engineer (Aristotle Univ. of Thessaloniki),
PhD (Aristotle Univ. of Thessaloniki),
MSc (Imperial College of London)

School of Electrical and Computer Engineering
Aristotle University of Thessaloniki

phone number : +30 (2310) 994379



*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
*** EPrints developers Forum: http://forum.eprints.org/