EPrints Technical Mailing List Archive

Message: #04732


[EP-tech] Re: Best way to import local (or remote) files through EPrints' XML file

Hi George,
Here's a chunk of PHP I put together recently to generate the documents/files section of an EPrints XML upload.  It puts all the files into one document tag, which may or may not be what you want. I'm not sure how standard the mainfile configuration on our system is, but it seems to allow only one file per document to be downloaded.  So you either need to enclose each file in its own document tag or, as I eventually did, zip all the files and push that up as one file.  This version is the 'one document, many files' approach, but I can send you the zip version as well if you like.

function eprints_xml_OAfiles($row){    # $row is the metadata values for this record

    global $link;
    global $dataset;
    # NB: $docroot must be defined in this scope (it is not among the globals declared above)
    $filebase = "$docroot/publications/administration/....  <where the files live > /";

    $pub_id = $row['pub_id'];
    $oaPub_ID = $row['oaPub_ID'];

    $PM = $row['pubmedid'];

 $files_query = "SELECT
        `content_oaManuscript`,   # Manuscript
                                    oaPub_ID = $oaPub_ID
        and upload_oaManuscript = 1
                              #  ORDER BY surname";
        $files_result = mysql_query ($files_query,$link)
        or die ("Query failed:$files_query");
        while ($f = mysql_fetch_array($files_result)) { #build file metadata and base64 data
          $record=print_r($f,TRUE); echo "<!-- $record -->";
    $filename = $f['file_oaManuscript'];
   $mimetype = $f['file_oaManuscript_mimetype'];
   if (FALSE === ($STUFF = file_get_contents($filebase.$filename))) { die("\n\nfailed to get file: $filebase$filename"); }
   $filesize = strlen($STUFF);
   $base64 = base64_encode($STUFF);   # base64-encode the file contents for the data element
   $file_modified = $f['modified_oaManuscript'];
$filesXML .= "
    <filesize>$filesize </filesize>
    <data encoding='base64'>";

$filesXML .= $base64;

$filesXML .= "</data>";

   } # ends while ($f = mysql_fetch_array($files_result))


return $cit = <<<EOC
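
For the other option mentioned above, one document tag per file, a minimal sketch might look like the following. It is an illustration, not code from the message above. The function names `eprints_document_xml` and `eprints_documents_xml` are made up. The element names (`files`, `file`, `filename`, `filesize`, `mime_type`, `format`, `main`) are assumed from what an EPrints 3 XML export typically contains, so check them against an export from your own repository before relying on them.

```php
<?php
// Hypothetical helper (not from the original message): wraps ONE file in its
// own <document> element, so that file is also the document's main file.
// Element names are an assumption based on typical EPrints 3 XML exports.
function eprints_document_xml(string $path, string $mimetype): string
{
    $contents = file_get_contents($path);
    if ($contents === false) {
        throw new RuntimeException("failed to get file: $path");
    }

    // Escape text values so characters such as & or < cannot break the XML.
    $name = htmlspecialchars(basename($path), ENT_XML1 | ENT_QUOTES, 'UTF-8');
    $mime = htmlspecialchars($mimetype, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    $size = strlen($contents);          // size in bytes of the raw file
    $data = base64_encode($contents);   // payload for the <data> element

    return <<<XML
  <document>
    <files>
      <file>
        <filename>$name</filename>
        <filesize>$size</filesize>
        <mime_type>$mime</mime_type>
        <data encoding="base64">$data</data>
      </file>
    </files>
    <mime_type>$mime</mime_type>
    <format>$mime</format>
    <main>$name</main>
  </document>

XML;
}

// Hypothetical wrapper: $files maps a local path to its MIME type and each
// entry becomes a separate <document> inside one <documents> element.
function eprints_documents_xml(array $files): string
{
    $documents = '';
    foreach ($files as $path => $mimetype) {
        $documents .= eprints_document_xml($path, $mimetype);
    }
    return "<documents>\n$documents</documents>\n";
}
```

Calling `eprints_documents_xml(['/some/path/paper.pdf' => 'application/pdf'])` (the path is only an example) would return a `documents` block to place inside the `eprint` element of the import file. Embedding base64 data makes the XML large, so for big collections the zip approach described above may still be the more practical route.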


>>> George Mamalakis <mamalos@eng.auth.gr> 22 September 2015 14:40 >>>
Hi everybody!

I'm very close to finishing my EPrints configuration + migration from
DSpace. The main thing that remains to be done is the data migration.

I've written a Python script that generates an EPrints XML file from
a DSpace CSV file, which I'll upload to the EPrints wiki once it's done.

To complete it, I need to add the files, and I don't know what syntax
I should use. I have a local folder whose subfolders contain all the
DSpace files, where each subfolder name is the record id.
So my folder structure looks something like this:


where {record_id} is the DSpace id of the specific record. What are the
minimum XML elements that have to be added to my XML file in order for
EPrints to import the files? And what would an example XML entry look
like, based on this example folder structure?

Thanks all in advance!

George Mamalakis

IT and Security Officer,
Electrical and Computer Engineer (Aristotle Univ. of Thessaloniki),
PhD (Aristotle Univ. of Thessaloniki),
MSc (Imperial College of London)

School of Electrical and Computer Engineering
Aristotle University of Thessaloniki

phone number : +30 (2310) 994379

*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
*** EPrints developers Forum: http://forum.eprints.org/