EPrints Technical Mailing List Archive

Message: #04775



[EP-tech] Re: enable-web-imports


I thought --enable-web-imports just worked, and I've seen it work before, but that was around 6 years ago.  It's always been more trouble than it's worth to me. My general approach to big jobs like this is to divide the work into steps and make each step as simple as possible.  Even if that approach seems like more work, you get far more recoverability and debuggability in the process.
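
As a minimal sketch of the "adjust the XML" step, assuming the import XML uses the <files>/<file>/<url> layout quoted further down: the Python below rewrites each remote <url> to a file:// URL under a predictable local directory and returns the list of downloads to perform as a separate step (with wget or anything else). The function name rewrite_file_urls and the directory layout are illustrative assumptions, not part of EPrints, and local_root must be an absolute path.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from urllib.parse import urlparse


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def rewrite_file_urls(xml_text, local_root):
    """Point every <file>/<url> at a local copy.

    Returns (new_xml, jobs), where jobs is a list of
    (remote_url, local_path) pairs for a separate download step.
    Note: namespaced input is serialised with generated prefixes,
    which is equivalent XML but not byte-identical.
    """
    root = ET.fromstring(xml_text)
    jobs = []
    for file_el in root.iter():
        if local_name(file_el.tag) != "file":
            continue
        for child in file_el:
            if local_name(child.tag) != "url" or not child.text:
                continue
            remote = child.text.strip()
            parsed = urlparse(remote)
            if parsed.scheme not in ("http", "https"):
                continue  # already local, or something unexpected
            # Predictable layout: <local_root>/<host>/<path of the URL>
            local = Path(local_root) / parsed.netloc / parsed.path.lstrip("/")
            jobs.append((remote, str(local)))
            child.text = local.as_uri()  # requires an absolute path
    return ET.tostring(root, encoding="unicode"), jobs
```

Keeping the download list separate from the rewrite means a failed download can be retried without touching the XML again.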

I don't think you should be quoting values inside an XML element, but you do need to make sure the data is encoded correctly.  The only characters that must be escaped in XML element text, as far as I know, are '<' and '&', so I don't know why this would be failing.
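
A minimal sketch of what "encoded correctly" could look like when generating the XML from a script: values go in unquoted, and only the XML-special characters are escaped. The helper name file_entry is hypothetical; xml.sax.saxutils.escape is from the Python standard library and escapes '&', '<' and '>'.

```python
from xml.sax.saxutils import escape


def file_entry(filename, url, mime_type="application/pdf"):
    """Build one <file> element with escaped, unquoted values."""
    return (
        "<file>\n"
        f"<filename>{escape(filename)}</filename>\n"
        f"<mime_type>{escape(mime_type)}</mime_type>\n"
        f"<url>{escape(url)}</url>\n"
        "</file>"
    )


# Whitespace in a filename needs no quoting; '&' must become '&amp;'.
print(file_entry("CSA 02 & efl.pdf",
                 "https://www.alexandria.unisg.ch/export/DL/21352.pdf"))
```

Whitespace is legal in element text, so a filename with spaces passes through unchanged. Spaces inside the URL itself are a URL-encoding question rather than an XML one.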

--
Adam Field
Business Relationship Manager and Community Lead
EPrints Services
+44 (0)23 8059 8814





On 24 Sep 2015, at 16:08, cmdt-news@cmdt.ch wrote:

> hi adam
> 
> 
> On 24.09.2015 16:23, Field A.N. wrote:
>> With 40k of publications, I would advise downloading all of them with
>> wget and putting them into a predictable directory structure, then
>> adjusting the XML to point at local files.
> 
> thanks for your advice. but:
> 
> 1. the servers are just standing side-by-side (=very fast connection)
> and therefore the easiest way to get the file on the right place in the
> new repo.
> 
> 2. is a general question about the option --enable-web-imports:
> have you got any idea why the API import isn't working although the wget
> gets the file?
> what does it mean "file imports disabled"?
> is there somewhere else something to enable?
> maybe...
> 
> 3. quote or not quoted?
> in xml i added single quotes (because of possible whitespace in filename
> and url) like
> <files>
> <file>
> <filename>'CSA_02efl.pdf'</filename>
> <mime_type>application/pdf</mime_type>
> <url>'https://www.alexandria.unisg.ch/export/DL/21352.pdf'</url> >>>
> still 403 when CSA_02efl.pdf
> </file>
> </files>
> 
> which produces a 403 error
> 
> removing the quotes produces a nasty error:
> Use of uninitialized value within %lookup in array element at
> /usr/share/eprints3/bin/../perl_lib/EPrints/Database.pm line 2647,
> <DATA> line 960.
> Use of uninitialized value $format in split at
> /usr/share/eprints3/bin/../perl_lib/EPrints/DataObj/Document.pm line
> 1684, <DATA> line 960.
> Use of uninitialized value $major in concatenation (.) or string at
> /usr/share/eprints3/bin/../perl_lib/EPrints/DataObj/Document.pm line
> 1689, <DATA> line 960...........
> 
> ????
> 
> 
> 4. local import without quotes works
> 
> with xml
> <files>
> <file>
> <filename>CSA_02efl.pdf</filename>
> <mime_type>application/pdf</mime_type>
> <url>https://www.alexandria.unisg.ch/export/DL/21352.pdf</url> >>> still
> 403 when CSA_02efl.pdf
> </file>
> </files>
> 
> and
> perl ~/bin/import alex --verbose --enable-import-fields --force --update
> --enable-file-imports archive XML /home/cmueller/import/epxmleprint_1.xml
> 
> 
> 
> 
> 
> thanks for any suggestions
> stof
> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
> *** Archive: http://www.eprints.org/tech.php/
> *** EPrints community wiki: http://wiki.eprints.org/
> *** EPrints developers Forum: http://forum.eprints.org/