[EP-tech] Query regarding EPrints API



Hi Divije,

I have tested this on my EPrints 3.4.4 test instance and I can 
successfully upload a 4.3MB PDF, download it and view it in a PDF 
viewer. Here is the XML I used for the small file I tested with to start:

<?xml version="1.0" encoding="utf-8"?>
<eprints xmlns="http://eprints.org/ep2/data/2.0">
  <eprint>
    <documents>
      <document>
        <files>
          <file>
            <filename>test.txt</filename>
            <data>VGhpcyBpcyBhIHRleHQgZmlsZS4K</data>
          </file>
        </files>
      </document>
    </documents>
    <type>article</type>
    <title>Test title</title>
    <abstract>Test abstract</abstract>
    <ispublished>pub</ispublished>
    <refereed>TRUE</refereed>
    <date>2018-06-07</date>
    <creators>
      <item>
        <name>
          <family>Newman</family>
          <given>David</given>
        </name>
        <id>drn at ecs.soton.ac.uk</id>
      </item>
    </creators>
    <userid>1</userid>
  </eprint>
</eprints>

All I did for the 4.3MB PDF was change the filename and the content 
inside the data tag (and the title and abstract values, so I could 
differentiate between the two eprints on my repository). I used the 
following curl command to submit the EPrints XML to my repository:

curl -X POST -u USERNAME:PASSWORD \
  --data-binary "@/home/eprints/test_base64_doc_large.xml" \
  -H "Content-Type: application/vnd.eprints.data+xml" \
  https://eprints.example.org/id/contents
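
For anyone emulating this request from a library rather than curl, here is a minimal Python sketch of the same POST using only the standard library. The function name build_eprints_request is hypothetical, and the sketch assumes the repository accepts HTTP Basic authentication, which is what curl -u sends by default. It only builds the request and has not been run against an EPrints instance.

```python
import base64
import urllib.request


def build_eprints_request(url, username, password, xml_bytes):
    """Build (but do not send) a POST mirroring the curl command above.

    Hypothetical helper for illustration only.
    """
    # curl -u sends HTTP Basic credentials; construct the same header by hand.
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url,
        data=xml_bytes,  # bytes are sent unmodified, like curl --data-binary
        method="POST",
        headers={
            "Content-Type": "application/vnd.eprints.data+xml",
            "Authorization": f"Basic {token}",
        },
    )
```

Passing the returned object to urllib.request.urlopen() would send it. The body should be the raw bytes of the XML file (opened in "rb" mode), so that nothing re-encodes or re-wraps the base64 content on the way out.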

Obviously I have hidden private information with USERNAME and PASSWORD 
and used an example hostname. All I did was run the Unix command base64 
to convert the PDF into base64 and write this to a file on disk. I then 
just edited this file and inserted the EPrints XML around it.
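
The same two steps (base64-encode the file, then wrap it in EPrints XML) could be scripted. Below is a minimal Python sketch: the function name wrap_file_as_eprints_xml is hypothetical, and it emits only a trimmed subset of the fields from the example XML above. Note that it leaves the base64 unwrapped on a single line, whereas GNU base64 wraps its output by default. The sketch assumes either form is accepted.

```python
import base64
from xml.sax.saxutils import escape


def wrap_file_as_eprints_xml(filename, content, title, abstract):
    """Return EPrints XML (as a str) embedding `content` (bytes) as base64.

    Hypothetical helper; the metadata fields are a reduced set taken from
    the example XML in this message.
    """
    # Standard base64 alphabet, no line breaks, so no stray whitespace
    # or character entities can end up inside <data>.
    data = base64.b64encode(content).decode("ascii")
    return f"""<?xml version="1.0" encoding="utf-8"?>
<eprints xmlns="http://eprints.org/ep2/data/2.0">
  <eprint>
    <documents>
      <document>
        <files>
          <file>
            <filename>{escape(filename)}</filename>
            <data>{data}</data>
          </file>
        </files>
      </document>
    </documents>
    <type>article</type>
    <title>{escape(title)}</title>
    <abstract>{escape(abstract)}</abstract>
    <userid>1</userid>
  </eprint>
</eprints>
"""
```

Reading the PDF with open(path, "rb").read() and writing the result out as UTF-8 would give a file suitable for the --data-binary argument of the curl command.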

Just as I was about to send this I thought 4.3MB might have been 
borderline for your files that exceed 4MB, so I tested with a 6.1MB file 
and this uploaded, downloaded and then loaded in a PDF viewer without 
issue. Maybe the method you are using to generate the base64-encoded 
file, or the library used to emulate my curl request, is the issue. I am 
not aware of anything that has changed in recent versions of EPrints 
that would make this work in 3.4.4 but not in the version you are 
running. It would still be worth knowing which version of EPrints you 
are running. One other thing I noticed in your example XML is an XML 
character reference for a carriage return (&#13;). I am not sure why 
this would be included in base64 data. However, this is in the small 
file example XML that you said was working, so it is probably just a 
red herring.
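
One way to rule out the base64 generation step is to check the encoded text before it goes into the XML. The following is a minimal Python sketch with a hypothetical function name, clean_base64. It removes whitespace and any literal "&#13;" sequences, then confirms that what remains is strictly valid base64. It is a diagnostic aid only and says nothing about how EPrints itself handles such characters.

```python
import base64
import binascii


def clean_base64(text):
    """Strip whitespace and literal '&#13;' entities, then validate.

    Hypothetical helper. Returns the cleaned base64 string, or raises
    ValueError if the remainder is not valid base64.
    """
    # Remove the carriage-return entity first, then all whitespace
    # (spaces, \r, \n, tabs) that line wrapping may have introduced.
    stripped = "".join(text.replace("&#13;", "").split())
    try:
        # validate=True rejects any character outside the base64 alphabet.
        base64.b64decode(stripped, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"not valid base64: {exc}") from exc
    return stripped
```

If the cleaned string decodes back to exactly the original file bytes, the encoding step is probably not the culprit, and attention can shift to how the request is sent.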

Regards

David Newman

On 09/09/2022 2:32 pm, Divije Narasimhachar via Eprints-tech wrote:
> Hi,
>
> I am a developer from Clarivate Technologies and I work on the 
> product Converis. We have a feature where we export publications into 
> EPrints. The export is done in the form of an XML document.
>
> When we export publications into EPrints we also export the files 
> attached to them. We do this by putting the encoded contents of the 
> file in the "data" XML tag, something like this:
>
> <documents>
>   <document>
>     <files>
>       <file>
>         <filename>filename.txt</filename>
>         <data>MjAyMiwOS0wNCAyMzoyNDo0M4MDEgRVJST1IgW2NvbS5jb252ZXJpy5kYXRhZXhjaGFuZ2U&#13;</data>
>       </file>
>     </files>
>     <format>text/plain</format>
>     <main>filename.txt</main>
>   </document>
> </documents>
>
> We have an issue where the export fails if the size of the attached 
> file exceeds 4MB.
>
> The export works fine if the file size is in kilobytes or if there is 
> no file.
>
> Is there a workaround for this?
>
> Can we export the same file in parts (e.g. 1MB at a time) to the same 
> publication instead of sending one large file (e.g. 10MB) in one go?
>
> Thanks and Regards
>
> Divije Narasimhachar
>
> Senior Software Engineer
>
> Clarivate
> Accelerating innovation