
[EP-tech] Re: Some questions about SWORDv2/CRUD endpoint



Hi Richard,
My sympathies: I spent about a month trying to figure something like this out, and just about got it working before I went on holiday for two weeks... Now I'm back I'm struggling to recall the details.  I was trying to push EPrints XML and attached files into EPrints via SWORD, and kept running up against similar problems.  What I found was that AtomPub only seemed to support minimal metadata (title, creator, summary) and nothing else, e.g. Journal.  I can imagine that in your position as the Router you don't want to be generating native EPrints XML; presumably you want to send generic Atom.

Most of the documentation I found around SWORD tended to be DSpace-centric, using DCTERMS for the extended metadata.  I spent ages trying to adapt the EasyDeposit client, but could never get it to pass the XML to the right interpreter.  In the end I started from scratch with PHP-CURL and solved it quite quickly.

Not sure if this will help with your problem, Richard, but it might help somebody.  Here's a basic test rig that works for me (unless I've broken it while tidying it up for public consumption)...
============================================================================
<?php
// Deposit an EPrints XML file into EPrints through the /id/contents endpoint using cURL.

// Create a new cURL resource.
$ch = curl_init();

$username = '<EPRINTSUSERID>';
$password = '*********';
curl_setopt($ch, CURLOPT_USERPWD, $username . ":" . $password);

// Service end-points
// WRONG: this entry point, which you might assume is what you want, only gives you
// minimal AtomPub metadata in the resulting eprint object (may relate to SWORD 1.3, not sure):
// curl_setopt($ch, CURLOPT_URL, "http://researchonline.lshtm.ac.uk/sword-app/deposit/buffer");

// RIGHT: this entry point, plus the headers below, passes the package to the
// EPrints XML interpreter and gives the full monty.
curl_setopt($ch, CURLOPT_URL, "http://researchonline.lshtm.ac.uk/id/contents");

// Include the response headers in the returned string.
curl_setopt($ch, CURLOPT_HEADER, 1);

$pkgheader = array(
    'X-Packaging: http://eprints.org/ep2/data/2.0',
    'Content-Type: text/xml',
    'Metadata-Relevant: true',
    'X-Verbose: true',
    'In-Progress: true',
);
curl_setopt($ch, CURLOPT_HTTPHEADER, $pkgheader);

$DOCROOT = $_SERVER['DOCUMENT_ROOT'];
$file_in = "$DOCROOT/publications/eprint5.xml"; // test file is EPrints XML, including base64 document files
$data = file_get_contents($file_in);
if ($data === false) {
    die("Could not read $file_in");
}

curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$result = curl_exec($ch);
if ($result === false) {
    die("curl_exec failed: " . curl_error($ch));
}

// echo $result;
// Look for the new eprint's URI (/id/eprint/<number>) in the response.
$matches = array();
if (preg_match('/\/id\/eprint\/([0-9]+)/', $result, $matches)) {
    $ID = $matches[1];
    print "\n\nID=$ID\n";
} else {
    print "\n\nNo ID found in eprints response: Deposit failed";
}

// Close cURL resource, and free up system resources.
curl_close($ch);
?>

===================================================================
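
If you want to reuse the ID lookup from the test rig elsewhere, it can be pulled out into a small function. This is only a sketch: the function name extract_eprint_id is made up for illustration, and it assumes the response (headers included) contains the new eprint's URI in the form /id/eprint/<number>, as it does in the rig above.

```php
<?php
// Hypothetical helper, not part of EPrints or of the rig above: pull the numeric
// eprint ID out of the raw HTTP response text (headers and body).
function extract_eprint_id(string $response): ?int
{
    // Assumption: a successful deposit mentions the eprint URI as /id/eprint/<digits>.
    if (preg_match('#/id/eprint/([0-9]+)#', $response, $matches) === 1) {
        return (int) $matches[1];
    }
    return null; // no eprint URI found in the response
}
```

You would call it as extract_eprint_id($result) straight after curl_exec() and treat a null return as a failed deposit. It may also be worth checking the status with curl_getinfo($ch, CURLINFO_HTTP_CODE), since SWORDv2 specifies 201 Created for a successful create.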


I can't attach the test file as it is not really public domain, but I can send it to you privately if you want it.

As I say, this might not be quite the problem you're trying to solve, but it was the only way I could get it working for a similar situation.  It suited me because I already had the code to generate the EPXML.  
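
For anyone who doesn't already have code to generate the EPXML, here is a rough sketch of how such a file with one embedded base64 document could be built in PHP. Treat every element name here (eprint, title, publication, documents, document, files, file, filename, data) as an assumption based on what an EPrints XML export typically looks like, and check it against an export from your own repository. The function name build_epxml is made up for illustration.

```php
<?php
// Namespace taken from the X-Packaging header used in the test rig.
const EP_NS = 'http://eprints.org/ep2/data/2.0';

// Hypothetical helper: build a minimal EPrints XML document with one attached file.
// The element names are assumptions; compare with a real export before relying on them.
function build_epxml(string $title, string $journal, string $filename, string $bytes): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->formatOutput = true;

    // Small closure to append a namespaced child element, optionally with text.
    $add = function (DOMNode $parent, string $name, ?string $text = null) use ($doc) {
        $el = $doc->createElementNS(EP_NS, $name);
        if ($text !== null) {
            $el->appendChild($doc->createTextNode($text));
        }
        return $parent->appendChild($el);
    };

    $eprints = $add($doc, 'eprints');
    $eprint  = $add($eprints, 'eprint');
    $add($eprint, 'title', $title);
    $add($eprint, 'publication', $journal); // assumed field name for the journal title

    // documents > document > files > file > (filename, data)
    $file = $add($add($add($add($eprint, 'documents'), 'document'), 'files'), 'file');
    $add($file, 'filename', $filename);
    $data = $add($file, 'data', base64_encode($bytes));
    $data->setAttribute('encoding', 'base64');

    return $doc->saveXML();
}
```

The returned string could then be passed as $data to CURLOPT_POSTFIELDS in place of the file_get_contents() result.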

Welcome any insight from the Community.



Andy Reid
Research Information Manager
Room G43, Executive Office
London School of Hygiene & Tropical Medicine
Keppel St
LONDON WC1E 7HT
+44 020-7927-2618
http://orcid.org/0000-0002-2500-2980


>>> Richard Jones <richard at cottagelabs.com> 29 August 2015 12:09 >>>
Just a little more information ...


I've found an Atom.xsl stylesheet in EPrints/Plugins/Import/XSLT which seems to imply that there /is/ a default atom import plugin. My sword deposit contains atom metadata, so this stylesheet should work nicely to create a new eprint with the relevant metadata. The only thing is, it doesn't appear to get called during deposit. In Eprints/Apache/CRUD.pm, there's this code (Line 762 in sub import_plugins):

my @plugins = $self->repository->get_plugins(
    type => "Import",
    can_produce => $self->accept_type,
    %params,
);

In this code can_produce actually gets dataobj/eprint, which I think is correct, but I can't work out what %params contains. It ought to contain application/atom+xml; type=entry. I'm finding it difficult to trace this through the code to find out where it is set and what it should contain.

Cheers,

Richard




and it's being called with an accept_type of "application/atom+xml; type=entry". My guess is that the Atom.xsl stylesheet is not being understood as the plugin which can handle this format.

I read through this documentation:

http://wiki.eprints.org/w/Import_and_Export_Plug-ins

And it appears to suggest that the XSLT plugins need a wrapper around them for the specific mimetype that they need to work with - is this right? Or is there some way to get the XSLT import plugin to run as is?

If there's some other documentation that I've overlooked so far, any links much appreciated!

Cheers,

Richard


On 28 August 2015 at 09:50, Richard Jones <richard at cottagelabs.com> wrote:


Actually, I see an AtomMultipart.pm in the EPrints/Plugin directory, and it in turn tries to load a plugin which supports application/atom+xml, but I can't find a plugin that supports that particular format. I think this is probably the same issue, then, as my (1) below - there doesn't appear to be a default atom plugin - is that right?

Cheers,

Richard


On 28 August 2015 at 09:32, Richard Jones <richard at cottagelabs.com> wrote:


Hi Folks,

We're currently working on a new version of the Jisc Publications Router, and are looking into how best to put the content into EPrints via SWORDv2. I'm encountering a few oddities, though, and I wonder if someone can clarify how that endpoint works for me?

1/ Posting an atom entry document to /id/content should result in the creation of a new eprint populated with the metadata from that document, but instead it creates an eprint with no metadata and the atom xml attached as a file called "main.bin". Perhaps I'm missing some crosswalk configuration?

http://swordapp.github.io/SWORDv2-Profile/SWORDProfile.html#protocoloperations_creatingresource_entry

2/ My service document says:

<acceptPackaging>http://purl.org/net/sword/package/Binary</acceptPackaging>

But when I PUT a file of this format to the media resource I get a "package format not supported" error.

Also, it does not say that it supports SimpleZip, but if I PUT a SimpleZip it works fine. Is there a place that I can customise the accept/acceptPackaging entries in the service document?

3/ Completing deposit doesn't appear to work. When I POST to the eprint's Edit-IRI, I get a 405 Method Not Allowed, but it should update the eprint.

http://swordapp.github.io/SWORDv2-Profile/SWORDProfile.html#continueddeposit_complete


Looking at the code, with my limited Perl skills, it looks like the full SWORDv2 protocol is not supported - is there some documentation that will tell me what features are supported? For example, does multipart deposit work? http://swordapp.github.io/SWORDv2-Profile/SWORDProfile.html#protocoloperations_creatingresource_multipart

Any tips/pointers much appreciated.

Cheers,

Richard

-- 

Richard Jones, 

Founder, Cottage Labs 
t: @richard_d_jones, @cottagelabs
w: http://cottagelabs.com


*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
*** EPrints developers Forum: http://forum.eprints.org/



