EPrints Technical Mailing List Archive
See the EPrints wiki for instructions on how to join this mailing list and related information.
Message: #05895
Re: [EP-tech] Atom.xsl Patch Submission
- To: eprints-tech@ecs.soton.ac.uk
- Subject: Re: [EP-tech] Atom.xsl Patch Submission
- From: Sebastian Faubel <sebastian@semiodesk.com>
- Date: Thu, 1 Sep 2016 09:58:09 +0200
Semiodesk GmbH | Werner-von-Siemens-Str. 6 Geb. 15k, 86159 Augsburg, Germany | Phone: +49 821 8854401 | Fax: +49 821 8854410 | www.semiodesk.com
This e-mail message may contain confidential or legally privileged information and is intended only for the use of the intended recipient(s). Any unauthorized disclosure, dissemination, distribution, copying or the taking of any action in reliance on the information herein is prohibited. E-mails are not secure and cannot be guaranteed to be error free as they can be intercepted, amended, or contain viruses. Anyone who communicates with us by e-mail is deemed to have accepted these risks. Semiodesk GmbH is not responsible for errors or omissions in this message and denies any responsibility for any damage arising from the use of e-mail. Any opinion and other statement contained in this message and any attachment are solely those of the author and do not necessarily represent those of the company.
Dear John,
thank you for your quick response. I also think that the standard URI for the eprint_status should be used instead of the solution I proposed. However, I am new to EPrints and do not know what the standard URI actually is.
Concerning the subjects: I understand that one could import terms that are not defined in the local vocabulary. However, this is a general problem with using plain literals as identifiers* and is not specific to the Atom XML import. The problem also exists when importing EPrints XML datasets. Am I wrong here? If not, then I would suggest adding support for setting the item type and subjects as I proposed, because it does not break anything. It simply generates the equivalent of an EPrints XML dataset.
From a user perspective, everybody is happy if the terms are aligned upon submission. If not, a reviewer has the chance of detecting the problem. However, if these terms are left out entirely, reviewers have no chance of finding out what was originally meant, which in turn may complicate the reviewing process.
Moreover, installing a plugin to enable this feature does not solve the actual problem. Therefore, I think that if this feature is used correctly, it is a chance to make deposits to EPrints repositories more convenient for end-users and reviewers.
All the best,
Sebastian
* Aside from the problem that the same term may refer to a different concept.
2016-08-31 16:50 GMT+02:00 John Salter <J.Salter@leeds.ac.uk>:
Hi Sebastian,
Thanks for submitting this patch.
The ‘yomiko’ part is a good catch.
When I export something as Atom, I get these category elements:
<category term="article" label="Article" scheme="http://eprints.whiterose.ac.uk/data/eprint/type"/>
<category term="archive" label="Live Archive" scheme="http://eprints.org/ep2/data/2.0/eprint/eprint_status"/>
These are generated here: https://github.com/eprints/eprints/blob/3.3/perl_lib/EPrints/Plugin/Export/Atom.pm#L258-L272
The ‘eprint_status’ one uses the eprints.org namespace – which I think is what should possibly be used instead of ‘yomiko’ [EPrints Services: how does 3.4 (without a default ‘flavour’) handle this?].
The ‘type’ one uses the repository namespace – I think because these can be configured at the repository level.
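To make the difference between the two schemes concrete, here is a minimal Python sketch that reads the two category elements quoted above and keys each term by the last path segment of its scheme. The wrapping `entry` element, the Atom namespace on it, and the helper name `categories_by_field` are assumptions added for illustration; this is not how EPrints itself processes the feed.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Sample entry built from the two category elements quoted above.
# The surrounding <entry> element is an assumption for illustration.
SAMPLE = f"""<entry xmlns="{ATOM_NS}">
  <category term="article" label="Article"
            scheme="http://eprints.whiterose.ac.uk/data/eprint/type"/>
  <category term="archive" label="Live Archive"
            scheme="http://eprints.org/ep2/data/2.0/eprint/eprint_status"/>
</entry>"""


def categories_by_field(entry_xml):
    """Map the last path segment of each category scheme to its term."""
    root = ET.fromstring(entry_xml)
    result = {}
    for cat in root.findall(f"{{{ATOM_NS}}}category"):
        scheme = cat.get("scheme", "").strip().rstrip("/")
        field = scheme.rsplit("/", 1)[-1]  # "type" or "eprint_status" here
        result[field] = cat.get("term", "")
    return result


fields = categories_by_field(SAMPLE)
print(fields)
```

Keying on the final path segment ignores the host part of the scheme, so the repository-level ‘type’ scheme and the eprints.org ‘eprint_status’ scheme are handled the same way.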
I have created this pull request: https://github.com/eprints/eprints/pull/419 for this.
For the ‘subjects’ part, in EPrints, the subjects field is normally a controlled-value field, based on the ‘subjects’ dataset.
If values added to the subject field don’t exist in the subjects dataset, EPrints doesn’t break – but they will render like this:
?? value ??
– which isn’t normally what is wanted.
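The effect of an unknown subject value can be sketched in Python as follows. This is only an illustration under stated assumptions: the subject ids, the labels, and the helpers `split_subjects` and `render` are hypothetical and are not part of EPrints.

```python
def split_subjects(incoming, known_ids):
    """Return (recognised, unrecognised) subject terms, preserving order."""
    recognised = [t for t in incoming if t in known_ids]
    unrecognised = [t for t in incoming if t not in known_ids]
    return recognised, unrecognised


def render(term, labels):
    """Imitate the fallback rendering described above for unknown values."""
    return labels.get(term, f"?? {term} ??")


# Hypothetical subject ids and labels, not taken from any real repository.
labels = {
    "QA75": "Electronic computers. Computer science",
    "Z665": "Library Science",
}

ok, unknown = split_subjects(["QA75", "painting"], labels.keys())
for term in ok + unknown:
    print(render(term, labels))
```

A check like this could run before or after import, so that unmatched terms are flagged for a reviewer instead of being dropped silently.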
By default (in the perl_lib Atom.xsl file), I think it’s safer to *not* map the dcterms:subject into the subjects field (I haven’t done this in the pull request above).
To achieve the improved import of data for Artivity, I would either
(i) make a new XSL import mapping (see warning below!):
~/archives/ARCHIVEID/cfg/plugins/EPrints/Plugin/Import/XSLT/ArtivityAtom.xsl (change the attribute to ept:name="Artivity Atom XML")
Or (ii) override the default Atom plugin:
~/archives/ARCHIVEID/cfg/plugins/EPrints/Plugin/Import/XSLT/Atom.xsl
!! WARNING !!
I’m not sure how the code here: https://github.com/eprints/eprints/blob/3.3/perl_lib/EPrints/Plugin/Import/AtomMultipart.pm#L96-L113 will behave when there are multiple plugins defined that can handle application/atom+xml imports. If you have two plugins: Atom.xsl and ArtivityAtom.xsl, things might not work. I haven’t tested this (please let us know if you go down this route and it works!).
Also, I’ve never over-ridden an XSL plugin. To override perl EPrints plugins this is the way to do it:
https://wiki.eprints.org/w/Instructions_for_local_plugins
I’m not sure if you’d need to, or how you would define the ‘plugin_alias_map’ aspect…
Cheers,
John
From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Sebastian Faubel
Sent: 31 August 2016 11:43
To: eprints-tech@ecs.soton.ac.uk
Subject: [EP-tech] Atom.xsl Patch Submission
Hello everyone,
I have found the reason why the category id and subjects are not recognized when depositing files in EPrints using the Atom Publishing Protocol. The XSLT stylesheet 'Atom.xsl' [0] in the Import directory does not handle those elements when converting Atom to EPrints XML.
Please find attached a version of the file which handles the dcterms:type and dcterms:subject terms and translates them into EPrints XML. The dcterms vocabulary seems to be widely used in SWORD protocol implementations (e.g. [1]).
Additionally, I corrected a line in the stylesheet which transforms a submitted eprints status. The line checked for the status being equal to 'http://yomiko.ecs.soton.ac.uk:8080/data/eprint/status/'. It seems to me that this is a concrete EPrints instance, so the line would not work for any other EPrints instance. I changed the line to: 'contains(@scheme,'/eprint/status')'. This should work for all EPrints instances, including my test server.
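The difference between the two checks can be sketched in Python with plain string tests. This is not the stylesheet itself: `matches_exact` and `matches_contains` are hypothetical helpers that only mirror the equality test and the XPath contains() test described above, and repo.example.org is a made-up host.

```python
# The host-specific URL that the original line compared against.
HARDCODED = "http://yomiko.ecs.soton.ac.uk:8080/data/eprint/status/"


def matches_exact(scheme):
    """Original behaviour as described: equality with one host's URL."""
    return scheme == HARDCODED


def matches_contains(scheme):
    """Patched behaviour: mirrors XPath contains(@scheme, '/eprint/status')."""
    return "/eprint/status" in scheme


schemes = [
    HARDCODED,
    "http://repo.example.org/data/eprint/status/",  # hypothetical host
    # Scheme shown in the Atom export quoted in the reply on this page.
    "http://eprints.org/ep2/data/2.0/eprint/eprint_status",
]

for s in schemes:
    print(matches_exact(s), matches_contains(s), s)
```

Under these assumptions the substring test accepts any host, but as plain string matching it does not match a scheme ending in /eprint/eprint_status, so it may be worth checking which form a given client actually sends.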
Please let me know if you will be including my patch into the repository.
Thank you,
Sebastian
[0] perl_lib/EPrints/Plugin/Import/XSLT/Atom.xsl
[1] http://guides.dataverse.org/en/latest/api/sword.html
*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/ 
*** EPrints community wiki: http://wiki.eprints.org/
*** EPrints developers Forum: http://forum.eprints.org/
