
[EP-tech] non working DataciteDoi plugin (Recollect installed)



The "fixed" branch works OOTB, while the Bazaar plugin does not. My point was 
to make it work OOTB, or to add a disclaimer or guide on how to make it 
work on a site.

Maybe the problem was not the order but that z_datacite_mapping.pl has this:

$c->{datacite_mapping_data_type} = sub {

    my($xml, $dataobj, $repo, $value) = @_;

    return $xml->create_data_element("resourceType", $value, resourceTypeGeneral=>$value);
};

and also this:

$c->{datacite_mapping_type} = sub {

    my($xml, $dataobj, $repo, $value) = @_;

    my $pub_resourceType = $repo->get_conf("datacitedoi", "typemap", $value);
    if (defined $pub_resourceType) {
        return $xml->create_data_element("resourceType", $pub_resourceType->{'v'}, resourceTypeGeneral=>$pub_resourceType->{'a'});
    }

    return undef;
};

so both the "type" and "data_type" fields (which always exist in EPrints) 
produce a resourceType element; the XML ends up with two of them and does 
not validate. Also, contributors is mandatory, but z_datacite_mapping.pl has no 
$c->{datacite_mapping_contributors} entry.

I think it should not be difficult to fix (remove one of the type 
mappings and implement the mandatory fields) so that it works OOTB and 
people can just install the plugin from Bazaar and use it.
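
To illustrate the "only one type mapping" idea, here is a minimal sketch in plain Perl. It does not use the EPrints API: the record is a plain hash, and the typemap entries and the resource_type_for helper are made up for illustration. It only shows the logic of letting data_type win and using type as a fallback, so that a single resourceType is produced.

```perl
use strict;
use warnings;

# Hypothetical typemap, in the same shape as the "typemap" config read
# above ('v' = element text, 'a' = resourceTypeGeneral attribute).
my %typemap = (
    article => { v => 'Article', a => 'Text' },
    dataset => { v => 'Dataset', a => 'Dataset' },
);

# Return exactly one [text, resourceTypeGeneral] pair, or undef.
# data_type wins when it is set; type is only used as a fallback.
sub resource_type_for {
    my ($record) = @_;

    if (defined $record->{data_type} && length $record->{data_type}) {
        return [ $record->{data_type}, $record->{data_type} ];
    }

    my $mapped = defined $record->{type} ? $typemap{ $record->{type} } : undef;
    return defined $mapped ? [ $mapped->{v}, $mapped->{a} ] : undef;
}

# A record with both fields set still yields a single resourceType.
my $rt = resource_type_for({ type => 'dataset', data_type => 'Dataset' });
printf qq{<resourceType resourceTypeGeneral="%s">%s</resourceType>\n},
    $rt->[1], $rt->[0];
```

In the real plugin the same decision would have to live in the mapping subs, since each field is mapped on its own.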

Thanks for your explanations, they are very useful to me for implementing 
it on my site. Some further ideas below:

On 04/04/2018 13:25, martin.braendle at id.uzh.ch wrote:
>
> Hi Yuri,
>
> the two plugins work quite differently.
>
> The FIXED eprints branch supports only DataCite Metadata Schema 2.2, 
> see 
> https://github.com/eprints/datacite/blob/fixed/cfg/cfg.d/z_datacitedoi.pl
> It operates on a fixed set of fields.
>
> The EprintsUG plugin supports DataCite Metadata Schema 4.0 . It 
> supports any set of fields and must be adapted to a specific repo. The 
> fields in a repo are mapped to methods that must be specified in 
> lib/cfg.d/z_datacite_mapping.pl 
> (https://github.com/eprintsug/DataCiteDoi/blob/master/lib/cfg.d/z_datacite_mapping.pl)
>
> This algorithm has its pros and cons:
> - pro: very versatile and modular, the Export plugin itself must not 
> be modified. Instead a config file can be adapted.
> - con: The loop over all fields is inefficient 
> (https://github.com/eprintsug/DataCiteDoi/blob/64956b6d4b461159ac2ae35df14105cff4a86171/lib/plugins/EPrints/Plugin/Export/DataCiteXML.pm#L62-L68). 
> The DataCite Metadata Schema has 19 elements. A large repo may 
> have 100 fields or more, so 80% of the loop iterations are just overhead. Imagine 
> spending 80% overhead on exporting a repo of 100'000 records.
>

I agree, but instead of looping over all the fields, just put the list of DataCite 
metadata fields in a config file too. I find it wrong to start from the 
EPrints fields; it is better to have a fixed list of DataCite fields and loop 
over them, reading the values from the EPrints fields. Also, in EPrints, exports 
can be generated as text on saving and cached in the filesystem, I think.
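
A minimal sketch of what I mean, in plain Perl with no EPrints API involved. The record is a plain hash, the element list is deliberately incomplete, and the %mapping table and export_elements helper are hypothetical names used only for illustration.

```perl
use strict;
use warnings;

# DataCite elements in the order wanted in the output. Only a few are
# listed; the full list would be taken from the schema documentation.
my @datacite_elements =
    qw(identifier creators titles publisher publicationYear resourceType);

# Hypothetical mapping: DataCite element => code ref that reads the record.
my %mapping = (
    titles          => sub { $_[0]{title} },
    publisher       => sub { $_[0]{publisher} },
    publicationYear => sub { defined $_[0]{date} ? substr($_[0]{date}, 0, 4) : undef },
    resourceType    => sub { $_[0]{data_type} },
);

# Walk the DataCite list (not the record's fields), so the cost depends on
# the number of DataCite elements and the output order is fixed.
sub export_elements {
    my ($record) = @_;
    my @out;
    for my $element (@datacite_elements) {
        my $mapper = $mapping{$element} or next;    # no mapping configured
        my $value  = $mapper->($record);
        next unless defined $value && length $value;
        push @out, [ $element, $value ];
    }
    return @out;
}

my %record = (
    data_type => 'Dataset',
    date      => '2018-04-04',
    title     => 'Example dataset',
    abstract  => 'Never looked at, because no DataCite element maps to it',
);

print "$_->[0]: $_->[1]\n" for export_elements(\%record);
```

Fields that no DataCite element refers to are never visited, and the order of the output follows @datacite_elements whatever order the repository stores its fields in.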

> - con: dependencies between fields are not considered; workarounds in 
> the mapping methods must be implemented. This gets especially 
> problematic if either of two or more fields map to the same DataCite 
> element, but the field values have to be inserted in nested sub-elements.
>

The same applies here: start from the DataCite list of metadata fields.

> - as you pointed out, there is no order of fields. While the 
> description of the DataCite Metadata Schema 
> (https://schema.datacite.org/meta/kernel-4.1/doc/DataCite-MetadataKernel_v4.1.pdf) 
> lists them in a specific order, the schema itself 
> (https://schema.datacite.org/meta/kernel-4.1/metadata.xsd) requires 
> no specific order. However, for readability of the output, I would 
> have preferred the order as outlined in the PDF.
>

The same here: starting from the DataCite list of metadata fields makes 
it possible to decide the exact order.

> - there are no checks against mandatory fields
>

Also, there is no check on whether coining the DOI succeeded. I think I'll send an 
email report to the user and the repository manager when a DOI is 
created or if there is an error (DataCite not answering, wrong metadata, 
etc.).
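
Roughly, the two checks could look like the sketch below. It is plain Perl and does not call DataCite or send mail. The list of mandatory elements is my assumption and should be checked against the schema version in use, and missing_mandatory and registration_report are hypothetical helper names.

```perl
use strict;
use warnings;

# Elements treated as mandatory here; an assumption to be checked against
# the DataCite schema version actually targeted.
my @mandatory =
    qw(identifier creators titles publisher publicationYear resourceType);

# Return the mandatory elements missing from a hash of element => value.
sub missing_mandatory {
    my ($elements) = @_;
    return grep { !defined $elements->{$_} || !length $elements->{$_} } @mandatory;
}

# Build a one-line report from the HTTP status and body of the response.
# Any 2xx status counts as success; everything else is reported as an error.
sub registration_report {
    my ($doi, $status, $body) = @_;
    return "DOI $doi registered" if $status =~ /^2\d\d$/;
    $body = '' unless defined $body;
    return "DOI $doi NOT registered (HTTP $status): $body";
}

# Example values only; the DOI is a placeholder.
my %elements = (identifier => '10.5072/example', titles => 'Example dataset');
my @missing  = missing_mandatory(\%elements);
print "Missing before sending: @missing\n" if @missing;
print registration_report('10.5072/example', 400, 'invalid metadata'), "\n";
```

The returned strings would then go into the body of the email sent to the user and the repository manager.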

>
> I have just recently implemented the EprintsUG DataCite XML export 
> plugin to our repository and also adapted to DataCite Metadata Schema 
> 4.1. It should cover all DataCite elements. We need it to produce SIPs 
> for a long-term archive project (DLCM) as well as parts of it for 
> exporting Funding data to OpenAire (Currently, funding information is 
> not displayed on the production system, but that will come soon).
> Example output: 
> http://www.zora.uzh.ch/cgi/export/eprint/150598/DataCiteXML/zora-eprint-150598.xml
>
> Not sure if my code will help - some parts are highly 
> repository-implementation specific (e.g. document types, language 
> codes (ISO639-3) etc., funder data model which is already DataCite 
> compatible and further).
>
> Best regards,
>
> Martin
>
> --
> Dr. Martin Brändle
> Zentrale Informatik
> Universität Zürich
> Stampfenbachstr. 73
> CH-8006 Zürich
>
> mail: martin.braendle at id.uzh.ch
> phone: +41 44 63 56705
> fax: +41 44 63 54505
> http://www.zi.uzh.ch
>
>
> From: Yuri <yurj at alfa.it>
> To: <eprints-tech at ecs.soton.ac.uk>
> Date: 03.04.2018 11:06
> Subject: Re: [EP-tech] non working DataciteDoi plugin (Recollect 
> installed)
> Sent by: eprints-tech-bounces at ecs.soton.ac.uk
>
> ------------------------------------------------------------------------
>
>
>
> Hi!
>
> This version should work:
>
> https://github.com/eprints/datacite/blob/fixed/lib/plugins/EPrints/Plugin/Export/DataCiteXML.pm
> (note the FIXED name of the branch ;-) )
>
> while this:
>
> https://github.com/eprintsug/DataCiteDoi/blob/master/lib/plugins/EPrints/Plugin/Export/DataCiteXML.pm
>
> leads to malformed DataCite XML (but it aims to support all the fields; maybe
> they use exactly the DataCite metadata?)
>
> Please, can someone fix Bazaar so that it includes the working DOI plugin? I think
> the eprintsug plugin works for a particular site, while the "fixed" branch of
> eprints/datacite should work on almost every EPrints 3.3.15 site.
>
> Do you agree?
>
>
> On 30/03/2018 11:41, Yuri wrote:
> > Hi!
> >
> > I'm using the Recollect plugin together with DataCiteDoi, both from
> > Bazaar, on EPrints 3.3.15.
> >
> > I'm wondering how this plugin can work. It completely lacks, for
> > example, contributors, which are *mandatory* for DataCite MDS v4. When
> > calling the MDS API, I get:
> >
> > The content of element 'resource' is not complete. One of
> > '{"http://datacite.org/schema/kernel-4":publisher,
> > "http://datacite.org/schema/kernel-4":contributors,
> > "http://datacite.org/schema/kernel-4":dates,
> > "http://datacite.org/schema/kernel-4":language,
> > "http://datacite.org/schema/kernel-4":alternateIdentifiers,
> > "http://datacite.org/schema/kernel-4":relatedIdentifiers,
> > "http://datacite.org/schema/kernel-4":sizes,
> > "http://datacite.org/schema/kernel-4":formats,
> > "http://datacite.org/schema/kernel-4":version,
> > "http://datacite.org/schema/kernel-4":fundingReferences}' is expected.
> >
> > I also had to comment out the "type" mapping in
> > lib/cfg.d/z_datacite_mapping.pl because it got inserted at the top (*).
> > Luckily, Recollect has a "data_type" field, which gets correctly mapped to
> > resourceType:
> >
> > <resourceType resourceTypeGeneral="Dataset">Dataset</resourceType>
> >
> >
> > So, I need some help if you've been able to make it work, I'm a little
> > confused.
> >
> >
> > (*) Another issue is the order of the fields, which is given by:
> >
> > foreach my $field ( $dataobj->{dataset}->get_fields) in
> > lib/plugins/EPrints/Plugin/Export/DataCiteXML.pm
> >
> > Since this is a standard with known fields, shouldn't it be, to avoid
> > errors, a fixed and ordered list of fields?
> >
> >
> > (**) Also, isn't the test server test.mds.datacite.org rather than
> > test.datacite.org/mds? If so, can z_datacitedoi.pl be updated with
> > the correct URL?
> >
> >
> > *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
> > *** Archive: http://www.eprints.org/tech.php/
> > *** EPrints community wiki: http://wiki.eprints.org/
> > *** EPrints developers Forum: http://forum.eprints.org/