EPrints Technical Mailing List Archive

Message: #07770

[EP-tech] ORCIDs - Crosswalks - EPrints

Morning All,

There's been a small problem on the repository since I started, and I decided to investigate it yesterday. It involves the duplication of ORCIDs and their associated email addresses. Sometimes they're duplicated both in the database and in the metadata of an eprint. This stops the ORCID hyperlink from working and stops the small ORCID logo from appearing in the EPrint citation.

To illustrate:

I expect in the database - 0000-0002-5069-1909

I get in the database - 0000-0002-5069-19090000-0002-5069-19090000-0002-5069-1909

I've identified the problem as Elements-Crosswalks related. The crosswalks take every instance of an ORCID and user email address from each source and then squash them together in their respective fields.

For example:

feed -> pubs:users -> pubs:user -> pubs:identifiers -> pubs:identifier scheme="orcid"

and then it also grabs the ORCID from the Elements sources where it exists (in this case from epmc).

I understand why it has to do it this way: some sources have all the authors' ORCIDs and some don't, so they all need to be picked up. I think my solution is to alter the crosswalks' behaviour.
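To make the intended behaviour concrete, here is a rough sketch in Python. This is illustration only and not crosswalk code; the function name and the sample values are made up. The idea is still to collect the ORCID from every source, but to keep only one copy of each distinct value instead of joining them all together.

```python
def merge_orcids(candidates):
    """Keep one copy of each non-empty ORCID, preserving first-seen order."""
    unique = []
    for orcid in candidates:
        orcid = (orcid or "").strip()  # tolerate missing values and stray whitespace
        if orcid and orcid not in unique:
            unique.append(orcid)
    return unique


# Hypothetical input: the same ORCID found in the feed's user record and in
# two source records, plus one source that has no ORCID at all.
found = ["0000-0002-5069-1909", "0000-0002-5069-1909", None, " 0000-0002-5069-1909 "]
print(merge_orcids(found))  # ['0000-0002-5069-1909']
```

The same approach would presumably apply to the duplicated email addresses.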

Before I start down this route, has anybody encountered this previously and is there a better/easier way to fix it?

Regarding fixing the database, I feel incredibly uncomfortable going into an EPrints table and altering things directly. Is there a way I can fix the already duplicated ORCIDs? I did ask our Elements team if the data can be prodded to re-send from Elements through the crosswalks and into EPrints, but it looks like a no go.
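For the existing bad values, the clean-up logic I have in mind looks roughly like the sketch below. It is Python for illustration only and does not touch EPrints or the database; the function names are hypothetical. It assumes an ORCID always has the shape of four hyphen-separated groups of four characters, with the last character being a digit or X. It pulls every ORCID-shaped chunk out of a stored value and only proposes a fix when all the chunks are the same iD.

```python
import re

# Four groups of four, hyphen-separated; the final character may be the check digit X.
ORCID_RE = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def split_orcids(value):
    """Return the distinct ORCIDs found in a stored value, in first-seen order."""
    distinct = []
    for match in ORCID_RE.findall(value or ""):
        if match not in distinct:
            distinct.append(match)
    return distinct


def repair(value):
    """Return the single clean ORCID if the value is one iD repeated.

    Returns None when nothing is found or when different iDs were squashed
    together, so those records can be checked by hand.
    """
    distinct = split_orcids(value)
    return distinct[0] if len(distinct) == 1 else None


print(repair("0000-0002-5069-19090000-0002-5069-19090000-0002-5069-1909"))
# 0000-0002-5069-1909
```

Something like this could be used to list the affected records and their proposed corrections before anything is changed.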

I was using version 1.3 of the ORCID support plugin, clicked upgrade, got an error and now it appears to be 1.7, but I don't think my problem is anything to do with the plugin.