
[EP-tech] Antwort: RE: Hyperauthorship



I would suggest maybe calling it CitationCache to make it less confusing
as much of the documentation uses "citation" to refer to the citation
configuration files.

I seem to recall from when I looked into this years ago that there are a
few citations that should not be cached as they have contextual
information, e.g.
https://github.com/eprints/eprints/blob/3.3/lib/citations/eprint/issue.xml
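
To make the point about contextual citations concrete, here is a minimal
sketch in Python (not EPrints code, and not what the commit quoted below
does). It assumes a cache keyed by record id and citation style, with a
set of styles that bypass the cache. The class name, the `render`
callback and the use of "issue" as the uncacheable style name are all
hypothetical, chosen only for illustration.

```python
from typing import Callable, Dict, Tuple

# Hypothetical: style names whose output depends on rendering context,
# so their rendered form must never be reused.
UNCACHEABLE_STYLES = {"issue"}


class CitationCache:
    """In-memory cache of rendered citations keyed by (record id, style)."""

    def __init__(self, render: Callable[[int, str], str]) -> None:
        # render(record_id, style) is the expensive call being avoided.
        self._render = render
        self._store: Dict[Tuple[int, str], str] = {}

    def get(self, record_id: int, style: str) -> str:
        # Contextual styles are always rendered fresh.
        if style in UNCACHEABLE_STYLES:
            return self._render(record_id, style)
        key = (record_id, style)
        if key not in self._store:
            self._store[key] = self._render(record_id, style)
        return self._store[key]

    def invalidate(self, record_id: int) -> None:
        # Drop every cached style for a record, e.g. after it is modified.
        for key in [k for k in self._store if k[0] == record_id]:
            del self._store[key]
```

A caller would use `get()` wherever a citation is rendered and call
`invalidate()` from whatever hook fires when a record changes.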

On 31/05/2019 12:01, Newman D.R. wrote:
> Hi all,
>
> As promised here is the code that I wrote for citation caching:
>
> https://github.com/eprints/eprints3.4/commit/6aedd1c2b1ba4ce68fceb5cda5c16545a72b9d53
>
> You need to copy default_zero/cfg.d/citations.pl to your local archive
> and set enabled to 1.  You then need to run epadmin update to create
> the Citation dataset.
>
> I have found that on a large browse view listing (e.g. 400-500) if the
> citations are already cached I get an improvement from a 6 second load
> time to 2 seconds.  However, if the citations need caching before the
> browse view can be generated then the first load time is 10 seconds.
> This will be a one-off, though, unless you run the refresh_citations
> epadmin command, which works like refresh_abstracts but for citations.
>
> Please feel free to try it out and ask any questions.  I have done some
> basic testing on it but I think it is a little way off deploying in a
> production environment.  I would want to be confident that citations
> are always cleared when an EPrint is modified, which should always be
> the case but might be susceptible to race conditions where the cache is
> not cleared in time and the old cached citation is used rather than
> generating a new one.
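
One way to reduce exposure to the stale-cache race described above is to
store each cached citation together with the revision of the record it
was rendered from, and to compare revisions on every read. This is a
different technique from clearing the cache on modification, and it is
not claimed to be what the commit does. The sketch below is plain Python
with hypothetical names; it assumes each record carries a revision
counter that increases on every change.

```python
from typing import Callable, Dict, Tuple


class RevisionCheckedCache:
    """Cache that treats an entry as stale when the record revision moved on."""

    def __init__(self, render: Callable[[int], str]) -> None:
        # render(record_id) produces the citation text (the slow path).
        self._render = render
        # record id -> (revision the text was rendered from, rendered text)
        self._store: Dict[int, Tuple[int, str]] = {}

    def get(self, record_id: int, current_rev: int) -> str:
        cached = self._store.get(record_id)
        if cached is not None and cached[0] == current_rev:
            return cached[1]
        # Missing or rendered from an older revision: regenerate and store.
        text = self._render(record_id)
        self._store[record_id] = (current_rev, text)
        return text
```

With this shape a late or missed cache clear costs one extra render
rather than serving an outdated citation, at the price of reading the
record's current revision on each lookup.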
>
> Regards
>
> David Newman
>
> On Thu, 2019-05-16 at 14:35 +0000, Newman D.R. via Eprints-tech wrote:
>> Hi Chris,
>>
>> I have implemented this but it is still under testing to see how much
>> it speeds things up.  I will see if I can make this available as a
>> branch on GitHub at some point.  However, I seem to be already doing
>> two jobs at the moment, so doing interesting EPrints development
>> rather than basic additional functionality and bug fixing is a bit of
>> a luxury that time does not afford.
>>
>> Regards
>>
>> David Newman
>>
>> On Thu, 2019-05-16 at 14:25 +0000, Christopher Gutteridge via
>> Eprints-tech wrote:
>>> We should have made something long ago which can cache the rendered
>>> versions of citations and Export plugins for single items, and
>>> invalidated the cache when records are altered or the config is
>>> changed... would speed up everything a load.
>>> (Sorry, I sketched the idea years ago and never implemented it)
>>> On 16/05/2019 15:08, John Salter via Eprints-tech wrote:
>>>> The database takes a big hit for OAI-PMH requests that include
>>>> hyper-authored papers.
>>>> We have a block of 100 records that contains ~10 ATLAS research
>>>> papers - each with 3,000+ authors.
>>>> This takes a while to generate the XML response (there are *a lot*
>>>> of nodes that get created).
>>>>
>>>> I've got this EPScript addition to limit the authors in a citation
>>>> (it's not perfect - I should have used a couple of phrases in there
>>>> if I was going to share it formally).
>>>> https://gist.github.com/jesusbagpuss/fbec13d9986fba8e93b56ae5ba34c164
>>>>
>>>> On our summary page we also have the full author list displayed.
>>>> For us, the issue we're concerned about is that when we have a
>>>> paper with loads of authors, if someone editing the item visits a
>>>> workflow stage with the authors on it, it takes *ages* to do
>>>> anything.
>>>>
>>>> Our repo staff want to retain the complete author list - so I'll
>>>> continue looking down the 'improved input methods' path rather
>>>> than the 'truncate from source' option.
>>>>
>>>> Cheers,
>>>> John
>>>>
>>>> From: eprints-tech-bounces at ecs.soton.ac.uk
>>>> [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of
>>>> martin.braendle--- via Eprints-tech
>>>> Sent: 16 May 2019 14:36
>>>> To: John Salter <J.Salter at leeds.ac.uk>
>>>> Cc: eprints-tech at ecs.soton.ac.uk
>>>> Subject: [EP-tech] Antwort: RE: Hyperauthorship
>>>>
>>>> Hi,
>>>>
>>>> we thought of limiting the rendering, too. However, in that case,
>>>> the database has to deliver the author records before the limit
>>>> is applied, which involves a performance penalty. Anyone who has
>>>> had to deal with a 2000 author item in EPrints can tell what this
>>>> is like. That's why we decided to limit on input already.
>>>>
>>>> Cheers,
>>>>
>>>> Martin
>>>>
>>>>
>>>> From: "John Salter" <J.Salter at leeds.ac.uk>
>>>> To: "martin.braendle at id.uzh.ch" <martin.braendle at id.uzh.ch>,
>>>> "eprints-tech at ecs.soton.ac.uk" <eprints-tech at ecs.soton.ac.uk>
>>>> Date: 16.05.2019 14:36
>>>> Subject: RE: [EP-tech] Hyperauthorship
>>>>
>>>>
>>>>
>>>> Hi Martin,
>>>> Interesting approach. The records I'm looking at all come via
>>>> Symplectic or Pure - and we could implement some form of limit to
>>>> the number of authors - and retain any that are 'resolved'
>>>> (local) authors.
>>>>
>>>> I was thinking of changing the default input rendering for the
>>>> creator field along these lines:
>>>> If there are < LIMIT authors, render input as currently exists
>>>> If there are > LIMIT authors, render a static list of them, and
>>>> enhance with JavaScript to allow editing of specific entries /
>>>> re-ordering / searching / filtering the list.
>>>>
>>>> This could even be deployed as a separate workflow stage (which
>>>> only appears when there are > LIMIT authors).
>>>>
>>>> I'll have to see what people here think about limiting the author
>>>> list on the way in to EPrints - that sounds like a better place
>>>> to be?
>>>>
>>>> Cheers,
>>>> John
>>>>
>>>>
>>>> From: martin.braendle at id.uzh.ch [mailto:martin.braendle at id.uzh.ch]
>>>> Sent: 16 May 2019 13:22
>>>> To: eprints-tech at ecs.soton.ac.uk; John Salter <J.Salter at leeds.ac.uk>
>>>> Subject: Re: [EP-tech] Hyperauthorship
>>>>
>>>> Hi John,
>>>>
>>>> we have a lot of high energy physics or biomedical articles with
>>>> hundreds or thousands of authors. Usually, those are submitted
>>>> via CrossRef or PubMed import.
>>>>
>>>> We have adapted the corresponding import plugins to limit the
>>>> number of authors by a configurable limit (in our case 30). If
>>>> the limit is exceeded, "et al" is added as the ($limit+1)th author,
>>>> the remaining authors are not imported and a warning message is
>>>> issued. Submitters are then still free to add the remaining UZH
>>>> authors manually and use et al for authors outside of UZH.
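
The import-time limit Martin describes above can be sketched as follows.
This is an illustrative Python version, not the UZH plugin code (which is
not shown in this thread). The function name and the wording of the
warning are hypothetical; the default of 30 and the "et al" entry in
position limit+1 come from the description.

```python
import warnings
from typing import List


def limit_authors(authors: List[str], limit: int = 30) -> List[str]:
    """Keep at most `limit` authors; mark any truncation with "et al"."""
    if len(authors) <= limit:
        return list(authors)
    # Tell the submitter that part of the author list was not imported.
    warnings.warn(
        f"{len(authors)} authors found; only the first {limit} were imported"
    )
    # "et al" becomes the (limit + 1)th entry; the rest are dropped.
    return list(authors[:limit]) + ["et al"]
```

Applied to a 3,000-author record with the default limit, this returns 31
entries, so the database never has to store or deliver the full list.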
>>>>
>>>> Instead of the DOI plugin, we have developed a CrossRef plugin
>>>> that uses the CrossRef REST API. It implements the author
>>>> limitation as well. We decided to go with the CrossRef REST API
>>>> because funder information can be imported from there.
>>>>
>>>> Best regards,
>>>>
>>>> Martin
>>>>
>>>> --
>>>> Dr. Martin Brändle
>>>> https://orcid.org/0000-0002-7752-6567
>>>> Zentrale Informatik
>>>> Universität Zürich
>>>> Stampfenbachstr. 73
>>>> CH-8006 Zürich
>>>>
>>>>
>>>> From: "John Salter via Eprints-tech" <eprints-tech at ecs.soton.ac.uk>
>>>> To: "'eprints-tech at ecs.soton.ac.uk'" <eprints-tech at ecs.soton.ac.uk>
>>>> Date: 16.05.2019 14:00
>>>> Subject: [EP-tech] Hyperauthorship
>>>> Sent by: <eprints-tech-bounces at ecs.soton.ac.uk>
>>>>
>>>>
>>>>
>>>>
>>>> Hi,
>>>> Has anyone done any work on making the EPrints workflow a bit
>>>> more sensible when a paper has many authors (hundreds or thousands)?
>>>>
>>>> Cheers,
>>>> John
>>>>
>>>>
>>>> John Salter
>>>> http://orcid.org/0000-0002-8611-8266
>>>>
>>>> White Rose Libraries Technical Officer
>>>> IT - Application Support (Research)
>>>> 10.23B, IT Services Building
>>>> University of Leeds
>>>> Leeds
>>>> LS2 9JT
>>>> 0113 34 37385
>>>>
>>>> Online: https://whiteroselibraries.wordpress.com/
>>>>
>>>> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
>>>> *** Archive: http://www.eprints.org/tech.php/
>>>> *** EPrints community wiki: http://wiki.eprints.org/
>>>> *** EPrints developers Forum: http://forum.eprints.org/
>>>>