
[EP-tech] Antwort: RE: Hyperauthorship



Hi all,

As promised here is the code that I wrote for citation caching:

https://github.com/eprints/eprints3.4/commit/6aedd1c2b1ba4ce68fceb5cda5c16545a72b9d53

You need to copy default_zero/cfg.d/citations.pl to your local archive
and set enabled to 1.  You then need to run epadmin update to create
the Citation dataset.

I have found that on large browse view listings (e.g. 400-500 items), if
the citations are already cached, load time improves from 6 seconds to
2 seconds.  However, if the citations need caching before the browse
view can be generated, the first load takes 10 seconds.  This will be a
one-off unless you run the refresh_citations epadmin command, which
works like refresh_abstracts but for citations.

Please feel free to try it out and ask any questions.  I have done some
basic testing on it, but I think it is still a little way off deploying
in a production environment.  I would want to be confident that
citations are always cleared when an EPrint is modified.  That should
always be the case, but it might be susceptible to race conditions where
the cache is not cleared in time and the old cached citation is used
rather than a new one being generated.
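
To illustrate the race-condition concern: one way to guard against a
missed clear is to store the record's revision alongside the cached
citation and compare it on every read.  The sketch below is Python for
illustration only, not the EPrints Perl in the commit above, and it does
not describe how that commit works.  Every name in it (Record,
CitationCache, render_citation) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class Record:
    """Hypothetical stand-in for a repository item."""
    record_id: int
    title: str
    revision: int = 1

    def update_title(self, title: str) -> None:
        self.title = title
        self.revision += 1  # every modification bumps the revision


@dataclass
class CitationCache:
    """Caches rendered citations keyed by (record id, style)."""
    render: Callable[[Record, str], str]
    _store: Dict[Tuple[int, str], Tuple[int, str]] = field(default_factory=dict)
    renders: int = 0  # counts how often the expensive render actually ran

    def get(self, record: Record, style: str = "default") -> str:
        key = (record.record_id, style)
        cached = self._store.get(key)
        # Reuse the entry only if it was built from the current revision.
        if cached is not None and cached[0] == record.revision:
            return cached[1]
        text = self.render(record, style)
        self.renders += 1
        self._store[key] = (record.revision, text)
        return text

    def clear(self, record: Record) -> None:
        """Explicit invalidation, e.g. called when a record is saved."""
        for key in [k for k in self._store if k[0] == record.record_id]:
            del self._store[key]


def render_citation(record: Record, style: str) -> str:
    # Placeholder for the real (slow) citation rendering.
    return f"[{style}] {record.title}"


cache = CitationCache(render=render_citation)
paper = Record(record_id=1, title="Old title")
first = cache.get(paper)
second = cache.get(paper)        # served from the cache
paper.update_title("New title")  # clear() is deliberately not called
third = cache.get(paper)         # revision mismatch forces a re-render
```

With a check like this, a clear that arrives late costs one extra
render instead of serving the old citation.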

Regards

David Newman

On Thu, 2019-05-16 at 14:35 +0000, Newman D.R. via Eprints-tech wrote:
> Hi Chris,
>
> I have implemented this, but it is still under testing to see how much
> it speeds things up.  I will see if I can make this available as a
> branch on GitHub at some point.  However, I already seem to be doing
> two jobs at the moment, so doing interesting EPrints development
> rather than basic additional functionality and bug fixing is a bit of
> a luxury that time does not afford.
>
> Regards
>
> David Newman
>
> On Thu, 2019-05-16 at 14:25 +0000, Christopher Gutteridge via
> Eprints-tech wrote:
> >
> > We should have made something long ago which can cache the rendered
> > versions of citations and Export plugins for single items, and
> > invalidated the cache when records are altered or the config is
> > changed... would speed up everything a load.
> > (Sorry, I sketched the idea years ago and never implemented it)
> > On 16/05/2019 15:08, John Salter via Eprints-tech wrote:
> > >
> > > The database takes a big hit for OAI-PMH requests that include
> > > hyper-authored papers.
> > > We have a block of 100 records that contains ~10 ATLAS research
> > > papers - each with 3,000+ authors.
> > > > This takes a while to generate the XML response (there are *a lot*
> > > > of nodes that get created).
> > >
> > > > I've got this EPScript addition to limit the authors in a citation
> > > > (it's not perfect - I should have used a couple of phrases in there
> > > > if I was going to share it formally):
> > > > https://gist.github.com/jesusbagpuss/fbec13d9986fba8e93b56ae5ba34c164
> > >
> > > > On our summary page we also have the full author list displayed.
> > > > For us, the issue we're concerned about is that when we have a
> > > > paper with loads of authors, if someone editing the item visits a
> > > > workflow stage with the authors on it, it takes *ages* to do
> > > > anything.
> > > >
> > > > Our repo staff want to retain the complete author list - so I'll
> > > > continue looking down the 'improved input methods' path rather
> > > > than the 'truncate from source' option.
> > >
> > > Cheers,
> > > John
> > >
> > > > From: eprints-tech-bounces at ecs.soton.ac.uk
> > > > [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of
> > > > martin.braendle--- via Eprints-tech
> > > > Sent: 16 May 2019 14:36
> > > > To: John Salter <J.Salter at leeds.ac.uk>
> > > > Cc: eprints-tech at ecs.soton.ac.uk
> > > > Subject: [EP-tech] Antwort: RE: Hyperauthorship
> > >
> > > Hi,
> > >
> > > > we thought of limiting the rendering, too. However, in that case,
> > > > the database has to deliver the author records before the limit is
> > > > applied, which involves a performance penalty. Anyone who has had
> > > > to deal with a 2000-author item in EPrints can tell what this is
> > > > like. That's why we decided to limit on input already.
> > >
> > > Cheers,
> > >
> > > Martin
> > >
> > > "John Salter" ---16.05.2019 14:36:13---Hi Martin, Interesting
> > > approach. The records I'm, looking at all come via Symplectic or
> > > Pure - and w
> > >
> > > > From: "John Salter" <J.Salter at leeds.ac.uk>
> > > > To: "martin.braendle at id.uzh.ch" <martin.braendle at id.uzh.ch>,
> > > > "eprints-tech at ecs.soton.ac.uk" <eprints-tech at ecs.soton.ac.uk>
> > > > Date: 16.05.2019 14:36
> > > > Subject: RE: [EP-tech] Hyperauthorship
> > >
> > >
> > >
> > > Hi Martin,
> > > > Interesting approach. The records I'm looking at all come via
> > > > Symplectic or Pure - and we could implement some form of limit to
> > > > the number of authors - and retain any that are 'resolved' (local)
> > > > authors.
> > >
> > > > I was thinking of changing the default input rendering for the
> > > > creator field along these lines:
> > > > If there are < LIMIT authors, render input as currently exists
> > > > If there are > LIMIT authors, render a static list of them, and
> > > > enhance with JavaScript to allow editing of specific entries /
> > > > re-ordering / searching / filtering the list.
> > >
> > > This could even be deployed as a separate workflow stage (which
> > > only appears when there are > LIMIT authors).
> > >
> > > > I'll have to see what people here think about limiting the author
> > > > list on the way in to EPrints - that sounds like a better place to
> > > > be?
> > >
> > > Cheers,
> > > John
> > >
> > >
> > > > From: martin.braendle at id.uzh.ch [mailto:martin.braendle at id.uzh.ch]
> > > > Sent: 16 May 2019 13:22
> > > > To: eprints-tech at ecs.soton.ac.uk; John Salter <J.Salter at leeds.ac.uk>
> > > > Subject: Re: [EP-tech] Hyperauthorship
> > >
> > > Hi John,
> > >
> > > > we have a lot of high energy physics or biomedical articles with
> > > > hundreds or thousands of authors. Usually, those are submitted via
> > > > CrossRef or PubMed import.
> > > >
> > > > We have adapted the corresponding import plugins to limit the
> > > > number of authors by a configurable limit (in our case 30). If the
> > > > limit is exceeded, "et al" is added as the ($limit+1)th author,
> > > > the remaining authors are not imported and a warning message is
> > > > issued. Submitters are then still free to add the remaining UZH
> > > > authors manually and use et al for authors outside of UZH.
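
The limiting step described above could be sketched roughly as follows.
This is Python for illustration only, not the code of the adapted import
plugins (EPrints plugins are written in Perl).  The function name
limit_authors, its parameters and the warning text are assumptions. Only
the limit of 30 and the "et al" marker come from the message.

```python
import warnings
from typing import List


def limit_authors(authors: List[str], limit: int = 30,
                  marker: str = "et al") -> List[str]:
    """Return at most `limit` authors, plus `marker` if any were dropped."""
    if len(authors) <= limit:
        return list(authors)
    dropped = len(authors) - limit
    warnings.warn(
        f"{dropped} author(s) beyond the limit of {limit} were not imported"
    )
    # The marker becomes the (limit + 1)th entry in the list.
    return list(authors[:limit]) + [marker]


hyper = [f"Author {n}" for n in range(1, 3001)]
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    short = limit_authors(hyper)
    unchanged = limit_authors(["A. Smith", "B. Jones"])
```

Because the list is cut at import time, the database never stores the
dropped authors, so nothing has to load them later.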
> > >
> > > > Instead of the DOI plugin, we have developed a CrossRef plugin that
> > > > uses the CrossRef REST API. It implements the author limitation as
> > > > well. We decided to go with the CrossRef REST API because funder
> > > > information can be imported from there.
> > >
> > > Best regards,
> > >
> > > Martin
> > >
> > > --
> > > Dr. Martin Br?ndle
> > > https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2
> > > Forcid.org%2F0000-0002-7752-
> > > 6567&amp;data=01%7C01%7Cdrn%40ecs.soton.ac.uk%7C17a35c5698f24685c
> > > 2b408d6da0bc70f%7C4a5378f929f44d3ebe89669d03ada9d8%7C0&amp;sdata=
> > > VC8Bwg2BLpo%2BybPatYVIyBvALgwZhIZ4Az4bBUkKXFY%3D&amp;reserved=0
> > > Zentrale Informatik
> > > Universit?t Z?rich
> > > Stampfenbachstr. 73
> > > CH-8006 Z?rich
> > >
> > > "John Salter via Eprints-tech" ---16.05.2019 14:00:41---Hi, Has
> > > anyone done any work on making the EPrints workflow a bit more
> > > sensible when a paper has man
> > >
> > > > From: "John Salter via Eprints-tech" <eprints-tech at ecs.soton.ac.uk>
> > > > To: "'eprints-tech at ecs.soton.ac.uk'" <eprints-tech at ecs.soton.ac.uk>
> > > > Date: 16.05.2019 14:00
> > > > Subject: [EP-tech] Hyperauthorship
> > > > Sent by: <eprints-tech-bounces at ecs.soton.ac.uk>
> > >
> > >
> > >
> > >
> > > Hi,
> > > > Has anyone done any work on making the EPrints workflow a bit more
> > > > sensible when a paper has many authors (hundreds or thousands)?
> > >
> > > Cheers,
> > > John
> > >
> > >
> > > John Salter
> > > > http://orcid.org/0000-0002-8611-8266
> > >
> > > White Rose Libraries Technical Officer
> > > IT - Application Support (Research)
> > > 10.23B, IT Services Building
> > > University of Leeds
> > > Leeds
> > > LS2 9JT
> > > 0113 34 37385
> > >
> > > > Online: https://whiteroselibraries.wordpress.com/
> > >
> > > > *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
> > > > *** Archive: http://www.eprints.org/tech.php/
> > > > *** EPrints community wiki: http://wiki.eprints.org/
> > > > *** EPrints developers Forum: http://forum.eprints.org/
> > >
> > >
> > >
> > >
> > > *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/epri
> > > nt
> > > s-tech
> > > *** Archive: https://eur03.safelinks.protection.outlook.com/?url=
> > > http%3A%2F%2Fwww.eprints.org%2Ftech.php%2F&amp;data=01%7C01%7Cdrn
> > > %40ecs.soton.ac.uk%7C17a35c5698f24685c2b408d6da0bc70f%7C4a5378f92
> > > 9f44d3ebe89669d03ada9d8%7C0&amp;sdata=3dwAZ3WMVCcJQSfKyX%2FCHM%2B
> > > AREOusIkcEIjkf969tzk%3D&amp;reserved=0
> > > *** EPrints community wiki: https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Feur03.safelinks.protection.ou&amp;data=01%7C01%7C%7C5c338c8f81824280ae7c08d6e5b743fa%7C4a5378f929f44d3ebe89669d03ada9d8%7C0&amp;sdata=Dh9DOzt%2BufU%2FhU1EFdm4KuWCe6xrjZ3sIxDtJR2Ycvk%3D&amp;reserved=0
> > > tlook.com/?url=http%3A%2F%2Fwiki.eprints.org%2F&amp;data=01%7C01%
> > > 7Cdrn%40ecs.soton.ac.uk%7C17a35c5698f24685c2b408d6da0bc70f%7C4a53
> > > 78f929f44d3ebe89669d03ada9d8%7C0&amp;sdata=bAAmMyyLM3BKMyaJKV0%2B
> > > lVkjLTLheWtP7AMKx2OXs14%3D&amp;reserved=0
> > > *** EPrints developers Forum: https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Feur03.safelinks.protection&amp;data=01%7C01%7C%7C5c338c8f81824280ae7c08d6e5b743fa%7C4a5378f929f44d3ebe89669d03ada9d8%7C0&amp;sdata=4q0AjiY8HH%2Fel9GL5xAwP2%2FiMwQrRPLtQbeSQ62qtTw%3D&amp;reserved=0.
> > > outlook.com/?url=http%3A%2F%2Fforum.eprints.org%2F&amp;data=01%7C
> > > 01%7Cdrn%40ecs.soton.ac.uk%7C17a35c5698f24685c2b408d6da0bc70f%7C4
> > > a5378f929f44d3ebe89669d03ada9d8%7C0&amp;sdata=x1u8ms27jNFOjBeMVmc
> > > vFZ6gli%2BMlgw3bd%2Bd7PMyATg%3D&amp;reserved=0
> > *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprint
> > s-
> > tech
> > *** Archive: https://eur03.safelinks.protection.outlook.com/?url=ht
> > tp%3A%2F%2Fwww.eprints.org%2Ftech.php%2F&amp;data=01%7C01%7Cdrn%40e
> > cs.soton.ac.uk%7C17a35c5698f24685c2b408d6da0bc70f%7C4a5378f929f44d3
> > ebe89669d03ada9d8%7C0&amp;sdata=3dwAZ3WMVCcJQSfKyX%2FCHM%2BAREOusIk
> > cEIjkf969tzk%3D&amp;reserved=0
> > *** EPrints community wiki: https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Feur03.safelinks.protection.outl&amp;data=01%7C01%7C%7C5c338c8f81824280ae7c08d6e5b743fa%7C4a5378f929f44d3ebe89669d03ada9d8%7C0&amp;sdata=NsRMWuc4N653XUVr3Llakj7%2FVDAO7W8QdCuAg6PmlIg%3D&amp;reserved=0
> > ook.com/?url=http%3A%2F%2Fwiki.eprints.org%2F&amp;data=01%7C01%7Cdr
> > n%40ecs.soton.ac.uk%7C17a35c5698f24685c2b408d6da0bc70f%7C4a5378f929
> > f44d3ebe89669d03ada9d8%7C0&amp;sdata=bAAmMyyLM3BKMyaJKV0%2BlVkjLTLh
> > eWtP7AMKx2OXs14%3D&amp;reserved=0
> > *** EPrints developers Forum: https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Feur03.safelinks.protection.ou&amp;data=01%7C01%7C%7C5c338c8f81824280ae7c08d6e5b743fa%7C4a5378f929f44d3ebe89669d03ada9d8%7C0&amp;sdata=Dh9DOzt%2BufU%2FhU1EFdm4KuWCe6xrjZ3sIxDtJR2Ycvk%3D&amp;reserved=0
> > tlook.com/?url=http%3A%2F%2Fforum.eprints.org%2F&amp;data=01%7C01%7
> > Cdrn%40ecs.soton.ac.uk%7C17a35c5698f24685c2b408d6da0bc70f%7C4a5378f
> > 929f44d3ebe89669d03ada9d8%7C0&amp;sdata=x1u8ms27jNFOjBeMVmcvFZ6gli%
> > 2BMlgw3bd%2Bd7PMyATg%3D&amp;reserved=0
> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-
> tech
> *** Archive: https://eur03.safelinks.protection.outlook.com/?url=http
> %3A%2F%2Fwww.eprints.org%2Ftech.php%2F&amp;data=01%7C01%7Cdrn%40ecs.s
> oton.ac.uk%7C17a35c5698f24685c2b408d6da0bc70f%7C4a5378f929f44d3ebe896
> 69d03ada9d8%7C0&amp;sdata=3dwAZ3WMVCcJQSfKyX%2FCHM%2BAREOusIkcEIjkf96
> 9tzk%3D&amp;reserved=0
> *** EPrints community wiki: https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Feur03.safelinks.protection.outloo&amp;data=01%7C01%7C%7C5c338c8f81824280ae7c08d6e5b743fa%7C4a5378f929f44d3ebe89669d03ada9d8%7C0&amp;sdata=b4PVk3CXYgGUJ1fJxWZF2BWDCmR3d1m1q%2FqZS%2BsfFNU%3D&amp;reserved=0
> k.com/?url=http%3A%2F%2Fwiki.eprints.org%2F&amp;data=01%7C01%7Cdrn%40
> ecs.soton.ac.uk%7C17a35c5698f24685c2b408d6da0bc70f%7C4a5378f929f44d3e
> be89669d03ada9d8%7C0&amp;sdata=bAAmMyyLM3BKMyaJKV0%2BlVkjLTLheWtP7AMK
> x2OXs14%3D&amp;reserved=0
> *** EPrints developers Forum: https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Feur03.safelinks.protection.outl&amp;data=01%7C01%7C%7C5c338c8f81824280ae7c08d6e5b743fa%7C4a5378f929f44d3ebe89669d03ada9d8%7C0&amp;sdata=NsRMWuc4N653XUVr3Llakj7%2FVDAO7W8QdCuAg6PmlIg%3D&amp;reserved=0
> ook.com/?url=http%3A%2F%2Fforum.eprints.org%2F&amp;data=01%7C01%7Cdrn
> %40ecs.soton.ac.uk%7C17a35c5698f24685c2b408d6da0bc70f%7C4a5378f929f44
> d3ebe89669d03ada9d8%7C0&amp;sdata=x1u8ms27jNFOjBeMVmcvFZ6gli%2BMlgw3b
> d%2Bd7PMyATg%3D&amp;reserved=0