
[EP-tech] Re: Speed up citation



Dear List,

in addition to my last post: we pre-rendered the citation of every EPrint
with render_citation() and stored the result in a DB field "citation" on
each EPrint. We then ran a small benchmark comparing the time needed to
export the stored DB field against calling render_citation() on the fly.
Here are the results for different collections:

- collection id 11124 (ca. 100 EPrints): buildlist 0.004 sec,
  render_citation() 1.4 sec, DB field citation 0.19 sec
- collection id 10046 (ca. 1000 EPrints): buildlist 0.02 sec,
  render_citation() 13 sec, DB field citation 1.4 sec
- collection id 10170 (ca. 25000 EPrints): buildlist 0.39 sec,
  render_citation() 357.9 sec, DB field citation 38.9 sec
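
The speedup per collection follows directly from the figures above. A
minimal sketch of the arithmetic, with the timings copied from the list
(the speedup() helper is only a name chosen for this example):

```perl
use strict;
use warnings;

# Timings in seconds, copied from the benchmark above:
# [ collection id, render_citation() on the fly, DB field "citation" ]
my @runs = (
    [ 11124, 1.4,   0.19 ],
    [ 10046, 13,    1.4  ],
    [ 10170, 357.9, 38.9 ],
);

# Speedup = on-the-fly time divided by pre-rendered time.
sub speedup {
    my ($on_the_fly, $prerendered) = @_;
    return $on_the_fly / $prerendered;
}

for my $run (@runs) {
    my ($id, $render, $field) = @$run;
    printf "collection %d: %.1fx faster\n", $id, speedup($render, $field);
}
```

This prints 7.4x, 9.3x and 9.2x for the three collections.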

All in all, going by the figures above, we're roughly 7 to 9 times faster
exporting pre-generated citation lists.
So we're doing an initial pre-fill of the DB field "citation", and we
still have to add an update to every workflow edit step.
Does anybody have any comments, experience, or hints on how to speed up
render_citation() in another way? Does anybody use COinS in a similar way?
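
The pre-fill plus update-on-edit idea can be sketched without EPrints.
Everything below is a stand-in: render_expensive() takes the place of
render_citation(), the records are plain hashes, and the "rev" counter
is an assumption about how an edit could be detected. It is not our
actual code.

```perl
use strict;
use warnings;

my %citation_cache;     # eprint id => { rev => ..., text => ... }
my $render_calls = 0;

# Hypothetical stand-in for the expensive render_citation() call.
sub render_expensive {
    my ($eprint) = @_;
    $render_calls++;
    return "$eprint->{creator} ($eprint->{year}). $eprint->{title}.";
}

# Return the stored citation; re-render only if the record's revision
# differs from the one the cache entry was built for.
sub cached_citation {
    my ($eprint) = @_;
    my $entry = $citation_cache{ $eprint->{id} };
    if (!$entry || $entry->{rev} != $eprint->{rev}) {
        $entry = { rev => $eprint->{rev}, text => render_expensive($eprint) };
        $citation_cache{ $eprint->{id} } = $entry;
    }
    return $entry->{text};
}

my $eprint = { id => 1, rev => 1, creator => 'Doe, J.',
               year => 2016, title => 'Old title' };
cached_citation($eprint) for 1 .. 3;    # rendered once, then served from cache
$eprint->{title} = 'New title';
$eprint->{rev}++;                       # an edit bumps the revision
print cached_citation($eprint), "\n";   # re-rendered after the edit
print "render calls: $render_calls\n";
```

Four lookups cost only two renderings here: one for the initial fill
and one after the edit.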

Cheers
 Jens

--
Jens Vieler
Informatikdienste
Universität Zürich
Stampfenbachstrasse 73
CH-8006 Zürich

mail:  jens.vieler at id.uzh.ch
phone: +41 44 63 56777
http://www.id.uzh.ch



From:	Jens-Patrick Vieler/at/UZH
To:	eprints-tech at ecs.soton.ac.uk
Date:	24.02.2016 13:51
Subject:	Speed up citation


Dear List

we're building a new cgi/export-plugin solution to support publication
lists on our website. It runs some searches, builds some lists, combines
them via remainder/union/intersect, and finally exports a kind of XML
that includes metadata and a citation.
Currently everything works quite well, BUT when the result turns into a
long list, we run into performance problems.

We did some benchmarking, and here is the result for a typical
1000-item list:

- building up the result: 15 sec
- search and generate lists: 0.015 sec
- merge lists: 0.08 sec
- generation of export data: 14 sec

congrats: dealing with searches and lists is very fast in EPrints :-)

So we took a closer look at what happens while building the output.
First of all, the time grows linearly with the number of
render_citation() calls.

Calling

  $citation = EPrints::Utils::tree_to_utf8($dataobj->render_citation("default"));

once, twice, or four times per item gives:

- 1 time:  14 sec
- 2 times: 29 sec
- 4 times: 54 sec
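
A measurement like this can be taken with the core module Time::HiRes.
The following is only a sketch of such a harness: render_stub() is a
hypothetical stand-in for the tree_to_utf8/render_citation call, and
the items are plain hashes rather than EPrints data objects, so the
absolute numbers it prints mean nothing for EPrints itself.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical stand-in for
# EPrints::Utils::tree_to_utf8($dataobj->render_citation("default")).
sub render_stub {
    my ($item) = @_;
    return join ' ', map { "$_=$item->{$_}" } sort keys %$item;
}

# Render every item $repeats times and return the elapsed seconds.
sub time_renders {
    my ($items, $repeats) = @_;
    my $start = [gettimeofday];
    for my $item (@$items) {
        render_stub($item) for 1 .. $repeats;
    }
    return tv_interval($start);
}

my @items = map { +{ id => $_, title => "Title $_" } } 1 .. 1000;

for my $repeats (1, 2, 4) {
    printf "%d call(s) per item: %.3f sec\n",
        $repeats, time_renders(\@items, $repeats);
}
```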

Second, it speeds up when the citation XML file (default) is reduced to
a minimum: when it includes only the title, the export takes 2 sec for
1000 items.

So my question is: is there a way to speed up render_citation()? Does it
always interpret the whole citation XML file? Has anybody thought about
a way to compile the XML into a Perl routine, or to cache things like this?
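
To illustrate what I mean by "compile": the template is parsed once
into a closure, and every item afterwards only runs the closure. The
{field} placeholder syntax and compile_template() are made up for this
sketch; this is neither the EPrints citation XML format nor an existing
EPrints function.

```perl
use strict;
use warnings;

# Parse a template such as "{creators} ({year}). {title}." once and
# return a closure that renders a record (a plain hash reference).
sub compile_template {
    my ($template) = @_;
    my @parts;
    # split with a capturing group keeps the {field} placeholders.
    for my $token (split /(\{\w+\})/, $template) {
        next if $token eq '';
        if ($token =~ /^\{(\w+)\}$/) {
            my $field = $1;
            # Placeholder: look the field up in the record at render time.
            push @parts, sub { $_[0]{$field} // '' };
        }
        else {
            my $text = $token;
            # Literal text: return it unchanged.
            push @parts, sub { $text };
        }
    }
    return sub {
        my ($record) = @_;
        return join '', map { $_->($record) } @parts;
    };
}

# Compile once, then reuse the closure for every item in the list.
my $cite = compile_template('{creators} ({year}). {title}.');
print $cite->({ creators => 'Doe, J.', year => 2016, title => 'A title' }), "\n";
```

The parsing cost is paid once per template instead of once per item,
which is the effect I would hope for from a compiled citation.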

Any help is welcome
 Jens

