
[EP-tech] Re: load throttling strategy

For several years we have used generate_views to recreate our view pages once per week, with them set not to expire. A number of our views are not well set up and have very long list pages; generate_views takes a long time to process but it's better than doing it on-demand.

Like Seb, I don't think the views are a very practical discovery method. Our Google Urchin statistics show that about 16% of page views are view pages and 79% of those are for the author/creator views, so most views are barely used. Some are valuable to various groups, e.g., the organisational structure views, and we need to continue presenting these views of the eprints.

In QUT ePrints a paginated search for a school with 2000+ records returns quickly but the corresponding view would take a long time to generate. I wonder whether, in the future, it would be an improvement to replace the backend of the views by a search with pagination and facets.
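
To illustrate the idea only, here is a minimal Python sketch of a paginated, faceted lookup. Everything in it is hypothetical: the record layout (`school`, `year`), the function name `search_page` and the in-memory filtering are assumptions for the example, not how EPrints or QUT ePrints implements search. A real backend would push the filter, count and page slice down to the database or search index.

```python
from collections import Counter


def search_page(records, school, page=1, page_size=20, facet_field="year"):
    """Return one page of matches for a school, plus facet counts.

    `records` is a list of dicts; the field names are hypothetical.
    """
    # Filter first, so both the facets and the page describe the same result set.
    matches = [r for r in records if r["school"] == school]

    # Facet counts are computed over all matches, not just the current page.
    facets = Counter(r[facet_field] for r in matches)

    # Only page_size records are returned, however large the result set is.
    start = (page - 1) * page_size
    return {
        "total": len(matches),
        "page": page,
        "items": matches[start:start + page_size],
        "facets": dict(facets),
    }
```

The point of the sketch is that the cost of rendering a page depends on `page_size`, not on the 2000+ records in the school, which is what a single long view list page cannot offer.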


Mark Gregson | Applications and Development Team Leader
Library eServices | Queensland University of Technology
Level 3 | R Block | Kelvin Grove Campus | GPO Box 2434 | Brisbane 4001
Phone: +61 7 3138 3782 | Web: http://eprints.qut.edu.au/<http://www.qut.edu.au/>
ABN: 83 791 724 622
CRICOS No: 00213J

From: eprints-tech-bounces at ecs.soton.ac.uk [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of sf2
Sent: Wednesday, 17 December 2014 2:26 AM
To: eprints-tech at ecs.soton.ac.uk
Subject: [EP-tech] Re: load throttling strategy

I don't reckon the "waiting" strategy is a good idea: 1- there's no indication that server load will get better after waiting for n seconds, 2- if your thread is waiting, it can't take new connections (so clients will be piling up).
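
A rough back-of-the-envelope way to see point 2 is Little's law: the average number of busy workers equals the arrival rate multiplied by the time each request holds a worker. The sketch below is only that arithmetic; the function name and the example figures are made up for illustration and are not measurements from any repository.

```python
def busy_workers(arrivals_per_second, service_seconds, wait_seconds=0.0):
    """Average number of workers held at once (Little's law: L = lambda * W).

    A worker that sleeps before doing its work is held for the sleep as well.
    """
    return arrivals_per_second * (service_seconds + wait_seconds)


# Hypothetical figures: 4 requests/s, each needing 0.5 s of real work.
without_wait = busy_workers(4, 0.5)           # 2.0 workers busy on average
with_wait = busy_workers(4, 0.5, wait_seconds=10)  # 42.0 workers busy on average
```

With those made-up numbers, a 10 second wait raises the average number of occupied workers from 2 to 42, so a modest worker pool would be exhausted by the waiting itself.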

One strategy (given RG's issues) is to disable the on-demand regeneration of such pages and only generate them offline (via generate_views). Then there are no problems for the clients, since eprints/apache will only be serving cached pages (cached on-disk, that is). And if you really must, set up Varnish or similar in front of your repo...
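
As a sketch of the offline approach, a small Python wrapper could be run from cron (weekly, say) to call generate_views. The assumptions here are mine: the install path `/opt/eprints3/bin/generate_views`, the repository id `myrepo`, and the load threshold are placeholders to adjust, and the load check before starting is an optional extra, not something generate_views does itself.

```python
import os
import subprocess

# Assumptions: adjust all three values for your own installation.
GENERATE_VIEWS = "/opt/eprints3/bin/generate_views"  # assumed install path
REPO_ID = "myrepo"                                   # hypothetical repository id
MAX_LOAD = 4.0                                       # hypothetical 1-minute load limit


def build_command(repo_id=REPO_ID, script=GENERATE_VIEWS):
    """Command line for an offline rebuild: the script plus the repository id."""
    return [script, repo_id]


def load_is_acceptable(load_1min, max_load=MAX_LOAD):
    """True if the 1-minute load average is at or below the limit."""
    return load_1min <= max_load


def main():
    # os.getloadavg() is available on Unix-like systems only.
    load_1min = os.getloadavg()[0]
    if not load_is_acceptable(load_1min):
        # Skip this run rather than sleep; cron will try again next time.
        print(f"load {load_1min:.2f} above {MAX_LOAD}, skipping rebuild")
        return 1
    return subprocess.run(build_command(), check=False).returncode
```

To use it, save the block as a script, add `raise SystemExit(main())` at the end, and schedule it from the eprints user's crontab. Because the rebuild happens outside any web request, no client is ever left waiting on it.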

If a page takes 10mins to regenerate then having it generated on-demand by a client cannot be a good idea ;-)

Also out-of-interest I'd be curious to know of any stats showing that visitors actually use the browse pages (ie. how often/how much). I kinda see the point of having them for crawlers (then just have one browse view, eg per year) but for users... meh :-)


On 16.12.2014 10:11, Ian Stuart wrote:

On 16/12/14 10:05, Yuri wrote:
The best is to check the system load in the build page plugin/module, wait some seconds, and then go. Is there some documentation somewhere on Eprints strategies on views page rebuilds?

The only thing I'm aware one can do is define the number of days view-pages are considered "valid" for, before being automatically rebuilt.


Ian Stuart.

Developer: ORI, RJ-Broker, and OpenDepot.org

Bibliographics and Multimedia Service Delivery team,


The University of Edinburgh.


This email was sent via the University of Edinburgh.

The University of Edinburgh is a charitable body, registered in

Scotland, with registration number SC005336.

*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech

*** Archive: http://www.eprints.org/tech.php/

*** EPrints community wiki: http://wiki.eprints.org/

*** EPrints developers Forum: http://forum.eprints.org/
