EPrints Technical Mailing List Archive

Message: #07878



Re: [EP-tech] generate_views


On Wed, 5 Jun 2019 at 19:00, Matthew Kerwin <matthew@kerwin.net.au> wrote:
>
> We've put a lot of energy into this problem over the years. I've
> variously: overridden the default Search generation for some
> problematic views with custom SQL queries, and added a wrapper around
> the generate_views script that runs four instances in parallel (one
> for each core on our server).  It still takes a couple of hours every
> week, but it's no longer 12+ hours.
>
> I'm not at work right now to get my hands on the code.  I'll see if I
> can get to it tomorrow.
>
> Cheers
>

For those playing along at home, here's the script we use to run the
generation in parallel:

https://gist.github.com/phluid61/8817db6a20d217b44cc128ef41e5bd42

The parallelisation is a bit magical, and requires you to set the
`MAX_PARALLEL_CHILDREN=x` environment variable.  I've tried to
strip out all our site-specific quirks, but I might have missed
something.
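
For anyone who just wants the general shape of the idea without
reading the gist, here is a rough sketch in Python.  It is not the
code from the gist, only an illustration of capping the number of
concurrent child processes with `MAX_PARALLEL_CHILDREN`.  The helper
names (`max_children`, `run_parallel`) are made up for this example,
and the commands to run are left to the caller, so nothing here
assumes any particular generate_views options.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def max_children(default=1):
    """Read MAX_PARALLEL_CHILDREN from the environment.

    Falls back to `default` if the variable is unset, not an
    integer, or not positive.  (Hypothetical helper, not from the gist.)
    """
    raw = os.environ.get("MAX_PARALLEL_CHILDREN", "")
    try:
        n = int(raw)
    except ValueError:
        return default
    return n if n > 0 else default


def run_parallel(commands, limit):
    """Run each command (an argv list) as a child process.

    At most `limit` children run at once.  Returns the exit codes
    in the same order as `commands`.
    """
    # Each worker thread just blocks on one child process, so the
    # pool size is what caps the number of concurrent children.
    with ThreadPoolExecutor(max_workers=limit) as pool:
        return list(pool.map(subprocess.call, commands))
```

You would build one argv list per view (or per batch of views) and
call `run_parallel(commands, max_children())`; any non-zero entry in
the returned list means that child failed.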

Cheers
-- 
  Matthew Kerwin
  https://matthew.kerwin.net.au/