EPrints Technical Mailing List Archive

Message: #08499



Re: [EP-tech] Antwort: Re: [ext] Re: [3.4.2] Wrong status of indexer on the admin site


CAUTION: This e-mail originated outside the University of Southampton.

We'd be interested in this chat too.

We're looking at possibly moving thumbnailing processes to Azure for our Digital Library (using blob storage / triggers when new things are added).

 

Cheers,

John

 

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of David R Newman via Eprints-tech
Sent: 27 January 2021 12:12
To: eprints-tech@ecs.soton.ac.uk; martin.braendle@uzh.ch; systems <systems@mdc-berlin.de>
Subject: Re: [EP-tech] Antwort: Re: [ext] Re: [3.4.2] Wrong status of indexer on the admin site

 

Hi Martin,

Although getting EPrints to work nicely in a distributed environment would be a great thing to have, it feels like a Herculean task, as there are so many aspects of EPrints that would need to be considered.  It would certainly be a very interesting project that I would like to be involved with if time permits.  However, it may be better led by someone like yourself who has had greater experience adapting your EPrints repository into one that works in a somewhat distributed manner.

I am happy to get together for an online chat (e.g. Zoom, Teams, Jitsi, etc.) to discuss how we might take this forward.  It would be good to have other people involved in this initial chat (although experience tells me that a video call with many more than six people starts to get a bit manic).  Feel free to suggest some times and dates when you are free for such a discussion and we can figure out how/when to do this.

Regards

David Newman

On 27/01/2021 11:32, Martin Braendle via Eprints-tech wrote:

Hi David,

Interesting thread. In our case, we have a three-server setup running on a shared file system.
The indexer runs on a compute node that also carries out all the cron jobs, so that the web servers are free for their task of serving the Web.
Calling cgi/counter on one of the web servers will always tell us that there is no indexer running, because the web server has no process matching the process ID in the indexer.pid file written by the compute node.
Also, starting/stopping the indexer would just start another instance on the web server node (and probably raise a conflict in the indexer.pid file). So we never do it this way, but via the console or scripts (e.g. in the case of automated backups of the Xapian index).

Which brings us to the topic that EPrints has never really been made for a distributed environment. Are there any plans for this to change in the future?

Kind regards,

Martin

--
Dr. Martin Brändle
Zentrale Informatik
Universität Zürich
Stampfenbachstr. 73
CH-8006 Zürich



From: "systems via Eprints-tech" <eprints-tech@ecs.soton.ac.uk>
To: <eprints-tech@ecs.soton.ac.uk>
Date: 27/01/2021 12:02
Subject: Re: [EP-tech] [ext] Re: [3.4.2] Wrong status of indexer on the admin site
Sent by: <eprints-tech-bounces@ecs.soton.ac.uk>





Hi David,
I only see the button "start indexer".
The page says that the indexer is stopped, but the indexer process is alive.

ps axu | grep $(cat /usr/share/eprints/var/indexer.pid)
eprints     6134  0.0  2.7 210776 50992 ?        Ss   11:06   0:00 indexer

The indexer.tick file exists and contains:

cat /usr/share/eprints/var/indexer.tick
# This file is by the indexer to indicate
# that the indexer is still is_running.

When I stop the indexer via
systemctl stop epindexer

the pid and tick files are gone and no indexer process is running.
And the log says:
[Wed Jan 27 11:25:18 2021] 6134 *** Indexer sub-process stopped
[Wed Jan 27 11:25:18 2021] 6134 ** Indexer process stopping

It looks like when the indexer is started via the command line, the
GUI does not detect it, and vice versa.

After starting it from the GUI, the GUI says that it is running, but
the console does not.

systemctl status epindexer
● epindexer.service - The eprints indexer
   Loaded: loaded (/usr/lib/systemd/system/epindexer.service; static; vendor preset: disabled)
   Active: inactive (dead)



On 27.01.21 at 10:42, David R Newman wrote:
> Hi,
>
> This is an interesting issue.  When you go to the main Admin menu page,
> under the System Tools tab, what buttons are available for starting and
> stopping the indexer?  What determines whether the Admin status page shows
> Running, Stopped, etc. is the presence of the file:
>
> /usr/share/eprints/var/indexer.pid
>
> and whether there is an indexer process running with the process ID
> stored in this file.  The presence of the file
> /usr/share/eprints/var/indexer.tick may also affect the status you see.  My
> best guess is that the indexer is running but under a different process ID
> from the one in indexer.pid.  It should be noted that if you run something like:
>
> ps aux | grep indexer
>
> You would get two processes back (three if you include the "grep
> indexer" process in the command above).  The first is the parent process
> and this should have the process ID that is in indexer.pid and the
> second is a child process.  The latter is vulnerable to dying if the
> indexer task it is undertaking fails in certain ways.  The parent
> process should spawn a new child process if this happens.
>
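The check described above (a pid file, plus a live process with that ID) can be sketched in shell. This is only an illustration under assumptions, not the code EPrints itself runs: `indexer_state` is a hypothetical helper, the "stale" label is invented here, and the use of `kill -0` with a Linux `/proc` fallback is one possible way to test whether a process exists. The tick file is left out because the thread does not say exactly how it influences the status.

```shell
#!/bin/sh
# indexer_state PIDFILE
# Hypothetical helper: prints "stopped", "running" or "stale".
indexer_state() {
    pidfile="$1"

    # No pid file at all: nothing claims to be running.
    if [ ! -f "$pidfile" ]; then
        echo "stopped"
        return 0
    fi

    pid=$(cat "$pidfile")

    # An empty or non-numeric pid cannot match any process.
    case "$pid" in
        ''|*[!0-9]*)
            echo "stale"
            return 0
            ;;
    esac

    # kill -0 sends no signal; it only tests for the process.
    # It also fails for processes owned by another user, so fall
    # back to /proc (Linux only) before declaring the pid stale.
    if kill -0 "$pid" 2>/dev/null || [ -d "/proc/$pid" ]; then
        echo "running"
    else
        echo "stale"
    fi
}
```

Calling `indexer_state /usr/share/eprints/var/indexer.pid` on the host where the indexer was started would be expected to report it as running. The lookup is local to the machine that performs it, so the same call on a different host sharing the file system can report a stale pid even though the indexer is alive elsewhere.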
> Normally what I would do to clear things up is stop the indexer from the
> command line.  Then, if any indexer processes are still running, use the
> Unix kill command to kill them.  Finally, make sure that the
> /usr/share/eprints/var/indexer.pid and
> /usr/share/eprints/var/indexer.tick files are no longer present, and
> delete them if they are.  Then I would go back to the web interface to check
> the indexer's status (to ensure it is stopped) and then use
> the Admin page's "Start Indexer" button to restart the indexer.  This
> should get things back into a consistent state, and hopefully this
> inconsistency will not recur.
>
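The file clean-up step of this procedure can be sketched as a small shell function. It is a cautious illustration, not an EPrints tool: `remove_stale_indexer_files` is a hypothetical name, and the function assumes the indexer has already been stopped from the command line (for example with `systemctl stop epindexer`, as used elsewhere in this thread). It refuses to delete anything while the recorded process still exists.

```shell
#!/bin/sh
# remove_stale_indexer_files VARDIR
# Hypothetical helper: deletes indexer.pid and indexer.tick in
# VARDIR, but only if the recorded process is no longer alive.
remove_stale_indexer_files() {
    vardir="$1"
    pidfile="$vardir/indexer.pid"

    if [ -f "$pidfile" ]; then
        pid=$(cat "$pidfile")
        case "$pid" in
            ''|*[!0-9]*)
                # Unusable pid: treat the file as stale.
                ;;
            *)
                # Same liveness test as a status check would use:
                # kill -0 first, /proc (Linux only) as a fallback.
                if kill -0 "$pid" 2>/dev/null || [ -d "/proc/$pid" ]; then
                    echo "indexer process $pid still exists; stop it first" >&2
                    return 1
                fi
                ;;
        esac
    fi

    # rm -f does not complain if a file is already gone.
    rm -f "$pidfile" "$vardir/indexer.tick"
}
```

Run as the user that owns the files, for example `remove_stale_indexer_files /usr/share/eprints/var`. After that, the status page should have no leftover files to misread, and the indexer can be started again from whichever interface you intend to keep using.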
> Hope this helps.  Please let me know if this does not solve your problem.
>
> Regards
>
> David Newman
>
> Subject:  [3.4.2] Wrong status of indexer on the admin site
> From:  systems <systems@mdc-berlin.de>
> Date: 27/01/2021, 07:28
> To: <eprints-tech@ecs.soton.ac.uk>
>
> Hi list,
> I am trying to get EPrints working on CentOS 8.
> So far it appears to work, but on the admin status page the indexer
> is marked as stopped, although it is running and the task queue is empty.
>
> systemctl reports that it is running, and
> sudo -u eprints /usr/share/eprints/bin/indexer status
> does too:
> Indexer is running with PID 10960. Next index in 27 seconds
>
> How does the status page check the state of the indexer?
>
>
> Thanks
> for any hints.
>
>
>

*** Options:
http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/


