EPrints Technical Mailing List Archive

Message: #08840



Re: [EP-tech] Internal server error when refreshing views


Hi Jim,

Regarding installing XML::LibXML for EPrints, I think the guide you may have seen is on the EPrints installation page for Debian / Ubuntu:

https://wiki.eprints.org/w/Installing_EPrints_on_Debian/Ubuntu#Installing_EPrints_from_Source

My general advice would always be to install from the OS's package manager rather than CPAN. You may get a slightly older version of the Perl module, but it will be updated automatically when you do package upgrades. If you install from CPAN, there is a good chance the module will never be upgraded until you next move server.

I don't think it matters that the version specified is 1.63. As I understand it, Perl's use statement will not complain unless the installed version is earlier than this. I typically run version 2.0018 at the moment, and EPrints does not complain about this not being 1.63. Why 1.63 was specified is a legacy thing from before my time; I assume there was functionality missing in earlier versions. Now that no one would be using such an old version, the check is probably superfluous. Although it should make no difference on modern systems, it is probably best to leave the version number alone to avoid any unintended consequences.

The way that EPrints XML init code works is as follows:

1. Tests if XML::LibXML is installed and returns if yes.

2. If not returned, prints a deprecation warning.

3. Tests if XML::GDOME is installed and returns if yes.

4. If not returned, uses XML::DOM, which is provided as part of the EPrints codebase and so is initialised for XML as a last resort. This is what caused your initial error message, as the version of XML::DOM packaged with EPrints has not been updated for quite some time.

I am looking towards using only XML::LibXML in later versions of EPrints, probably 3.5.x. This is probably a couple of years off based on current timescales. At that point EPrints would fail when you run "epadmin test" and would tell you how to fix it (i.e. install XML::LibXML). As the other options will be removed, I don't think there is much point updating the version of XML::DOM packaged with EPrints to address the original issue you reported.

Regards

David Newman

On 17/01/2022 19:32, Jim Brinkley via Eprints-tech wrote:

John,

That did it! A few more specifics in case someone else has this problem:

 

The main issue turned out to be that I hadn’t installed XML::LibXML even though I thought I had. I found that out when I tried your suggestion:

perl -e 'use XML::LibXML 1.63;' which told me the module wasn’t installed.

 

So in Ubuntu I did apt-get install libxml-libxml-perl, which I saw somewhere is recommended over installing from CPAN (I don’t remember where). I don’t know whether this installed 1.63, but the perl “use ...” statement seemed to be OK.

 

After this, “epadmin test” ran with no errors, as did “generate_views”, and I no longer get the Internal Server Error when I go to a new view.

 

A little more detail:

After looking at XML.pm I see (as you note) that the system will try to use LibXML if $c->{enable_libxml} does not exist (which it doesn’t in my case). Given that I hadn’t installed LibXML properly, the "use EPrints::XML::LibXML; 1" at line 73 will fail and it will default to requiring EPrints::XML::DOM at line 85 (assuming I remember my perl correctly).

 

I also noted that the $c data structure comes from SystemSettings::conf, which I believe is written by the installer. So somehow the configure script of the installer must have thought I did have LibXML. That’s easily possible because I first installed this in 2019 on CentOS after migrating and upgrading from an earlier EPrints 2 installation, then migrated again to Ubuntu. So I probably missed some step in the migration process.

 

In any case it seems to work now. I really appreciate your taking the time to help me with this. And after looking at the code more I also appreciate how much effort has gone into creating this very nice program.

 

Thanks,

 

Jim

 

From: John Salter <J.Salter@leeds.ac.uk>
Date: Monday, January 17, 2022 at 2:52 AM
To: Jim Brinkley <brinkley@uw.edu>, "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>
Subject: RE: Internal server error when refreshing views

 

Hi Jim,
I've had a better look, and this is the block that checks which XML modules to use, and also produces the warning you see:
https://github.com/eprints/eprints3.4/blob/master/perl_lib/EPrints/XML.pm#L65-L86

 

The option to select which XML library to use is:

$c->{enable_libxml}

 

From the above, EPrints will try to use LibXML if either:

- $c->{enable_libxml} *does not* exist

- $c->{enable_libxml} exists and is set to 1

 

If either:

- 'enable_libxml' is set to 0

- the module EPrints::XML::LibXML produces errors

then the deprecation warning message is produced.

 

So, try grepping for 'enable_libxml' in

~/lib/

~/archives/ARCHIVEID/cfg/

~/cfg/

 

First I would try checking LibXML is happy. Try running these on the commandline (as the EPrints user):
perl -e 'use XML::LibXML 1.63;'

perl -e 'use XML::LibXML::SAX;'

If either of the above lines produces errors or warnings, I'd look at fixing them first.

 

Secondly, I'd try grepping for 'enable_libxml' in:

  [EPRINTS_ROOT]/lib/

  [EPRINTS_ROOT]/cfg/
  [EPRINTS_ROOT]/archives/[ARCHIVE_ID]/cfg/
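
As a rough sketch, the search over those three directories could be wrapped up like this. find_enable_libxml is a hypothetical helper, and the example path and archive ID are simply the ones that appear elsewhere in this thread; substitute your own.

```shell
# Hypothetical helper: lists every line mentioning enable_libxml under
# the configuration directories named above.
# Arguments: the EPrints root directory and the archive ID.
find_enable_libxml() {
    root=$1
    archive=$2
    for dir in "$root/lib" "$root/cfg" "$root/archives/$archive/cfg"; do
        if [ -d "$dir" ]; then
            # grep exits non-zero when nothing matches; that is not an error here.
            grep -rn 'enable_libxml' "$dir" || true
        fi
    done
}

# /opt/eprints3 and sigpubs are the values that appear elsewhere in this thread.
find_enable_libxml /opt/eprints3 sigpubs
```

No output means the option is not set anywhere in those directories, in which case EPrints should try LibXML by default.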

 

If it is explicitly disabled, try commenting-out that line, and running:
  [EPRINTS_ROOT]/bin/epadmin test

to see if things look happy.

 

Cheers,

John

 

 

From: John Salter
Sent: 14 January 2022 20:38
To: Jim Brinkley <brinkley@uw.edu>; eprints-tech@ecs.soton.ac.uk
Subject: Re: Internal server error when refreshing views

 

Hi Jim,

If memory serves me correctly, somewhere* in the EPrints config, there is an option to use either DOM, or LibXML.

This faint memory ties in with your note about previous upgrades. It seems like your install might not be using LibXML, even though you've added the packages to the server.

 

*the 'somewhere' is possibly the crux here. My v3.4 knowledge isn't as ingrained as v3.3, and I'm not at my computer at the moment.

 

If you try grepping for 'DOM' in:

~/lib/cfg

~/perl_lib/SystemSettings.pl

~/cfg/

~/archives/ARCHIVE_ID/cfg/

do you find any options that say 'use XML::DOM' in some way (although not a literal perl 'use XML::DOM' statement)?

 

Cheers,

John


From: Jim Brinkley <brinkley@uw.edu>
Sent: 14 January 2022 20:13
To: John Salter <J.Salter@leeds.ac.uk>; eprints-tech@ecs.soton.ac.uk <eprints-tech@ecs.soton.ac.uk>
Cc: Jim Brinkley <brinkley@uw.edu>
Subject: Re: Internal server error when refreshing views

 

PS I just noticed that the error I see in the Apache server log whenever I get the Internal Server Error is exactly the same one I see at the end of the output below.

 

 

From: Jim Brinkley <brinkley@uw.edu>
Date: Friday, January 14, 2022 at 12:07 PM
To: John Salter <J.Salter@leeds.ac.uk>, "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>
Cc: Jim Brinkley <brinkley@uw.edu>
Subject: Re: Internal server error when refreshing views

 

John,

Thanks for your quick reply. My guess is the problem is not to do with the specific subject “3-D Reconstruction” because it occurs for all subjects in my list, including single-word subjects, such as “MindSeer”. However I just now ran

[EPRINTS_ROOT]/bin/generate_views [ARCHIVE_ID] --view subjects

as you suggested, and got the following output:

 

eprints@synapse:/opt/eprints3$ bin/generate_views sigpubs --view subjects

*** DEPRECATION WARNING ***

In future versions, EPrints will be standardising to only support the LibXML library for providing XML functionality.  Please ensure LibXML is installed before upgrading EPrints.

Subroutine parse_xml_string redefined at /opt/eprints3/bin/../perl_lib/EPrints/XML/DOM.pm line 119.

Subroutine _parse_url redefined at /opt/eprints3/bin/../perl_lib/EPrints/XML/DOM.pm line 144.

Subroutine parse_xml redefined at /opt/eprints3/bin/../perl_lib/EPrints/XML/DOM.pm line 164.

Subroutine event_parse redefined at /opt/eprints3/bin/../perl_lib/EPrints/XML/DOM.pm line 211.

Subroutine _dispose redefined at /opt/eprints3/bin/../perl_lib/EPrints/XML/DOM.pm line 248.

Subroutine clone_and_own redefined at /opt/eprints3/bin/../perl_lib/EPrints/XML/DOM.pm line 261.

Subroutine document_to_string redefined at /opt/eprints3/bin/../perl_lib/EPrints/XML/DOM.pm line 276.

Subroutine make_document redefined at /opt/eprints3/bin/../perl_lib/EPrints/XML/DOM.pm line 286.

Subroutine version redefined at /opt/eprints3/bin/../perl_lib/EPrints/XML/DOM.pm line 295.

Can't use an undefined value as an ARRAY reference at /opt/eprints3/bin/../perl_lib/XML/DOM/NamedNodeMap.pm line 142.

 

I’ve seen something like this every time I’ve run generate_views. I thought maybe I should update LibXML, so in Ubuntu I did apt install libxml-perl, and similar for some of the other ones in the above list. But from the message above it looks like the perl modules in eprints3..perl_lib are redefining existing ones in Ubuntu, so that last error seems to be happening in the version of XML::DOM in eprints3..perl_lib.

 

I should also say that the last time I moved sigpubs to a new server (around 2019) I upgraded from Eprints 2 to 3, and I think this problem has been happening ever since. I pretty much ignored it until now because I didn’t have time to deal with it, and I could always tell people to just refresh the browser, but now I have a bit more time and it would be nice to fix this.

 

Jim

 

 

 

From: John Salter <J.Salter@leeds.ac.uk>
Date: Friday, January 14, 2022 at 11:27 AM
To: "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>, Jim Brinkley <brinkley@uw.edu>
Subject: Re: Internal server error when refreshing views

 

Hi Jim,
At a guess, this sounds like something that is trying to group records together into a browse view mis-treating the '3-D ...' text - maybe incorrectly normalising it to create the A / B / C ... links at the top of the view page.

 

If you have access to the server, and run:

[EPRINTS_ROOT]/bin/generate_views [ARCHIVE_ID] --view subjects

does it give any additional warnings/errors?

 

Cheers,

John

 


From: eprints-tech-bounces@ecs.soton.ac.uk <eprints-tech-bounces@ecs.soton.ac.uk> on behalf of Jim Brinkley via Eprints-tech <eprints-tech@ecs.soton.ac.uk>
Sent: 14 January 2022 19:02
To: eprints-tech@ecs.soton.ac.uk <eprints-tech@ecs.soton.ac.uk>
Subject: [EP-tech] Internal server error when refreshing views

 


Hi,

I run an eprints3 site at sigpubs.si.washington.edu. It generally works fine, except that when a view regenerates I get an internal server error.

 

To recreate this issue I can log in, go to Admin:System Tools, and then click on Regenerate Views, which, as I understand it, causes all views to be regenerated whenever I request them. For example, I can go to the menu Browse:Browse by Subjects, then click on any of the subjects in my customized subject list, for example my first subject, "3-D Reconstruction". I then get "Internal Server Error". If I then refresh the browser page the error goes away and the correct view appears. This view then remains correct for as long as I've tested it, but I think there may be a timeout after which it gets regenerated again.

 

I looked in the apache server log and find this error whenever I get the Internal Server Error:

Can't use an undefined value as an ARRAY reference at /opt/eprints3/perl_lib/XML/DOM/NamedNodeMap.pm line 142.\n, referer: http://sigpubs.si.washington.edu/view/subjects/

 

I am running what I think is the latest stable version of Eprints: 3.4.3, on Ubuntu 20.04, perl version 5.30, apache2. Apache is running as www-data.www-data. The repository is owned by eprints.eprints, and www-data is in group eprints so it can write files to the repository. I don't think permissions are the issue because once I refresh the page the view is OK, and the files in the views directory are changed.

 

I've searched the web and this mailing list and can't find this particular situation. Any suggestions? Thanks.

 

Jim Brinkley

Structural Informatics Group

University of Washington

Seattle USA

http://si.washington.edu

 


*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
