EPrints Technical Mailing List Archive

Message: #08838



Re: [EP-tech] Internal server error when refreshing views


Hi both,

I thought I should provide a bit of context on the introduction of the deprecation warning.

I have been trying to extend the acceptance testing for the EPrints codebase as part of my CI framework.  This means testing of different EPrints setups.  At the moment I run separate daily testing for the zero and publication flavours of EPrints 3.4.  Having three potential XML libraries (XML::LibXML, XML::GDOME and XML::DOM) used by EPrints would require running separate testing against each library. 

Therefore, I did some investigation into the status of all three libraries.  My conclusion was that LibXML is effectively the industry standard.  It is easily available on all platforms where EPrints is supported and looks the best supported going forward.  The latter is very important, as XML import/export is a critical EPrints feature and also a potent vector for cyberattacks.  So I want to make sure the underlying library will be patched for vulnerabilities and that those patches will be rolled out in a way that makes it easy for those deploying EPrints to upgrade (i.e. through OS package management).

As John has identified, EPrints 3.4 (and 3.3) will try to use LibXML unless it is explicitly disabled (slightly confusingly, by setting enable_libxml to 0).  If it is not disabled and XML::LibXML is not installed, then XML initialisation will fail rather than checking whether XML::GDOME is installed.  Therefore, as he suggests, grepping for 'enable_libxml' through the directories he lists should help you find the config line you need to comment out.  I would also add ~/site_lib/ to that list, if it exists on your EPrints repository server.
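
For anyone following along, here is a minimal sketch of that search as a shell function. The function name find_enable_libxml and its two arguments are made up for illustration and are not part of EPrints. The directories are the ones John lists plus site_lib, and the example values in the comments are the install path and archive ID mentioned elsewhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of EPrints): print every line that
# mentions enable_libxml under the config directories named in this thread.
find_enable_libxml() {
    root=$1      # EPrints install root, e.g. /opt/eprints3
    archive=$2   # archive ID, e.g. sigpubs
    for dir in "$root/lib" "$root/cfg" "$root/site_lib" \
               "$root/archives/$archive/cfg"; do
        # site_lib does not exist on every install, so skip missing dirs
        [ -d "$dir" ] || continue
        # -r searches recursively, -n prints the line number of each hit
        grep -rn 'enable_libxml' "$dir" 2>/dev/null || true
    done
    return 0
}

# Example call (run as the EPrints user):
#   find_enable_libxml /opt/eprints3 sigpubs
```

Each hit is printed as file:line:text, so a line that sets the option to 0 should be easy to spot and comment out.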

Regards

David Newman

On 17/01/2022 10:52, John Salter via Eprints-tech wrote:
CAUTION: This e-mail originated outside the University of Southampton.

Hi Jim,
I've had a better look, and this is the block that checks which XML modules to use, and also produces the warning you see:
https://github.com/eprints/eprints3.4/blob/master/perl_lib/EPrints/XML.pm#L65-L86

 

The option to select which XML library to use is:

$c->{enable_libxml}

 

From the above, EPrints will try to use LibXML if either:
- $c->{enable_libxml} *does not* exist

- $c->{enable_libxml} exists and is set to 1

 

If either:
-  'enable_libxml' is set to 0

- the module EPrints::XML::LibXML produces errors

then the deprecation warning message is produced.
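
The rules above can be restated as a small truth table. This is only a sketch of the summary in this message, not the code in EPrints/XML.pm; the function name xml_choice and the argument values "unset" and "ok" are invented for illustration.

```shell
#!/bin/sh
# Hypothetical restatement of the summary above (not the EPrints code).
# $1: value of enable_libxml: "unset", "0" or "1"
# $2: "ok" if EPrints::XML::LibXML loads without errors, anything else if not
xml_choice() {
    case $1 in
        unset|1)
            # LibXML is attempted; errors from the module lead to the warning
            if [ "$2" = "ok" ]; then
                echo "LibXML"
            else
                echo "deprecation warning"
            fi
            ;;
        0)
            # Explicitly disabled
            echo "deprecation warning"
            ;;
        *)
            echo "not covered by the summary above"
            ;;
    esac
}

# Example: xml_choice 0 ok
```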

 

So, try grepping for 'enable_libxml' in

~/lib/

~/archives/ARCHIVEID/cfg/

~/cfg/

 

First I would try checking LibXML is happy. Try running these on the command line (as the EPrints user):
perl -e 'use XML::LibXML 1.63;'

perl -e 'use XML::LibXML::SAX;'

If either of the above lines produces errors or warnings, I'd look at fixing them first.
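
If it helps, the two checks can be wrapped in a small shell function that reports anything Perl prints as well as a non-zero exit. This is only a sketch; the name check_perl_module is made up for illustration, and it assumes perl is on the PATH of the EPrints user.

```shell
#!/bin/sh
# Hypothetical helper: try to load a Perl module and report any output.
# $1 is whatever would follow "use" in Perl, e.g. "XML::LibXML 1.63".
check_perl_module() {
    # Capture stdout and stderr so that warnings are noticed as well as errors
    output=$(perl -e "use $1;" 2>&1) && status=0 || status=$?
    if [ "$status" -eq 0 ] && [ -z "$output" ]; then
        echo "ok: $1"
    else
        echo "problem: $1"
        [ -n "$output" ] && echo "$output"
        return 1
    fi
}

# Example calls, matching the two commands above:
#   check_perl_module "XML::LibXML 1.63"
#   check_perl_module "XML::LibXML::SAX"
```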

 

Secondly, I'd try grepping for 'enable_libxml' in:

  [EPRINTS_ROOT]/lib/

  [EPRINTS_ROOT]/cfg/
  [EPRINTS_ROOT]/archives/[ARCHIVE_ID]/cfg/

 

If it is explicitly disabled, try commenting out that line and running:
  [EPRINTS_ROOT]/bin/epadmin test

to see if things look happy.

 

Cheers,

John

 

 

From: John Salter
Sent: 14 January 2022 20:38
To: Jim Brinkley <brinkley@uw.edu>; eprints-tech@ecs.soton.ac.uk
Subject: Re: Internal server error when refreshing views

 

Hi Jim,

If memory serves me correctly, somewhere* in the EPrints config, there is an option to use either DOM, or LibXML.

This faint memory ties in with your note about previous upgrades. It seems like your install might not be using LibXML, even though you've added the packages to the server.

 

*the 'somewhere' is possibly the crux here. My v3.4 knowledge isn't as ingrained as v3.3, and I'm not at my computer at the moment.

 

If you try grepping for 'DOM' in:

~/lib/cfg

~/perl_lib/SystemSettings.pl

~/cfg/

~/archives/ARCHIVE_ID/cfg/

do you find any options that say 'use XML::DOM' in some way (although not a literal perl 'use XML::DOM' statement)?

 

Cheers,

John


From: Jim Brinkley <brinkley@uw.edu>
Sent: 14 January 2022 20:13
To: John Salter <J.Salter@leeds.ac.uk>; eprints-tech@ecs.soton.ac.uk <eprints-tech@ecs.soton.ac.uk>
Cc: Jim Brinkley <brinkley@uw.edu>
Subject: Re: Internal server error when refreshing views

 

PS I just noticed that the error I see in the Apache server log whenever I get the Internal Server Error is exactly the same one I see at the end of the output below.

 

 

From: Jim Brinkley <brinkley@uw.edu>
Date: Friday, January 14, 2022 at 12:07 PM
To: John Salter <J.Salter@leeds.ac.uk>, "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>
Cc: Jim Brinkley <brinkley@uw.edu>
Subject: Re: Internal server error when refreshing views

 

John,

Thanks for your quick reply. My guess is the problem is not to do with the specific subject “3-D Reconstruction” because it occurs for all subjects in my list, including single-word subjects such as “MindSeer”. However, I just now ran

[EPRINTS_ROOT]/bin/generate_views [ARCHIVE_ID] --view subjects

as you suggested, and got the following output:

 

eprints@synapse:/opt/eprints3$ bin/generate_views sigpubs --view subjects
*** DEPRECATION WARNING ***
In future versions, EPrints will be standardising to only support the LibXML library for providing XML functionality.  Please ensure LibXML is installed before upgrading EPrints.
Subroutine parse_xml_string redefined at /opt/eprints3/bin/../perl_lib/EPrints/XML/DOM.pm line 119.
Subroutine _parse_url redefined at /opt/eprints3/bin/../perl_lib/EPrints/XML/DOM.pm line 144.
Subroutine parse_xml redefined at /opt/eprints3/bin/../perl_lib/EPrints/XML/DOM.pm line 164.
Subroutine event_parse redefined at /opt/eprints3/bin/../perl_lib/EPrints/XML/DOM.pm line 211.
Subroutine _dispose redefined at /opt/eprints3/bin/../perl_lib/EPrints/XML/DOM.pm line 248.
Subroutine clone_and_own redefined at /opt/eprints3/bin/../perl_lib/EPrints/XML/DOM.pm line 261.
Subroutine document_to_string redefined at /opt/eprints3/bin/../perl_lib/EPrints/XML/DOM.pm line 276.
Subroutine make_document redefined at /opt/eprints3/bin/../perl_lib/EPrints/XML/DOM.pm line 286.
Subroutine version redefined at /opt/eprints3/bin/../perl_lib/EPrints/XML/DOM.pm line 295.
Can't use an undefined value as an ARRAY reference at /opt/eprints3/bin/../perl_lib/XML/DOM/NamedNodeMap.pm line 142.

 

I’ve seen something like this every time I’ve run generate_views. I thought maybe I should update LibXML, so in Ubuntu I did apt install libxml-perl, and similar for some of the other ones in the above list. But from the message above it looks like the perl modules in eprints3..perl_lib are redefining existing ones in Ubuntu, so that last error seems to be happening in the version of XML::DOM in eprints3..perl_lib.

 

I should also say that the last time I moved sigpubs to a new server (around 2019) I upgraded from Eprints 2 to 3, and I think this problem has been happening ever since. I pretty much ignored it until now because I didn’t have time to deal with it, and I could always tell people to just refresh the browser, but now I have a bit more time and it would be nice to fix this.

 

Jim

 

 

 

From: John Salter <J.Salter@leeds.ac.uk>
Date: Friday, January 14, 2022 at 11:27 AM
To: "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>, Jim Brinkley <brinkley@uw.edu>
Subject: Re: Internal server error when refreshing views

 

Hi Jim,
At a guess, this sounds like something that is trying to group records together into a browse view mis-treating the '3-D ...' text - maybe incorrectly normalising it to create the A / B / C ... links at the top of the view page.

 

If you have access to the server, and run:

[EPRINTS_ROOT]/bin/generate_views [ARCHIVE_ID] --view subjects

does it give any additional warnings/errors?

 

Cheers,

John

 


From: eprints-tech-bounces@ecs.soton.ac.uk <eprints-tech-bounces@ecs.soton.ac.uk> on behalf of Jim Brinkley via Eprints-tech <eprints-tech@ecs.soton.ac.uk>
Sent: 14 January 2022 19:02
To: eprints-tech@ecs.soton.ac.uk <eprints-tech@ecs.soton.ac.uk>
Subject: [EP-tech] Internal server error when refreshing views

 


Hi,

I run an eprints3 site at sigpubs.si.washington.edu. It generally works fine, except that when a view regenerates I get an internal server error.

 

To recreate this issue I can log in, go to Admin:System Tools, and then click on Regenerate Views, which, as I understand it, causes all views to be regenerated whenever I request them. For example, I can go to the menu Browse:Browse by Subjects, then click on any of the subjects in my customized subject list, for example my first subject, "3-D Reconstruction". I then get "Internal Server Error". If I then refresh the browser page the error goes away and the correct view appears. This view then remains correct for as long as I've tested it, but I think there may be a timeout after which it gets regenerated again.

 

I looked in the Apache server log and found this error whenever I get the Internal Server Error:

Can't use an undefined value as an ARRAY reference at /opt/eprints3/perl_lib/XML/DOM/NamedNodeMap.pm line 142.\n, referer: http://sigpubs.si.washington.edu/view/subjects/

 

I am running what I think is the latest stable version of EPrints: 3.4.3, on Ubuntu 20.04, perl version 5.30, apache2. Apache is running as www-data.www-data. The repository is owned by eprints.eprints, and www-data is in group eprints so it can write files to the repository. I don't think permissions are the issue because once I refresh the page the view is OK, and the files in the views directory are changed.

 

I've searched the web and this mailing list and can't find this particular situation. Any suggestions? Thanks.

 

Jim Brinkley

Structural Informatics Group

University of Washington

Seattle USA

http://si.washington.edu

 


*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
