EPrints Technical Mailing List Archive

Message: #08815



Re: [EP-tech] mixed-content warnings


Hi Tomasz,


Mixed content warnings are something I have been trying to improve in recent versions of EPrints, so new installs should not suffer from these problems.  However, upgrades will still be problematic, because old templates, citations, workflows and even CSS and JavaScript files may have http URLs in them.  This means you really need to go through all these files and seek out http URLs.


The main problem I have found is the use of http_url or http_cgiurl in templates, citations and even workflows.  These should ideally use rel_path and rel_cgipath instead, but as these do not give you the full URL it might be better to use base_url and perl_url instead.  However, to make sure that these are https rather than http, you will need to make sure you have either no version or an up-to-date version of 20_baseurls.pl in your archive's cfg/cfg.d/ (assuming you are running 3.4.1+, which it sounds like you are).  This is because of a change made for 3.4.1 to ensure that base_url and perl_url get configured as https if $c->{securehost} is defined.


It is worth grepping across all of your archive's cfg directory for the string "http:" to root out any hardcoded http URLs.
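As a rough illustration of that search, here is a minimal Python sketch that walks a directory tree and reports every line containing "http:". The function name find_http_urls is purely illustrative and not part of EPrints, and a plain recursive grep would do the same job.

```python
from pathlib import Path


def find_http_urls(cfg_dir, needle="http:"):
    """Return (path, line_number, line) for each line under cfg_dir containing needle.

    Illustrative helper only; it is not part of EPrints.
    """
    hits = []
    for path in sorted(Path(cfg_dir).rglob("*")):
        if not path.is_file():
            continue
        try:
            # errors="replace" keeps the scan going over files that are not valid UTF-8
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # unreadable file, skip it
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling find_http_urls on your archive's cfg directory returns a list you can print or work through file by file. Note that "https:" does not match the default needle, so only plain http references are reported, including harmless ones such as XML namespace declarations.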


One of the things I did in recent versions of EPrints is provide a more intuitive way of reconfiguring 10_core.pl to enable HTTPS everywhere [1].  This ensures all http URL requests are redirected to https without the client needing to have picked up the HSTS header, which requires visiting an https URL at least once (and therefore does not work for stateless bots).  If you deploy HTTPS everywhere, then as well as running generate_apacheconf and reloading the webserver, you will need to make sure all browse views and abstract pages are regenerated.


As you comment in your email below, you are worried about unsetting $c->{host} as it may break things.  I am aware of one issue with this in 3.4.3 core code [2].  However, this is a fairly straightforward fix and is only a problem if you have multiple languages enabled for your repository.  If you use the Repository Links Bazaar plugin [3], that will also require a similar fix.  I think there may be one or two other Bazaar plugins that use $c->{host}, but I cannot remember what they are off the top of my head.


If you look at perl_lib/EPrints/URL.pm line 129 [4] you should see the line:


if ( EPrints::Utils::is_set( $session->config( "securehost" ) ) && ( $opts{scheme} eq "https" || !EPrints::Utils::is_set( $session->config( "host" ) ) ) )


If you have the HTTPS everywhere configuration enabled, this should ensure HTTPS URLs are always used for things like the thumbnail URLs you describe having a problem with.  However, if you are not using the HTTPS everywhere configuration, you will still get http URLs for thumbnails and similar.  I would therefore recommend enabling this, and I will see if I can track down the Bazaar plugins that may be affected by $c->{host} being undefined.
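To make the effect of that condition concrete, here is a small Python model of the test alone. It is not EPrints code: is_set is only a rough stand-in for EPrints::Utils::is_set, takes_https_branch is a made-up name, and the dictionary stands in for the repository configuration.

```python
def is_set(value):
    """Rough stand-in for EPrints::Utils::is_set: defined and not an empty string."""
    return value is not None and value != ""


def takes_https_branch(config, scheme):
    """Model of the quoted condition: securehost is set AND
    (an https scheme was requested OR host is not set)."""
    return is_set(config.get("securehost")) and (
        scheme == "https" or not is_set(config.get("host"))
    )


# Hypothetical hostnames, for illustration only.
both_set = {"host": "repo.example.org", "securehost": "repo.example.org"}
https_everywhere = {"securehost": "repo.example.org"}

# With host still defined, a caller that does not ask for https misses the branch.
assert takes_https_branch(both_set, "http") is False
# With host undefined, the same caller takes the https branch.
assert takes_https_branch(https_everywhere, "http") is True
```

This matches what you saw when commenting out $c->{host}: with host defined, the outcome depends on the caller passing an https scheme, whereas with host undefined the securehost branch is taken regardless.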


The problem with EPrints is it has gone through various iterations of HTTP/HTTPS use:


1. No HTTPS

2. HTTP for public pages and HTTPS for back-end admin pages.

3. HTTPS for all pages


This means that, as the code has evolved, configuring the appropriate URLs in various situations has become progressively more complicated, as ways of supporting these different approaches to HTTPS have been incorporated into EPrints over the years.  I go into a bit of detail about this on the EPrints 3.4.3 release page [5].  I still don't think this is perfect, as Bazaar plugins or bespoke archive code/configuration may potentially require $c->{host} to be defined.  However, after a lot of consideration, the changes I made for 3.4.3 tried to strike the best compromise between fixing the mixed content warnings, simplifying the URL config variables and their use, and not seriously breaking existing repositories when they are upgraded.


Regards


David Newman


[1] https://wiki.eprints.org/w/Simplified_HTTPS_Configuration

[2] https://github.com/eprints/eprints3.4/issues/118

[3] http://bazaar.eprints.org/379/

[4] https://github.com/eprints/eprints3.4/blob/master/perl_lib/EPrints/URL.pm#L129

[5] https://wiki.eprints.org/w/EPrints_3.4.3#Configuration_URLs_and_Paths


On 23/12/2021 23:12, Tomasz Neugebauer via Eprints-tech wrote:
I thought that I resolved all of the "mixed content" warnings on our repository a while back, but after a recent upgrade from 3.3.12 to 3.4.3, I noticed that I have some mixed content warnings again, specifically on the thumbnails on the abstract pages.  I might have missed some of these warnings before, though, so this might not be a new issue after the upgrade.

Because I have HSTS headers, the browser redirects those requests to HTTPS, but I would like to fix it.  Both the SRC and the HREF of the thumbnails for PDFs are referenced as HTTP instead of HTTPS.  The only thing that fixed it during my testing was removing (commenting out) the $c->{host} line/variable in 10_core.pl.
That resolves the issue, but I'm worried about applying this change because I don't know if something else might rely on that variable.

I spent a good part of a day trying to follow the code, and I know that the {scheme} variable in URL.pm doesn't get properly set to https in the case of the thumbnails, but the code is so confusing when it comes to the thumbnail URLs that I can't figure out why.  I do have a suspicion that there is a bug in the core code somewhere, but perhaps it is something in our own configuration. 
I know this issue is not new to this list, in fact, I wrote the first drafts of the HSTS page on the Wiki (https://wiki.eprints.org/w/HTTPS-only_and_HSTS), but looking through the updated page there and any recent exchanges that relate to this didn't help me figure it out.  
Let me know if you have any ideas?

Best wishes,
Tomasz




*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
