[EP-tech] Re: repos with a mix of HTTP and HTTPS
> From: eprints-tech-bounces at ecs.soton.ac.uk
> [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of Yuri
> Sent: Monday, 23 November 2015 18:02
> To: eprints-tech at ecs.soton.ac.uk
>
> On 23/11/2015 03:34, Matthew Kerwin wrote:
>>
>> Hi EPrintsers, I have a query about serving a repository with a mix of
>> HTTP and HTTPS.
>>
>> Currently our two repositories have a pretty standard setup: the bulk
>> of the site is served over plaintext HTTP, including untrusted session
>> cookies. Secure/administrative functions are served over HTTPS.
>>
>
> Interesting, can you post how you did it? It could be useful.
>
Sure, essentially we set up this sort of core config:
$c->{host} = 'example.org';
$c->{port} = 80;
$c->{securehost} = 'example.org';
$c->{secureport} = 443;
$c->{securepath} = undef; # in theory both the secure and insecure paths are '/'
$c->{http_cgi_root} = '/cgi';
$c->{https_root} = '/secure';
$c->{https_cgi_root} = '/secure/cgi';
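For what it's worth, my reading of how those settings map onto URLs is roughly
this (a sketch of my understanding, not gospel):
-----%<-----
# plaintext side (the bulk of the site):
#   http://example.org/                 ordinary pages and static content
#   http://example.org/cgi/...          ordinary CGI scripts
# secure side (the secure/administrative functions):
#   https://example.org/secure/...      the secure area
#   https://example.org/secure/cgi/...  secure CGI scripts
----->%-----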
We also did a bit of work setting up basic httpd VHost rules, with pointers
to our certificates, and a simple rewrite rule to make things run smoothly
[see below]. I think EPrints itself (via its EPrints::Apache::Rewrite handler)
generates most of the bounces between HTTP and HTTPS, as well as the
appropriate relative URLs.
This is what I want to unpick.
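For reference, the plain-HTTP side is basically stock: the auto-generated
/opt/eprints3/cfg/apache/repo.conf that gets Included at the top of the
footnote config carries the port-80 VirtualHost. From memory it boils down to
something like this (treat it as a sketch; the non-SSL include name is my
guess, by analogy with the _ssl one in the footnote):
-----%<-----
<VirtualHost 1.2.3.4:80>
ServerName example.com
# EPrints itself maps URLs to abstracts, documents, CGI scripts, etc.
PerlTransHandler +EPrints::Apache::Rewrite
# standard per-archive VHost config (guessed filename)
Include /opt/eprints3/archives/repo/cfg/apachevhost.conf
</VirtualHost>
----->%-----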
>
>> We want to reconfigure the server to use HTTPS for the entire site
>> (for various reasons, Google search rankings high amongst them.)
>> However we want to retain the option of plaintext HTTP access so that
>> some less modern external indexers and crawlers can continue to do
>> their thing.
>>
>
> What is the problem with the spiders using HTTPS instead of HTTP? I would
> switch entirely to HTTPS.
>
Sure, if I owned the spiders. But our repositories are accessed by other
robots within the university (and I don't have the political clout to force
them to rewrite/upgrade to HTTPS), and by external robots, including some
from the government (and I have no say at all in how those work).
I want to go entirely HTTPS, but I need to allow some of those robots access
over plaintext.
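I haven't settled on anything yet, but the obvious shape of it is to force
HTTPS on the port-80 VirtualHost except for a whitelist of the robots that
can't cope. Roughly the following sketch, where the IP range and User-Agent
string are made-up placeholders rather than our real robots:
-----%<-----
# in the :80 VirtualHost
RewriteEngine On
# skip the redirect for a whitelisted address range (placeholder range)...
RewriteCond %{REMOTE_ADDR} !^192\.0\.2\.
# ...and for a legacy crawler identified by User-Agent (placeholder string)
RewriteCond %{HTTP_USER_AGENT} !LegacyHarvester
# everything else gets bounced to HTTPS
RewriteRule ^/?(.*) https://%{SERVER_NAME}/$1 [L,R=301]
----->%-----
The conditions are ANDed, so the redirect only fires when neither whitelist
entry matches; anything matching either one stays on plain HTTP. Whether that
plays nicely with EPrints' own rewrite handler is exactly the sort of thing I
want to confirm.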
* footnote: here's the basic gist of the httpd config we use:
-----%<-----
Include /opt/eprints3/cfg/apache/repo.conf
<VirtualHost 1.2.3.4:443>
ServerName example.com
SSLEngine On
SSLCertificateFile /path/to/repo.crt
SSLCertificateKeyFile /path/to/repo.key
PerlTransHandler +EPrints::Apache::Rewrite
# standard VHost config
Include /opt/eprints3/archives/repo/cfg/apachevhost_ssl.conf
# auto-generated; defines <Location /secure>
Include /opt/eprints3/apache_ssl/repo.conf
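# requests outside /secure on this HTTPS vhost get bounced back to plain HTTP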
RewriteEngine On
RewriteCond %{REQUEST_URI} !^/secure/
RewriteRule ./* http://%{SERVER_NAME}%{REQUEST_URI} [L,R=301]
</VirtualHost>
----->%-----
To be honest I haven't paid much attention to this config for a while, so
some of it could probably be tidied up or done differently.
Cheers
--
Matthew Kerwin | QUT Library eServices | matthew.kerwin at qut.edu.au