EPrints Technical Mailing List Archive

Message: #09303


Re: [EP-tech] Cache Files - Always look for new

CAUTION: This e-mail originated outside the University of Southampton.


Can you show the HTTP headers returned when the file is hit (for example, when you append ?a=b)? I think Apache is not setting the appropriate headers, based on the file timestamp.
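One way to collect those headers is sketched below, as an illustration only. It sends a HEAD request and keeps just the caching-related response headers. The function names and the list of headers to keep are assumptions made for this example, not anything EPrints or Apache provides.

```python
from typing import Dict, Mapping
from urllib.request import Request, urlopen

# Response headers that influence browser caching (an illustrative selection).
CACHE_HEADERS = ("cache-control", "pragma", "expires", "etag",
                 "last-modified", "age", "vary")


def caching_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Return only the caching-related headers, with lower-cased names."""
    wanted = {}
    for name, value in headers.items():
        if name.lower() in CACHE_HEADERS:
            wanted[name.lower()] = value
    return wanted


def fetch_caching_headers(url: str) -> Dict[str, str]:
    """Send a HEAD request to `url` and return its caching-related headers."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=10) as response:
        return caching_headers(dict(response.headers.items()))
```

Calling fetch_caching_headers() once on the plain PDF URL and once on the same URL with ?a=b appended would show whether the Cache-Control and Last-Modified values differ between the two.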

On 04/05/23 16:14, James Kerwin via Eprints-tech wrote:
Hi John,

Thanks for this. I noticed my requests weren't even getting sent to the server according to the access logs. I tried appending ?a to the end and that request did make it and I can see it in the logs at the expected time.

It's very frustrating, but I think I'm going to need to find a polite way of saying "please press the refresh button". That or wait until their cache clears itself at some unknown date in the future. I may attempt to make the ssl changes in my normal http conf file to see what that does. I expect it won't work as we're now entirely https, but who knows (not me!).


On Thu, May 4, 2023 at 2:55 PM John Salter <J.Salter@leeds.ac.uk> wrote:
Hi James,
PDF caching is a PITA.
If someone has already downloaded a copy, the browser will already have cached it - so you might not even see a request hitting the server (and therefore your Apache changes are never 'hit').

Can you confirm that you see entries in the access log from people with a cached copy - e.g. the browser is actually making a call to the server?

My normal advice is to append a random query string to the end of the PDF URL:
sort of thing - this will cause a reload - but your user has already said 'no' to this.
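A minimal sketch of the query-string approach, assuming a hypothetical parameter name "v" and a hypothetical repository URL: the function appends a random token to a link so that the browser treats it as a new URL and fetches it again.

```python
import secrets
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url: str, param: str = "v",
                     token: Optional[str] = None) -> str:
    """Return `url` with a cache-busting query parameter appended.

    `param` and the token format are illustrative choices; any value the
    browser has not seen before forces a fresh request.
    """
    parts = urlsplit(url)
    # Keep existing parameters, but drop an older cache-buster if present.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != param]
    query.append((param, token if token is not None else secrets.token_hex(4)))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical URL, with a fixed token so the result is predictable.
print(add_cache_buster("https://repo.example.org/1/1/file.pdf", token="abc"))
```

This only helps for links that are shared after the change; anyone holding the old URL still gets their cached copy.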

If requests are hitting the server, you could use e.g. an EPrints URL rewrite trigger to append a querystring and respond with a redirect.
If requests aren't hitting the server, you'll have to get them to visit a new URL... manually.
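The redirect idea can be sketched roughly as follows. This shows only the decision logic, in Python rather than as an actual EPrints trigger (which would be written in Perl and is not shown here). The parameter name "v", the function name, and the use of the file's modification time as the version are all assumptions made for illustration.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def redirect_target(request_url: str, file_mtime: int,
                    param: str = "v") -> Optional[str]:
    """Return the URL to redirect to, or None if the file can be served as-is.

    A PDF request that lacks the current version parameter is redirected to
    the same URL with `param` set to the file's modification time.
    """
    parts = urlsplit(request_url)
    if not parts.path.lower().endswith(".pdf"):
        return None  # only PDFs are handled in this sketch
    query = parse_qsl(parts.query, keep_blank_values=True)
    version = str(int(file_mtime))
    if (param, version) in query:
        return None  # already the current version, so no redirect loop
    # Replace any stale version parameter with the current one.
    query = [(k, v) for k, v in query if k != param]
    query.append((param, version))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Because the version changes whenever the file is replaced, a re-uploaded PDF ends up at a URL the browser has not cached. The redirect response itself would also need to be marked as non-cacheable, otherwise the browser could reuse the old redirect.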


From: eprints-tech-bounces@ecs.soton.ac.uk <eprints-tech-bounces@ecs.soton.ac.uk> on behalf of James Kerwin via Eprints-tech <eprints-tech@ecs.soton.ac.uk>
Sent: 04 May 2023 14:45
To: eprints-tech@ecs.soton.ac.uk <eprints-tech@ecs.soton.ac.uk>
Subject: [EP-tech] Cache Files - Always look for new
Hi All,

A user uploaded a file to our data repository, opened it, noticed a mistake, deleted the file and reuploaded the new file. When they click the link and the pdf opens in the browser it opens the original file because, I assume, it's been cached.

This happens rarely and when it does we ask the user to refresh the page with the document or do a ctrl+f5 refresh to clear the cache for that page. In this instance the user is insisting that they couldn't possibly ask this of the people they had already shared the link with.

I've made changes to my Apache SSL conf to include:

# Match static files by extension (the dot is escaped so it matches literally)
<FilesMatch "\.(js|css|jpg|jpeg|png|gif|ico|swf|pdf|html)$">
  <IfModule mod_headers.c>
    Header set Cache-Control "max-age=0, no-cache, no-store, must-revalidate"
    Header set Pragma "no-cache"
    Header unset Last-Modified
  </IfModule>
</FilesMatch>

I am not an apache expert by any stretch of the imagination. After restarting apache this has not resolved the issue. Can anybody advise? Maybe there is a specific EPrints THING I need to be aware of? We're on EPrints 3.4.4.


*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/