EPrints Technical Mailing List Archive


Message: #10181



[EP-tech] Preventing old cache searches automatically re-searching



Hi,
One aspect of the current DDoS traffic is requests for search results pages long after their cached results have been removed from the system.

 

I have documented this here:

https://github.com/eprints/eprints3.4/issues/479

and a possible (not ready for production yet!) fix here:

https://github.com/jesusbagpuss/eprints3.4/tree/iss-479

 

If a search URL has a cache parameter, that cache no longer exists, and the repository config sets this option (see ~/lib/cfg.d/misc.pl):
$c->{cache_not_found_no_search} = 1;

then, rather than the search being automatically re-run, a ‘search cache not found’ page is presented with a URL that a user can paste into their browser.
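
The decision described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual EPrints (Perl) change on the iss-479 branch: `choose_action`, the `caches` set and the returned action names are hypothetical, and only the `cache_not_found_no_search` option name comes from the config line above.

```python
def choose_action(params, caches, config):
    """Decide how to handle a search request (illustrative only).

    params: dict of URL query parameters
    caches: set of cache ids that still exist (stand-in for the real lookup)
    config: dict standing in for the repository config
    """
    cache_id = params.get("cache")
    if cache_id is None:
        # No cache parameter: an ordinary new search.
        return "run_search"
    if cache_id in caches:
        # The cached results still exist, so serve them.
        return "show_cached_results"
    if config.get("cache_not_found_no_search"):
        # Cache is gone and the option is set: do not re-run the search.
        return "show_cache_not_found_page"
    # Option not set: the search is re-run automatically.
    return "run_search"


config = {"cache_not_found_no_search": 1}
print(choose_action({"cache": "999999999"}, {"12345"}, config))
```

With the option unset, the same request falls through to the final branch and the search is re-run, which is the behaviour the DDoS traffic currently triggers.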

 

Including a clickable link might just lead the DDoS bots to follow that link in future, so the page uses some JavaScript to construct the URL instead.

 

A normal search results page can include many links for both paginated results and different export formats (view page source to see these). My observations show that these are all followed by the DDoS bot swarms.

 

If you are in a position to try the changes in https://github.com/jesusbagpuss/eprints3.4/tree/iss-479 on a test repository, I’d be most grateful.

To test:

  1. Merge the code.
  2. Make sure the following is set:

     $c->{cache_not_found_no_search} = 1;

  3. Test/restart Apache.
  4. Run a search that returns multiple pages.
  5. Navigate to the second page.
  6. In the URL, change the ‘cache’ parameter to a different number, e.g. cache=12345 to cache=999999999.
  7. You should get a page with a couple of URLs that can be copied and pasted to re-run the search.
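
If it helps when testing, the URL edit (changing the ‘cache’ parameter) can be done with a small helper like the one below. This is only a sketch in Python: `with_cache_id` is a hypothetical helper, the example host, path and `search_offset` parameter are made up for illustration, and it assumes the query parameters are separated by `&`.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_cache_id(url, new_cache_id):
    """Return url with the value of its 'cache' query parameter replaced."""
    parts = urlsplit(url)
    # Keep every parameter in order; only swap the value of 'cache'.
    query = [
        (key, new_cache_id if key == "cache" else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical second-page search URL; substitute the one from your browser.
example = "https://repository.example.org/cgi/search/simple?cache=12345&search_offset=20"
print(with_cache_id(example, "999999999"))
```

Pasting the printed URL into a browser should then produce the ‘search cache not found’ page if the option is set.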

 

Please let me know if you find any issues with this approach.

 

Cheers,

John

 

John Salter

https://orcid.org/0000-0002-8611-8266

 

White Rose Libraries Technical Officer
Library and Research Management team, IT
University of Leeds