EPrints Technical Mailing List Archive

See the EPrints wiki for instructions on how to join this mailing list and related information.

Message: #10128


Re: [EP-tech] DDoS of EPrints advanced search


Hi Florian,

Out of interest, what version of MySQL/MariaDB are you running? When we were running CentOS 7, which shipped MariaDB 5.5, we found an issue with particularly complex SQL queries (certain advanced searches) that took a long time to run: a query trying to drop the 'cache' table would get queued while the original query that created the cache table was still running. Since we moved to Rocky Linux 9, which runs MariaDB 10.5, we have not had any significant problems like this. However, until this unhelpful bot behaviour started, it would have been very unusual for 100+ searches to be made in only a few minutes. More normally it might take tens of minutes, if not hours, for that number of searches.

In an ideal world we would like to remove 'cache' tables for search in future versions of EPrints, as modern MySQL/MariaDB can do this quite well natively, if suitably configured. However, the way 'cache' tables are created is quite ingrained in EPrints. If they were removed, we would need to ensure that what MySQL/MariaDB gets handed as a query sufficiently matches something it has cached natively.

Regards

David Newman

On 02/06/2025 09:45, Florian Heß wrote:
CAUTION: This e-mail originated outside the University of Southampton.

Hi John,

In addition to that, we regularly see what appear to be race
conditions between selecting from and dropping cache tables. These can
lock database access, which has actually happened quite often. After
killing the mysql process that has been running for a long time (found
via `mysql> show full processlist;`), all waiting requests are processed.
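
As a rough illustration of the manual step above, the following sketch filters rows from `show full processlist` for queries that have been running longer than a threshold, so that an operator can decide which one to kill. It assumes the rows have already been fetched as dictionaries keyed by the standard processlist column names (Id, Command, Time, Info); the function names and the 300-second default are hypothetical and not part of EPrints.

```python
from typing import Iterable, List, Mapping


def long_running_queries(rows: Iterable[Mapping], min_seconds: int = 300) -> List[Mapping]:
    """Return processlist rows for active queries running at least min_seconds.

    Rows are expected to use the column names printed by
    SHOW FULL PROCESSLIST (Id, Command, Time, Info, ...).
    The result is sorted with the longest-running query first.
    """
    slow = []
    for row in rows:
        # Idle connections show Command = 'Sleep'; only active queries matter here.
        if row.get("Command") != "Query":
            continue
        if (row.get("Time") or 0) < min_seconds:
            continue
        slow.append(row)
    return sorted(slow, key=lambda r: r["Time"], reverse=True)


def kill_statement(row: Mapping) -> str:
    """Build the KILL statement for one processlist row (for manual review)."""
    return f"KILL {int(row['Id'])};"
```

The sketch only builds the statement text and does not execute anything, so the decision to kill a query stays with whoever is looking at the process list.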


Kind regards
Florian

On 30/05/2025 15:39, John Salter wrote:
I added a script to my server to log the number of search cache tables, and the min/max IDs of them for each hour. I plan to use this to redirect requests with 'old' cache ids in the query-string to a static page, which will describe (to a human) how to re-run their search, but not provide a clickable link to do so. If others are also seeing this pattern, I can share my stuff once it's ready.
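
A minimal sketch of the kind of bookkeeping described above, not John's actual script. It assumes that search cache tables are named `cache` followed by a numeric ID and that the ID appears in the query-string as a `cache` parameter; both are assumptions here, as are all function names. The list of table names is taken as input (for example from `SHOW TABLES`), so the sketch itself does not touch the database.

```python
import re
from typing import Iterable, Optional, Tuple
from urllib.parse import parse_qs, urlparse

# Assumed naming scheme: 'cache' followed by a numeric ID, e.g. cache12345.
CACHE_TABLE_RE = re.compile(r"^cache([0-9]+)$")


def summarise_cache_tables(table_names: Iterable[str]) -> Tuple[int, Optional[int], Optional[int]]:
    """Return (count, min_id, max_id) for the cache tables in table_names."""
    ids = []
    for name in table_names:
        match = CACHE_TABLE_RE.match(name)
        if match:
            ids.append(int(match.group(1)))
    if not ids:
        return (0, None, None)
    return (len(ids), min(ids), max(ids))


def cache_id_from_url(url: str) -> Optional[int]:
    """Extract the numeric 'cache' parameter from a URL, if there is one."""
    values = parse_qs(urlparse(url).query).get("cache")
    if not values or not re.fullmatch(r"[0-9]+", values[0]):
        return None
    return int(values[0])


def is_stale(url: str, min_live_id: int) -> bool:
    """True if the URL refers to a cache ID older than the oldest live one."""
    cache_id = cache_id_from_url(url)
    return cache_id is not None and cache_id < min_live_id
```

A request for which `is_stale` returns true would be the candidate for redirection to the static page; requests without a `cache` parameter are left alone.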


*** Options: https://wiki.eprints.org/w/Eprints-tech_Mailing_List
*** Archive: https://www.eprints.org/tech.php/
*** EPrints community wiki: https://wiki.eprints.org/