EPrints Technical Mailing List Archive


Message: #10168



Re: [EP-tech] DDoS on simple and advanced search


I was wondering if there isn't a better "apache" way of doing this, if we know that specific URLs are being targeted.  

I got the following advice from an AI assistant and am wondering whether it is a worthwhile and functional solution to the issue. We would have to replace the URL path with the correct ones for advanced and simple search. It appears to target this specific path and reject requests once the same IP makes too many of them per minute.


This is the most direct way to rate-limit by URL path and IP.

Install mod_security
sudo apt install libapache2-mod-security2

Enable it
sudo a2enmod security2
sudo systemctl reload apache2

Open your ModSecurity config file, usually located at:
Add the following rules:

# Track requests to /advanced_search/
SecAction "id:900001,phase:1,nolog,pass,t:none,setvar:ip.advsearch_counter=0"
SecRule REQUEST_URI "@beginsWith /advanced_search/" \
    "id:900002,phase:1,nolog,pass,t:none,setvar:ip.advsearch_counter=+1,expirevar:ip.advsearch_counter=60"

# Block if too many requests per minute
SecRule IP:advsearch_counter "@gt 10" \
    "id:900003,phase:1,deny,status:429,msg:'Too many requests to /advanced_search/ from your IP'"
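To make the intended behaviour of these rules concrete, here is a minimal Python sketch of the same idea: a per-IP counter for one URL prefix that expires 60 seconds after the last counted request and blocks once it exceeds 10. The class name PerIpCounter and its parameters are made up for illustration, and it assumes the expiry is refreshed on every counted request. As far as I can tell, the ModSecurity version would also need the IP collection to be initialised (e.g. initcol:ip=%{REMOTE_ADDR}) and must not reset the counter to 0 on every request, otherwise the count never accumulates.

```python
from dataclasses import dataclass, field


@dataclass
class PerIpCounter:
    """Hypothetical model of a per-IP request counter with an expiry time."""

    prefix: str = "/advanced_search/"  # assumed path, as in the rules above
    limit: int = 10                    # block when the counter exceeds this
    ttl: float = 60.0                  # seconds until the counter expires
    _state: dict = field(default_factory=dict)  # ip -> (count, expiry time)

    def allow(self, ip: str, path: str, now: float) -> bool:
        # Requests outside the protected prefix are never counted.
        if not path.startswith(self.prefix):
            return True
        count, expires = self._state.get(ip, (0, 0.0))
        if now >= expires:
            count = 0  # the counter has expired, start again
        count += 1
        # Assumption: the expiry is pushed back on every counted request.
        self._state[ip] = (count, now + self.ttl)
        return count <= self.limit


limiter = PerIpCounter()
# Eleven requests from one address, one second apart.
results = [limiter.allow("203.0.113.5", "/advanced_search/", float(t)) for t in range(11)]
print(results)
```

With these defaults the first ten requests from an address are allowed and the eleventh is rejected, while other addresses and other paths are unaffected.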

Any comments on that solution?  It seems elegant, if it works?

Tomasz

________________________________________________

Tomasz Neugebauer
Senior Librarian | Bibliothécaire titulaire
Digital Projects & Systems Development Librarian / Bibliothécaire des Projets Numériques & Développement de Systèmes
Concordia University / Université Concordia

Tel. / Tél. 514-848-2424 ext. / poste 7738
Email / courriel:
tomasz.neugebauer@concordia.ca

Mailing address / adresse postale: 1455 De Maisonneuve Blvd. W., LB-540-03, Montreal, Quebec H3G 1M8
Street address / adresse municipale: 1400 De Maisonneuve Blvd. W., LB-540-03, Montreal, Quebec H3G 1M8

library.concordia.ca



From: eprints-tech-request@ecs.soton.ac.uk <eprints-tech-request@ecs.soton.ac.uk> on behalf of David R Newman <drn@ecs.soton.ac.uk>
Sent: June 27, 2025 5:56 AM
To: eprints-tech@ecs.soton.ac.uk <eprints-tech@ecs.soton.ac.uk>; Martin Brändle <martin.braendle@uzh.ch>
Cc: Jens Witzel <jens.witzel@uzh.ch>
Subject: Re: [EP-tech] DDoS on simple search
 





Hi Martin,

I have to admit, I was surprised that simple search had been somewhat untouched during the DDoS that has been occurring on some repositories' advanced search.  I assume like the DDoS on advanced search, the range of IPs is so wide that it is impossible to block without blocking large chunks of the Internet.

The interesting thing here is that, unless you have a bespoke configuration for your simple search, the search expression (i.e. the exp parameter of the GET request) looks like what I would expect to see for an advanced search. Below is more like what I would expect to see for a simple search:

/cgi/search/simple?screen=Search&_action_search=1&exp=0%7C1%7C-date%2Fcreators_name%2Ftitle%7Carchive%7C-%7Cq%3A%3AALL%3AIN%3Afrog%7C-%7C&order=-date%2Fcreators_name%2Ftitle&search_offset=20

The only search field in simple search is usually called 'q'. The above is a search for the word frog, as this appears regularly in the test data. It looks like they are taking the same expression that they might use on advanced search and sending it to simple search. Therefore, it may be sufficient to modify the LocationMatch from "^/cgi/search/archive/advanced" to just "^/cgi/search/"; the if statements on the QUERY_STRING inside would then apply to search expressions for any search. Obviously, even though the search expression used on simple search looks like that for advanced search, it may not exactly match the previous DDoS search requests that you originally blocked.
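To compare such requests more easily, the exp parameter can be URL-decoded and split on its "|" separator. This is only a minimal Python sketch: the function name split_search_expression is made up for illustration, and it makes no claim about how EPrints itself interprets each field.

```python
from urllib.parse import parse_qs, urlsplit


def split_search_expression(url: str) -> list:
    """Return the '|'-separated parts of the exp parameter in a search URL."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    # parse_qs already percent-decodes the value (%7C becomes '|', %2F '/').
    return query["exp"][0].split("|")


simple_url = (
    "/cgi/search/simple?screen=Search&_action_search=1"
    "&exp=0%7C1%7C-date%2Fcreators_name%2Ftitle%7Carchive%7C-%7C"
    "q%3A%3AALL%3AIN%3Afrog%7C-%7C"
    "&order=-date%2Fcreators_name%2Ftitle&search_offset=20"
)

parts = split_search_expression(simple_url)
for index, part in enumerate(parts):
    print(index, repr(part))
```

For the simple search URL above, the sixth part is q::ALL:IN:frog, whereas in the request Martin quotes the corresponding part lists many field names after q:.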

Regards

David Newman

On 27/06/2025 08:37, Martin Brändle wrote:

Dear all,

 

After the attack on advanced search, we now see a similar attack on simple search. It stayed under our radar for a while, hidden by the attacks from alibaba-inc.com.

 

grep -c /cgi/search/simple /var/log/httpd/access_log_zora-20250*

/var/log/httpd/access_log_zora-20250403:272
/var/log/httpd/access_log_zora-20250410:225
/var/log/httpd/access_log_zora-20250417:848
/var/log/httpd/access_log_zora-20250424:474
/var/log/httpd/access_log_zora-20250501:531
/var/log/httpd/access_log_zora-20250508:1249
/var/log/httpd/access_log_zora-20250515:1660
/var/log/httpd/access_log_zora-20250522:2277
/var/log/httpd/access_log_zora-20250529:1565
/var/log/httpd/access_log_zora-20250605:6203
/var/log/httpd/access_log_zora-20250612:50389
/var/log/httpd/access_log_zora-20250619:44590
/var/log/httpd/access_log_zora-20250626:73182

 

(The number of accesses to /cgi/search/simple is usually very low in our case: we do not expose a link to /cgi/search/simple in the user interface, because we use Elasticsearch as our main search engine.)

 

The queries have a format similar to those used against advanced search: some terms are filled into the search field and the result pages are cycled through, e.g.

 

/cgi/search/simple?_action_search=1&cache=22035948&exp=0%7C1%7C-date%2Fcreators_name%2Feditors_name%2Ftitle%7Carchive%7C-%7Cq%3Aabstract%2Fbook_title%2Fcreators_name%2Fcreators_orcid%2Fdate%2Fdocuments%2Fdoi%2Feditors_name%2Feditors_orcid%2Fisbn%2Fkeywords%2Fpublication%2Fpubmedid%2Ftitle%3AALL%3AIN%3Aulrich+mehnert%7C-%7Ceprint_status%3Aeprint_status%3AANY%3AEQ%3Aarchive%7Cmetadata_visibility%3Ametadata_visibility%3AANY%3AEQ%3Ashow&order=-date%2Fcreators_name%2Feditors_name%2Ftitle&screen=Search&search_offset=60

 

Measures taken: Similar to https://www.eprints.org/eptech/msg10122.html

 

Kind regards

 

Martin

 

--

Dr. Martin Brändle
Zentrale Informatik
Universität Zürich
Pfingstweidstrasse 60B
CH-8005 Zürich

mail:
martin.braendle@uzh.ch
phone: +41 44 63 56705
https://orcid.org/0000-0002-7752-6567
https://www.zi.uzh.ch

 

Interested in news about Open Science at UZH?

Follow UZH Open Science on LinkedIn