EPrints Technical Mailing List Archive

Message: #09214

Re: [EP-tech] Search filters: negation or searching for "either X or NULL"

--- Begin Message ---
Hi John,

thank you very much for your reply!

Too bad the filters don't allow this. Your suggestions are helpful and interesting, but they seem a bit too hacky for me. We've decided instead to replace all past and future NULLs with FALSE and keep the OAI part simple.

Thanks again and best regards!

On 17.02.23 at 19:01, John Salter wrote:
Hi Dennis,
Simple-ish sounding question... complex answer (sorry!).
**There may be a 'not equal to' operator that I've overlooked. If so, hopefully someone will jump in!**

I think the OAI-PMH code prevents you from doing this in a simple way: the entries in the 'filters' configuration are always joined with an 'AND', so an "either X or NULL" test can't be expressed there.
I have previously added a volatile/automatic field to our repository to pre-calculate useful values to overcome this sort of thing, and allow use of the normal OAI 'filters' approach.
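
A minimal sketch of that pre-calculated field idea, under stated assumptions: the derived field name my_field_not_true is made up for illustration, the file would live somewhere under the repository's cfg.d directory, and my_field is assumed to hold 'TRUE', 'FALSE' or nothing. It uses a before-commit dataset trigger, which assumes EPrints 3.3 or later; it is not taken from any real configuration.

```perl
# Hypothetical cfg.d fragment; the field name my_field_not_true is illustrative only.

# A derived boolean field. 'volatile' marks it as a field whose changes
# should not count as a real modification of the record.
$c->add_dataset_field( "eprint",
    { name => "my_field_not_true", type => "boolean", volatile => 1 },
);

# Recalculate the derived value each time an eprint is committed.
$c->add_dataset_trigger( "eprint", EPrints::Const::EP_TRIGGER_BEFORE_COMMIT, sub {
    my( %args ) = @_;
    my $eprint = $args{dataobj};

    # NULL (undef) and 'FALSE' both count as "not TRUE".
    my $value = $eprint->get_value( "my_field" );
    my $not_true = ( defined $value && $value eq "TRUE" ) ? "FALSE" : "TRUE";
    $eprint->set_value( "my_field_not_true", $not_true );

    return;
} );
```

With a field like this in place, the OAI set could use an ordinary filter such as { meta_fields => [ "my_field_not_true" ], value => "TRUE" }. The database column would have to be created and existing records recommitted before the value is populated.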

You can create a set with just the undef values like this:
     {
         spec => "undef-test",
         name => "Undef Test",
         filters => [
             # This results in a query like 'my_field IS NULL'.
             { meta_fields => [ "my_field" ], value => undef, match => 'EX' },
         ],
     },

If you're happy hacking about in cgi/oai2, this searches for an un-set field:
     fields => [
       $eprint_ds->field( 'my_field' )
     ],
     value => undef,
     match => "EX",

I think it is possible with some hacking about in cgi/oai2 to change the join method to an OR - but might not be a good idea (see: https://www.eprints.org/eptech/msg07402.html )

There is another possible route (it's even more horribly hacky, but interesting at the same time).
It might not be the most sensible for OAI-PMH for performance reasons.

EPrints doesn't seem to ship with any 'NOT' type search conditions.
It does however ship with a 'regexp' search condition (EPrints::Search::Condition::Regexp), which isn't well documented and has no easy way to use it. Note: it uses MySQL's REGEXP, which might be POSIX rather than PCRE (recent MariaDB might be PCRE..?).

It does seem to work :)

You can do something like this to get an EPrints::List:
my $ds = $session->dataset( "eprint" );
my @conds = ();
# my_field is null... (a null value wouldn't match the regex clause below)
push @conds, EPrints::Search::Condition->new( 'is_null', $ds, $ds->get_field( "my_field" ) );
# my_field does not start with a 'T'
push @conds, EPrints::Search::Condition->new( 'regexp', $ds, $ds->get_field( "my_field" ), '^[^T]' );

# OR the two conditions above together. Then AND the result with whether the datestamp is set.
my $cond = EPrints::Search::Condition->new(
	"AND",
	EPrints::Search::Condition->new( "OR", @conds ),
	EPrints::Search::Condition->new( 'is_not_null', $ds, $ds->get_field( "datestamp" ) ),
);

# Run the condition tree to get the matching eprint ids.
my $ids = $cond->process(
         session => $session,
         dataset => $ds,
);

# Wrap the ids in a list object.
my $list = EPrints::List->new(
         session => $session,
         dataset => $ds,
         ids => $ids,
);

I feel really bad even suggesting the above - as it's a bit 'deep'.
I used the regex search condition for finding specific history items (action='note', details =~ /^Embargo alert/ sort of thing).

As this question has come up a few times, it might be worth a group EPrints chat about support for:
- complex search config in OAI-PMH (will need consideration about performance)
- inclusion of 'NOT' search operators (might need to include IFNULL logic)
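
To illustrate why a 'NOT' operator would probably need IFNULL logic: in SQL, comparing NULL with anything yields NULL, so a plain "my_field <> 'TRUE'" silently drops the NULL rows. The following is a self-contained sketch in plain Perl (no EPrints or database involved). The helper names sql_not_equal and ifnull and the toy records are made up for illustration, with undef standing in for NULL.

```perl
use strict;
use warnings;

# Toy records standing in for eprints; undef plays the role of SQL NULL.
my @records = (
    { id => 1, my_field => 'TRUE'  },
    { id => 2, my_field => 'FALSE' },
    { id => 3, my_field => undef   },
);

# Mimics SQL "value <> literal": a NULL operand gives NULL (undef),
# which a WHERE clause treats as "no match".
sub sql_not_equal {
    my( $value, $literal ) = @_;
    return undef unless defined $value;
    return $value ne $literal ? 1 : 0;
}

# Mimics SQL IFNULL(value, default).
sub ifnull {
    my( $value, $default ) = @_;
    return defined $value ? $value : $default;
}

# Plain "my_field <> 'TRUE'": the NULL record is lost.
my @naive = map { $_->{id} }
    grep { sql_not_equal( $_->{my_field}, 'TRUE' ) } @records;

# "IFNULL(my_field, 'FALSE') <> 'TRUE'": the NULL record is kept.
my @with_ifnull = map { $_->{id} }
    grep { sql_not_equal( ifnull( $_->{my_field}, 'FALSE' ), 'TRUE' ) } @records;

print "naive: @naive\n";              # naive: 2
print "with IFNULL: @with_ifnull\n";  # with IFNULL: 2 3
```

The second form is the "not TRUE" behaviour asked for in the original question: records that are FALSE or unset, but not those that are TRUE.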


-----Original Message-----
From: Dennis Müller [mailto:dennis.mueller@uni-mannheim.de]
Sent: 17 February 2023 13:16
To: eprints-tech@ecs.soton.ac.uk
Subject: Search filters: negation or searching for "either X or NULL"

Hi everyone,

I'm having a hard time creating the correct filter for a custom OAI set
and I was hoping for some help here.

The set is based on a field which can be either TRUE, FALSE or NULL. It
should contain only records where the value is not TRUE (alternatively
where it's either FALSE or NULL).

I have not found out how to negate/invert a filter, so I could search
for "not TRUE". Neither have I managed to filter for "has value of FALSE
or has no value at all".

The snippet below excludes all NULLs, presumably because it treats
"NULL" as a string here.

    meta_fields => [ "my_field" ],
    value => "FALSE NULL",
    match => "IN",
    merge => "ANY"

AFAIK, multiple filters are always combined with logical AND. Is there a
way to change this to "AND NOT" or "OR"?

I'm sure this is possible somehow and I'll feel quite stupid once
someone points it out. :)

Many thanks in advance and best regards

Dennis Müller, B.A.

Universität Mannheim
Digitale Bibliotheksdienste | Schloss Schneckenhof West | 68131 Mannheim

Tel: +49 621 181-3023
- dennis.mueller@uni-mannheim.de (personal)
- alma.ub@uni-mannheim.de (Alma, Primo, library systems matters)
- support.ub@uni-mannheim.de (PC-Support)

Web: www.bib.uni-mannheim.de

--- End Message ---