
[EP-tech] OAI harvesting / records moving from live to buffer (or other non-deletion datasets).

I'm trying to work out an elegant solution to this issue:
1. An EPrint is made live.
2. The EPrint is harvested over OAI-PMH.
3. The EPrint is moved to a non-live dataset (e.g. buffer - or, in this specific case, dark-archive).
4. The EPrint is no longer publicly available, but an OAI-PMH harvest will not see the record as deleted, and so the harvester will not remove it.

I've checked with OAI-PMH gurus, and they think that just flagging the record as deleted will be fine - if the record subsequently reappears, it should simply get re-harvested.
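
To make the harvester's side of this concrete, here is a rough sketch in Python (not EPrints code) of how a harvester might apply incoming record headers to its local copy. The function name `apply_header`, the dict-based store and the example identifier are all made up for illustration; the only thing taken from OAI-PMH itself is that a removed record is signalled with a header status of "deleted".

```python
def apply_header(store, identifier, status, metadata=None):
    """Apply one harvested record to a local store (a plain dict).

    status is "deleted" when the OAI-PMH header carries status="deleted",
    otherwise None. Everything here is a hypothetical harvester, not EPrints.
    """
    if status == "deleted":
        # Drop the local copy; ignore records we never had.
        store.pop(identifier, None)
    else:
        # New or changed record: add or overwrite the local copy.
        store[identifier] = metadata
    return store


store = {}
apply_header(store, "oai:example:123", None, {"title": "First version"})
apply_header(store, "oai:example:123", "deleted")
after_delete = dict(store)
apply_header(store, "oai:example:123", None, {"title": "Back again"})
```

Note that an incremental harvest will only pick up the returning record if its OAI datestamp changes when it goes live again.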

I think the solution is to add a filter to the OAI-PMH searches that looks for EPrints which have a datestamp (set when the item was first made live) but are not in the 'archive' dataset, and to report those as deleted.
To achieve this, the methods in EPrints::OpenArchives that currently check for 'deletion' status will need to be tweaked, and a filter for 'has datestamp' will need to be added either to cgi/oai2 or to $c->{oai}->{filters}.
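
The rule I have in mind could be sketched roughly like this. It is Python rather than EPrints Perl, purely to show the logic; `Record`, `oai_status` and the field names are hypothetical and are not part of the EPrints API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    eprintid: int
    dataset: str               # e.g. "archive", "buffer", "deletion"
    datestamp: Optional[str]   # set when first made live; None if never live


def oai_status(rec: Record) -> Optional[str]:
    """Decide how a record should appear in OAI-PMH responses.

    Returns "live", "deleted", or None (do not expose the record at all).
    """
    if rec.dataset == "archive":
        return "live"
    if rec.datestamp is not None:
        # Was public at some point, so harvesters may hold a copy:
        # report it as deleted so they can remove it.
        return "deleted"
    # Never made live, so no harvester can have seen it.
    return None


examples = [
    Record(1, "archive", "2012-01-01"),
    Record(2, "buffer", "2012-01-01"),
    Record(3, "buffer", None),
]
statuses = [oai_status(r) for r in examples]
```

Items that have never been live (no datestamp) stay invisible, so only records a harvester could already hold are ever announced as deleted.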

Has anyone else come across this issue and found an elegant solution - or can anyone see any problems with this proposal?