
[EP-tech] OAI Harvesting

Hi All,

We're setting up RT2 (Elements) at the moment and working through some bugs. This is not a specific EPrints problem, but I'm hoping the collective wisdom of those here can provide some clarity...

In our OAI ListSets output it has become apparent that we have duplicate sets. We appear to have a peculiar setup in which our OAI configuration contains:

$oai->{sets} = [
    { id => "person", allow_null => 0, fields => "contributors_id/editors_id/department" },
    # ... (rest of the set definitions not shown)

This puts department in the person set. We don't even use department in our current EPrints records (we have Divisions which I've spoken about a LOT previously). What I'm curious about is:

1) How do duplicate sets come about? I thought the idea of a set was that items with the same value would all end up in the same set.

2) Is there any easy way to identify the duplicate sets? Somebody from Symplectic that I'm working with was kind enough to point them out on our live repository, and sure enough, if I Ctrl+F for "Molecular and Clinical Pharmacology" on https://livrepository.liverpool.ac.uk/cgi/oai2?verb=ListSets it appears twice.
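
For what it's worth, one possible way to check this without searching by eye is a small script that reads the ListSets response and groups the setSpec values by setName. This is only a minimal sketch, not anything EPrints provides: the function names are made up for illustration, it assumes the endpoint returns standard OAI-PMH 2.0 XML, and it treats two sets as "duplicates" when their setName matches after trimming surrounding whitespace.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def parse_sets(xml_text):
    """Return ([(setSpec, setName), ...], resumption token or None) for one page."""
    root = ET.fromstring(xml_text)
    pairs = [
        (
            s.findtext("oai:setSpec", default="", namespaces=OAI_NS).strip(),
            s.findtext("oai:setName", default="", namespaces=OAI_NS).strip(),
        )
        for s in root.iterfind(".//oai:set", OAI_NS)
    ]
    token = root.findtext(
        ".//oai:resumptionToken", default="", namespaces=OAI_NS
    ).strip()
    return pairs, token or None


def find_duplicates(pairs):
    """Map each setName that occurs more than once to the setSpecs carrying it."""
    specs_by_name = defaultdict(list)
    for spec, name in pairs:
        specs_by_name[name].append(spec)
    return {name: specs for name, specs in specs_by_name.items() if len(specs) > 1}


def fetch_all_sets(base_url):
    """Collect every (setSpec, setName) pair, following resumption tokens."""
    pairs, params = [], {"verb": "ListSets"}
    while True:
        url = base_url + "?" + urllib.parse.urlencode(params)
        with urllib.request.urlopen(url) as resp:
            page, token = parse_sets(resp.read())
        pairs.extend(page)
        if token is None:
            return pairs
        # A resumptionToken may only be combined with the verb argument.
        params = {"verb": "ListSets", "resumptionToken": token}


# Example use (performs network requests, so it is left commented out):
# dupes = find_duplicates(fetch_all_sets("https://livrepository.liverpool.ac.uk/cgi/oai2"))
# for name, specs in sorted(dupes.items()):
#     print(name, specs)
```

If the same name turns up under two different setSpec values, comparing those values may give a hint as to which configured field each copy came from.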

I've tried to learn about OAI, but it does unfortunately make my brain scream because I just do not understand it properly.
