
[EP-tech] OAI2 Harvesting Problem



Hi James,

Subjects (and to a lesser extent divisions) have always been an integral 
part of EPrints. Generally, removing them from workflows, citations and 
config files like cfg/cfg.d/eprints_render.pl, 
cfg/cfg.d/eprint_search_advanced.pl, cfg/cfg.d/views.pl, etc. is 
sufficient to hide them without breaking EPrints. It is when you start 
removing them from cfg/cfg.d/eprint_fields.pl that you are likely to hit 
problems like those you mention, as certain aspects of EPrints expect 
the subjects or divisions fields to at least be defined even if they are 
not used.

I think in this particular (/cgi/oai2) situation, you probably needed to 
make sure you disabled the subjects OAI set in cfg/cfg.d/oai.pl. 
Unfortunately, there is a rather complex list of config changes you need 
to make if you want to undefine (i.e. comment out / remove) the subjects 
field in eprint_fields.pl without breaking anything else. If I get the 
opportunity, I will see if there is a suitable place on the wiki to 
document that list.
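
To check whether /cgi/oai2 is healthy again after a config change, something like the following Python sketch may help. It is not part of EPrints, and the function names (oai_url, set_specs, check_verb) are made up for illustration. It requests the OAI-PMH verbs that take no extra arguments, reports the HTTP status of each, and lists the setSpec values returned by ListSets, so a 500 on ListSets alone would point at the sets config rather than at the firewall or certificate.

```python
import urllib.error
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# XML namespace used by OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def oai_url(base_url, verb):
    """Build a request URL for an OAI-PMH verb that needs no other arguments."""
    return base_url + "?" + urllib.parse.urlencode({"verb": verb})


def set_specs(xml_body):
    """Return the setSpec values found in a ListSets response body."""
    root = ET.fromstring(xml_body)
    return [el.text for el in root.iter(OAI_NS + "setSpec")]


def check_verb(base_url, verb, timeout=30):
    """Return (HTTP status, body bytes); HTTP errors are returned, not raised."""
    try:
        with urllib.request.urlopen(oai_url(base_url, verb), timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, err.read()


if __name__ == "__main__":
    # Base URL taken from the thread; swap in your own repository.
    base = "https://livrepository.liverpool.ac.uk/cgi/oai2"
    for verb in ("Identify", "ListMetadataFormats", "ListSets"):
        status, body = check_verb(base, verb)
        print(verb, status)
        if verb == "ListSets" and status == 200:
            print("  sets:", set_specs(body))
```

Running it before and after editing cfg/cfg.d/oai.pl should show whether ListSets is the only verb returning an error.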

Regards

David Newman

On 08/04/2021 17:05, James Kerwin via Eprints-tech wrote:
> Hi All,
>
> Update on this after some mooching around. Our "ListSets" option in 
> OAI2 no longer works and sends me to an error page:
>
> https://livrepository.liverpool.ac.uk/cgi/oai2?verb=ListSets
>
> When I redid the Divisions/Uni Structure recently in EPrints the 
> Subjects were removed as we haven't used them in years. This caused a 
> couple of issues with the advanced search and some abstract pages. 
> Looking at the ListSets page for another repository, the Subjects are 
> part of this.
>
> I'm going to attempt to correct it by either altering the ListSets 
> (probably not) or by resurrecting the Subjects. I'll update with any 
> success/failure.
>
> Hopefully I can prevent other would-be unintentional wreckers 
> following in my footsteps.
>
> Thanks,
> James
>
> On Thu, Apr 8, 2021 at 3:58 PM James Kerwin <jkerwin2101 at gmail.com> wrote:
>
>     Hi All,
>
>     Hope everyone is happy and healthy.
>
>     Our repository is harvested by a company named EBSCO. Recently
>     they have started receiving the following warning and failing to
>     harvest:
>
>     "Harvest has been aborted by an error "Could not harvest from
>     https://livrepository.liverpool.ac.uk/cgi/oai2:
>     The remote server returned an error: (500) Internal Server Error"
>
>     They use this base URL:
>
>     https://livrepository.liverpool.ac.uk/cgi/oai2
>
>     This whole area is slightly off my radar, so I was wondering whether
>     there are any common things I could check. Obviously the repository is
>     up and running. I've asked for the dates of the most recent
>     successful harvest and the first failure as well as if it is still
>     happening. I also need to speak with our Computing Services
>     Department to check if the IPs can get through the firewall.
>
>     Is there anything else I can/should check based on all of your
>     collective experience?
>
>     We did recently get a new security certificate, but I can't
>     imagine that is a problem as we do this each year without any issues.
>
>     Thanks,
>     James
>
>
> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
> *** Archive: http://www.eprints.org/tech.php/
> *** EPrints community wiki: http://wiki.eprints.org/

