
[EP-tech] OAI2 Harvesting Problem



Hi James,

No problem. It sounds like the new error reported by your third-party 
application is most likely a dropped connection. It is worth looking 
at the webserver error logs to see whether they give any more 
information about why the connection was unexpectedly dropped, if 
only to set your mind at rest.
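
For anyone wanting a quick check after a config change like the one discussed in this thread, here is a minimal sketch that requests the argument-free OAI-PMH verbs and reports the HTTP status of each, so a 500 on ListSets shows up straight away. The base URL is the one quoted in the thread; the helper names (build_url, probe) and the 30-second timeout are my own assumptions, and it only checks the response status, so it would not catch a connection dropped part-way through a long harvest.

```python
from urllib.error import HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL quoted in the thread; substitute your own repository.
BASE_URL = "https://livrepository.liverpool.ac.uk/cgi/oai2"

# Standard OAI-PMH verbs that require no further arguments.
VERBS = ["Identify", "ListMetadataFormats", "ListSets"]


def build_url(base_url, verb, **params):
    """Return the OAI-PMH request URL for a verb plus optional arguments."""
    query = urlencode({"verb": verb, **params})
    return f"{base_url}?{query}"


def probe(base_url, verb, timeout=30):
    """Request one verb and return a short status line (hypothetical helper)."""
    url = build_url(base_url, verb)
    try:
        with urlopen(url, timeout=timeout) as response:
            return f"{verb}: HTTP {response.status}"
    except HTTPError as err:
        # The server answered with an error status, e.g. 500.
        return f"{verb}: HTTP {err.code}"
    except (URLError, OSError) as err:
        # Refused, reset or timed-out connections end up here.
        return f"{verb}: connection problem ({err})"
```

Calling probe(BASE_URL, verb) for each entry in VERBS and printing the result gives one status line per verb. A verb that returns 500 while the others return 200 points at the config behind that verb rather than at the network or firewall.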

Regards

David Newman

On 23/04/2021 11:18, James Kerwin wrote:
> *CAUTION:* This e-mail originated outside the University of Southampton.
> Hi David,
>
> Apologies for the delayed response and thank you for the advice. I 
> THINK what I've done is introduced trouble by not including the 
> subjects in the EPrints database when I uploaded the new Divisions 
> structure. The reason the search page went wonky is because it 
> displays the subjects in a list. My assumption is that when it 
> couldn't find them in the database it threw an error. Or I'm totally 
> wrong. I did leave the Subjects in the eprint_fields.pl 
> file, ironically to avoid causing too much trouble.
>
> I managed to alter the OAI config to disregard subjects and that 
> brought it back to life. Then on the next harvest it failed due to the 
> warnings pasted below. It did then work on a subsequent attempt, which 
> is good news!
>
> "Error saving an xml document: Unable to read data from the transport 
> connection: An existing connection was forcibly closed by the remote host.
> System.IO.IOException: Unable to read data from the transport 
> connection: An existing connection was forcibly closed by the remote 
> host. ---> System.Net.Sockets.SocketException: An existing connection 
> was forcibly closed by the remote host
>    at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, 
> Int32 size, SocketFlags socketFlags)"
>
> Putting the failed attempt down to some sort of random disconnect as 
> it was harvesting for 45 minutes before it broke.
>
> Anyway, once again thanks for your help.
>
> James
>
>
>
> On Thu, Apr 8, 2021 at 6:06 PM David R Newman <drn at ecs.soton.ac.uk> wrote:
>
>     Hi James,
>
>     Subjects (and to a lesser extent divisions) have always been an
>     integral part of EPrints. Generally, removing them from workflows,
>     citations and config files like cfg/cfg.d/eprints_render.pl,
>     cfg/cfg.d/eprint_search_advanced.pl, cfg/cfg.d/views.pl, etc. is
>     sufficient to hide them without breaking EPrints. It is when you
>     start removing them from cfg/cfg.d/eprint_fields.pl that you are
>     likely to hit problems like those you mention, as certain aspects
>     of EPrints expect the subjects or divisions fields to at least be
>     defined even if they are not used.
>
>     I think in this particular (/cgi/oai2) situation, you probably
>     needed to make sure you disabled the subjects OAI set in
>     cfg/cfg.d/oai.pl. Unfortunately, there is a rather complex list
>     of config changes you need to make if you want to undefine (i.e.
>     comment out / remove) the subjects field in eprint_fields.pl
>     without breaking anything else. If I get the opportunity, I will
>     see if there is a suitable place on the wiki to document what
>     these changes are.
>
>     Regards
>
>     David Newman
>
>     On 08/04/2021 17:05, James Kerwin via Eprints-tech wrote:
>>     Hi All,
>>
>>     Update on this after some mooching around. Our "ListSets" option
>>     in OAI2 no longer works and sends me to an error page:
>>
>>     https://livrepository.liverpool.ac.uk/cgi/oai2?verb=ListSets
>>
>>     When I redid the Divisions/Uni Structure recently in EPrints the
>>     Subjects were removed as we haven't used them in years. This
>>     caused a couple of issues with the advanced search and some
>>     abstract pages. Looking at the ListSets page for another
>>     repository, the Subjects are part of this.
>>
>>     I'm going to attempt to correct it by either altering the
>>     ListSets (probably not) or by resurrecting the Subjects. I'll
>>     update with any success/failure.
>>
>>     Hopefully I can prevent other would-be unintentional wreckers
>>     following in my footsteps.
>>
>>     Thanks,
>>     James
>>
>>     On Thu, Apr 8, 2021 at 3:58 PM James Kerwin
>>     <jkerwin2101 at gmail.com> wrote:
>>
>>         Hi All,
>>
>>         Hope everyone is happy and healthy.
>>
>>         Our repository is harvested by a company named EBSCO.
>>         Recently they have started receiving the following warning
>>         and failing to harvest:
>>
>>         "Harvest has been aborted by an error "Could not harvest from
>>         https://livrepository.liverpool.ac.uk/cgi/oai2:
>>         The remote server returned an error: (500) Internal Server Error"
>>
>>         They use this base URL:
>>
>>         https://livrepository.liverpool.ac.uk/cgi/oai2
>>
>>         This whole area is slightly off my radar, so I was wondering
>>         whether there are any common things I could check. Obviously the
>>         repository is up and running. I've asked for the dates of the
>>         most recent successful harvest and the first failure as well
>>         as if it is still happening. I also need to speak with our
>>         Computing Services Department to check if the IPs can get
>>         through the firewall.
>>
>>         Is there anything else I can/should check based on all of
>>         your collective experience?
>>
>>         We did recently get a new security certificate, but I can't
>>         imagine that is a problem as we do this each year without any
>>         issues.
>>
>>         Thanks,
>>         James
>>
>>
>>     *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
>>     *** Archive: http://www.eprints.org/tech.php/
>>     *** EPrints community wiki: http://wiki.eprints.org/
>
>
>

