EPrints Technical Mailing List Archive

Message: #09523


Re: [EP-tech] Latest list archives available again

Hi all,

Just a quick update on this.  I have now added a search feature for the list's archive.  This is available at:


or linked from the Archive link in the signature footer for the list.

Just so people are aware, the list archive and its search index are only updated overnight (UTC).  So any emails sent on a particular day will not appear in the archive until the following day.


David Newman

On 10/01/2024 23:45, David R Newman wrote:
Hi all,

I have finally managed to get the list archives updating again with the latest messages.  They are available at:


The archive URL in the signature of mailing list emails (https://www.eprints.org/tech.php) also redirects to this URL.

I have spent some time trying to make the mail archive more aesthetically pleasing and to improve its UI.  The only significant issue I have encountered is that, since links in mailing list emails started being rewritten as safelinks.protection.outlook.com redirects, these links get broken when the archive files are converted into HTML pages.  The link text you can see on the page is fine (so you can copy it and paste it into your address bar), but if you click the link you get stuck on the redirect site rather than being redirected to the actual link.  This is because everything after the "url=" part goes missing during the conversion to HTML pages.  I am not sure whether this is a built-in protection in the HTML page generator (MHonArc [1]), which may treat the link as an attempt to obfuscate the true URL, or whether the URL unintentionally fails to match its regular expression because there is an "https" in the middle of the URL.  I am going to continue to look into this, but I don't think there is a straightforward resolution, so it may take a while to fix.
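To illustrate what is lost when everything after "url=" goes missing, here is a minimal Python sketch of how the real target can be recovered from an intact redirect link.  It is not part of the MHonArc setup described above; the function name unwrap_safelink and the shortened example link are made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit

SAFELINKS_SUFFIX = ".safelinks.protection.outlook.com"


def unwrap_safelink(link: str) -> str:
    """Return the original target of a Safe Links redirect.

    Links that are not Safe Links redirects, or that have lost their
    "url=" value, are returned unchanged.
    """
    parts = urlsplit(link)
    host = parts.hostname or ""
    if not host.endswith(SAFELINKS_SUFFIX):
        return link
    # parse_qs percent-decodes values, so "https%3A%2F%2F..." comes
    # back as an ordinary "https://..." URL.
    targets = parse_qs(parts.query).get("url")
    return targets[0] if targets else link


# Hypothetical, shortened example link (the real "data" value is much longer).
wrapped = (
    "https://eur03.safelinks.protection.outlook.com/"
    "?url=https%3A%2F%2Fwww.mhonarc.org%2F&data=05%7C02&reserved=0"
)
print(unwrap_safelink(wrapped))
```

A link truncated at "url=" has no target left to recover, which is why clicking it leaves you on the redirect site.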

I have been giving some thought to other things I can do to improve the mail archive.  I have already broken it down into separate pages of 100 messages each rather than all in one page.  However, this still does not make it that easy to find an email from a particular date, even though I have made sure dates are now included for each message on the listing pages, so you can at least get some idea of where you are in the listing.  To make it easier to find emails, I was thinking of effectively making a separate archive for each year.  This would mean there would only be a few pages per year (rather than almost 100 in total), making it easier to find the page where a message from a particular date in that year may reside.  This does cause threads to break, but it is rather uncommon to have threads that straddle years.
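As a rough illustration of the per-year idea, the following Python sketch groups messages by year and then splits each year into listing pages of 100.  It is only a sketch and does not reflect how MHonArc itself is configured; the function name paginate_by_year and the (date, subject) tuples are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date

PAGE_SIZE = 100  # messages per listing page, as in the current archive


def paginate_by_year(messages, page_size=PAGE_SIZE):
    """Map each year to its listing pages.

    messages is an iterable of (date, subject) tuples.  Each page is a
    list of at most page_size tuples, in date order.
    """
    by_year = defaultdict(list)
    for sent, subject in sorted(messages):
        by_year[sent.year].append((sent, subject))
    return {
        year: [items[i:i + page_size] for i in range(0, len(items), page_size)]
        for year, items in by_year.items()
    }


# Hypothetical messages; a tiny page size makes the splitting visible.
example = [
    (date(2024, 1, 10), "Latest list archives available again"),
    (date(2023, 12, 31), "An older thread"),
    (date(2024, 1, 11), "Re: Latest list archives available again"),
    (date(2024, 2, 1), "Another message"),
]
pages = paginate_by_year(example, page_size=2)
```

A thread that starts in December and continues in January would end up split across two yearly archives, which is the trade-off mentioned above.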

The other thing that I know people have said was useful on the old threader.ecs.soton.ac.uk was the ability to search across the whole list archive.  I think this will require a separate solution to the HTML pages generated using MHonArc.  I am going to look for something simple that can consume the HTML pages for individual messages (produced by MHonArc) and produce an index that I can then put a search form in front of.
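The general shape of such a tool can be sketched in a few lines of Python: strip the tags from each message page, record which pages contain which words, and answer a query by intersecting those sets.  This is a simplified illustration using only the standard library, not a description of any particular search tool; the names build_index and search and the example file names are hypothetical.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser

WORD = re.compile(r"[a-z0-9]+")


class _TextExtractor(HTMLParser):
    """Collect the text content of an HTML page, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html):
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def build_index(pages):
    """pages maps a page name to its HTML; returns word -> set of page names."""
    index = defaultdict(set)
    for name, html in pages.items():
        for word in WORD.findall(html_to_text(html).lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the pages that contain every word in the query."""
    words = WORD.findall(query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical message pages standing in for MHonArc output.
example_pages = {
    "msg00001.html": "<html><body><p>MHonArc search index</p></body></html>",
    "msg00002.html": "<p>Safe links get <b>broken</b></p>",
}
example_index = build_index(example_pages)
```

A real deployment would also need to persist the index and rebuild it after each overnight archive update, and would probably want ranking and phrase matching, none of which this sketch attempts.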


David Newman

[1] https://www.mhonarc.org/

*** Options: https://wiki.eprints.org/w/Eprints-tech_Mailing_List
*** Archive: https://www.eprints.org/tech.php/
*** EPrints community wiki: https://wiki.eprints.org/