[EP-tech] redirect some eprintid url to another site



Hi Yuri,


Assuming your repository is already HTTPS-only, you only need to worry 
about editing ssl/securevhost.conf, which contains the PerlTransHandler 
line. You could try wrapping that line in a LocationMatch block. You can 
technically do the same for the HTTP configuration, but it will get 
overwritten if you ever run generate_apacheconf again.


I have never tried putting a LocationMatch around a PerlTransHandler 
line, but I am not aware of anything in Apache that would stop it 
working. Presumably you can then write your mod_rewrite rules separately 
in the Apache configuration to deal with the redirects you need. However, 
it sounds as if knowing some of the redirects would not help you guess 
the rest, so I assume you would need to generate the mod_rewrite rules 
programmatically.
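
A minimal sketch of such a generator, assuming the mapping file holds one 
"eprintid otherurl" pair per line, separated by whitespace. The function 
name and the choice of a 301 redirect are illustrative assumptions, not 
anything EPrints provides, and the pattern assumes the rules are included 
at virtual host level. Target URLs containing spaces or mod_rewrite 
special characters are not handled. The rules only take effect if the 
EPrints handler lets these paths through (for example via 
$c->{rewrite_exceptions}).

```python
import re


def build_rewrite_rules(mapping_lines):
    """Turn 'eprintid otherurl' lines into Apache RewriteRule directives.

    Hypothetical helper: blank lines and lines starting with '#' are skipped.
    """
    rules = []
    for raw in mapping_lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        # The first field must be a plain numeric eprintid.
        if len(parts) != 2 or not re.fullmatch(r"[0-9]+", parts[0]):
            raise ValueError(f"bad mapping line: {raw!r}")
        eprintid, target = parts
        # Match the abstract page and anything below it (documents, thumbnails).
        rules.append(f"RewriteRule ^/{eprintid}(/.*)?$ {target} [R=301,L]")
    return rules


if __name__ == "__main__":
    # Example input; in practice the lines would be read from the mapping file.
    for rule in build_rewrite_rules(["123 https://another.site.com/otherid"]):
        print(rule)
```

The output could be written to an archive-level file and included from 
the virtual host configuration.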


I assume that by renaming the archive you mean putting it on another 
hostname. You can pretty much do what you said, but I would make sure you 
set aside plenty of time and have a backout strategy in case you have 
problems. To change the base URL you will need to edit the archive's 
cfg/cfg.d/10_core.pl. Usually it is only the host and/or securehost 
configuration settings that need to be changed. I would then run all of 
the following:


(0. epadmin test)

1. generate_apacheconf

2. generate_static

3. apachectl restart (reload is probably sufficient but just to be safe 
I tend to use restart when I change the Apache config)

4. epadmin refresh_abstracts

5. epadmin refresh_views
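
The steps above could be scripted roughly as follows. This is only a 
sketch: the /opt/eprints3/bin path and the archive ID "myrepo" are 
assumptions, the function names are made up for illustration, and the 
exact arguments each script needs (for example any flags for 
generate_apacheconf) depend on your EPrints version. It defaults to a dry 
run that just prints the commands.

```python
import subprocess


def rename_steps(archive_id, eprints_bin="/opt/eprints3/bin"):
    """Return the command lines for steps 0-5 in order; nothing is run here."""
    return [
        [f"{eprints_bin}/epadmin", "test", archive_id],
        [f"{eprints_bin}/generate_apacheconf"],
        [f"{eprints_bin}/generate_static", archive_id],
        ["apachectl", "restart"],
        [f"{eprints_bin}/epadmin", "refresh_abstracts", archive_id],
        [f"{eprints_bin}/epadmin", "refresh_views", archive_id],
    ]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in steps:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_steps(rename_steps("myrepo"))
```

Stopping at the first failure keeps a broken configuration from being 
followed by an Apache restart.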


I would also make sure you restart the indexer via the web admin menu. 
I cannot think of a specific reason why indexer tasks would care about a 
hostname change, but it is probably best to be sure.


Having looked at John Salter's suggestion, it looks like a good 
solution. It still hits EPrints, though, so it would add more server 
load than redirecting before the request reaches EPrints. That said, I 
think the difference is fairly negligible unless your server is always 
running hot.


Regards


David Newman




On 15/03/2022 10:28, Yuri wrote:
> *CAUTION:* This e-mail originated outside the University of Southampton.
>
> Hi David!
>
>
> Since it is almost 99% of the archive and some thousands of items, it is 
> quite difficult to maintain thousands of $c->{rewrite_exceptions} lines, 
> but it seems the only possible path, since the Perl handler runs before 
> RewriteRule. In other cases, it is possible to use LocationMatch to 
> set the default handler and so run the rewrite rules.
>
>
> Another option could be to rename the old archive. We could then use the 
> virtualhost just for redirects, and still access the old items (we need 
> them internally anyway).
>
>
> Other than changing the base URL, changing the Apache configs, and 
> running generate_static / generate_abstracts, what would I need to do to 
> rename the old archive?
>
>
> On 14/03/22 18:00, David R Newman wrote:
>>
>> Hi both,
>>
>>
>> I have been doing something similar recently, albeit for abstract 
>> pages. I prefer the brute-force approach of adding to 
>> $c->{rewrite_exceptions} and then manually adding the Apache 
>> mod_rewrite rules to an archive-level file called 
>> cfg/apache_redirects.conf and then including that in 
>> cfg/apachevhost.conf and/or ssl/securevhost.conf. You could write a 
>> script to programmatically generate this and the cfg.d file for 
>> rewrite_exceptions from a mapping file.
>>
>>
>> I had considered doing something that would allow you to add a 
>> metadata field called redirect_url or similar that you could just 
>> edit as a user (probably an admin or editor only if the item is 
>> live), which could then be used to automatically redirect off site. 
>> However, that would require some changes at a core level, which feels 
>> a bit excessive for tackling this problem.
>>
>>
>> One option, if you want to redirect just from abstract pages, is to 
>> test whether this new redirect_url field is set and, if it is, 
>> embed some JavaScript in the abstract page that redirects to the 
>> new URL. That is a bit hacky but makes it easier to add new items to 
>> be redirected in future, rather than having to maintain a mappings 
>> list independent of the database. However, this is no use if you want 
>> to redirect document URLs. I am not sure whether that is what you 
>> want to do?
>>
>>
>> Regards
>>
>>
>> David Newman
>>
>>
>>
>> On 14/03/2022 4:07 pm, Yuri via Eprints-tech wrote:
>>>
>>> Hi John!
>>>
>>>
>>> Thanks for sharing the gist. There are about 10,000 objects, so I 
>>> should load the map from a file. Unfortunately, it is an old EPrints 
>>> without EP_TRIGGER_URL_REWRITE, but I think I can just copy the 
>>> code to the beginning of Rewrite.pm.
>>>
>>>
>>> Thanks!
>>>
>>>
>>> On 14/03/22 16:35, John Salter wrote:
>>>> Hi Yuri,
>>>> I would use the EPrints URL Rewrite trigger.
>>>>
>>>> How many items are mapped to the other system?
>>>> Do you want to map landing page requests to one URL, and document 
>>>> requests to another URL (e.g. directly to the document in the other 
>>>> system)?
>>>>
>>>> This gist: 
>>>> https://gist.github.com/jesusbagpuss/a5c574e1839612ef7e332d1d25edac42 
>>>> allows you to specify the eprintid / new locations in a hash.
>>>> If all the new locations are on the same site, you could update 
>>>> line 19 to include the new base http URL, and just have the 
>>>> eprintid => otherid in the hash.
>>>>
>>>> As written, it will capture requests for anything starting with the 
>>>> EPrintID (requests for the landing page; downloads; thumbnail 
>>>> requests).
>>>> You could map these URLs individually, and change the regex match 
>>>> on line 13 to redirect document requests to the new document URL; 
>>>> landing page requests to the new landing page etc.
>>>>
>>>> Hope that helps - let me know if you need more info.
>>>>
>>>> Cheers,
>>>> John
>>>>
>>>>
>>>>
>>>> ------------------------------------------------------------------------
>>>> *From:* eprints-tech-bounces at ecs.soton.ac.uk 
>>>> <eprints-tech-bounces at ecs.soton.ac.uk> on behalf of Yuri via 
>>>> Eprints-tech <eprints-tech at ecs.soton.ac.uk>
>>>> *Sent:* 14 March 2022 13:45
>>>> *To:* EPrints.org Technical List <eprints-tech at ecs.soton.ac.uk>
>>>> *Subject:* [EP-tech] redirect some eprintid url to another site
>>>>
>>>> Hi!
>>>>
>>>> We're migrating many objects from EPrints to various other platforms. 
>>>> I would like to set up redirects for the URLs of these documents, for 
>>>> example from myeprint.com/eprintid to another.site.com/otherid (I have 
>>>> a map of eprintid to otherurl).
>>>>
>>>> I'm trying to do it with RewriteMap and RewriteRule, but EPrints defines
>>>> a Perl handler (PerlTransHandler EPrints::Apache::Rewrite) to manage
>>>> URLs and handle rewrites. I would prefer not to use
>>>> cfg.d/url.pl because there are a lot of objects.
>>>>
>>>> Any idea? Should I patch Rewrite.pm to do it internally from a mapfile?
>>>> Return DECLINED? I don't know if it is worth the time, I would prefer a
>>>> simpler solution.
>>>>
>>>>
>>>> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
>>>> *** Archive: http://www.eprints.org/tech.php/
>>>> *** EPrints community wiki: http://wiki.eprints.org/