
[EP-tech] Limit Export-search-results (max_items for export)



Hi Avischai,

Unfortunately, I don't think there is a way of limiting the number of
records that can be exported. I think the consideration at the time was
that browse view web pages with loads of items can take a long time to
load (even when cached) and are not particularly useful to a user in a
web browser, as the page will be really long (i.e. take forever to
scroll through). So rather than putting load on the server to generate
such a web page, it is easier just to say, "this page has too many items
to display". The opposite is true of exports, which are typically
machine-readable and therefore either used for some automated analysis
or post-processed (e.g. truncated to only the first n items) before
being displayed to a real user. If an export were itself truncated or
restricted because it had what was determined to be "too many items",
this would prevent the analysis/post-processing or render it useless.
I am not sure what other people's thoughts are about this?
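
To illustrate the post-processing case, here is a minimal sketch of
truncating an export on the client side. It assumes the export has
already been fetched as text and is a JSON array of records; the
function name truncate_export is hypothetical and not part of EPrints.

```python
import json


def truncate_export(export_text, n):
    """Return the first n records of an export (hypothetical helper).

    Assumes export_text is a JSON array of record objects; other export
    formats would need their own parser.
    """
    records = json.loads(export_text)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    # The full export is still needed as input; only the display is cut.
    return records[:n]
```

The point is that the consumer decides how many items to show, which
only works if the server hands over the complete export.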

I think I may appreciate what might be your more general point, which is
the high processing cost of generating these large exports. If you have
some crawler going through your browse views and asking for every export
format for some of these really long listings of items, it can put quite
some load on the server (/cgi/exportview is not cached). Sometimes
there can be multiple connections (maybe even 20+) from the same IP
address trying to request view listing exports. I have observed
crawlers doing this on a number of EPrints repositories and have had to
resort to blocking the IP addresses, at least temporarily. We have been
considering, for a future version of EPrints, whether there is a way of
restricting the number of requests that can be made for
processor-intensive pages over a set period of time:

https://github.com/eprints/eprints3.4/issues/102
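
As a rough illustration of restricting requests for processor-intensive
pages over a set period, here is a minimal per-IP sliding-window sketch.
It is not how EPrints does or will implement this; the class name,
parameters and in-memory storage are all assumptions made for the
example.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per IP within window_seconds.

    Hypothetical sketch: state is held in memory in a single process,
    so it would not be shared across multiple web server workers.
    """

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._hits = defaultdict(deque)  # ip -> timestamps of allowed requests

    def allow(self, ip):
        now = self.clock()
        hits = self._hits[ip]
        # Discard timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: caller might respond with HTTP 429
        hits.append(now)
        return True
```

A handler for an expensive page could call allow() with the client IP
before doing any work, for example SlidingWindowLimiter(5, 60) to
permit five export requests per minute per address.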

Regards

David Newman

On 05/07/2022 3:28 pm, Stenger, Avischai via Eprints-tech wrote:
> Hi,
>
> I can limit the maximum number of records found with "max_items" in views.pl, but it looks like there is no limit for exporting the records found.
>
> So when I search for "roman" I get the message "The number of items (7) for this view has exceeded system limits (6). The system administrator either needs to increase "max_items" or apply additional filters to this view."
>
> On this message page I can still click on "export" and get all the records. Is there a way to limit the permitted size (count) of records for the export?
>
>
> Regards & Thanks
>