
[EP-tech] archive statistics



Hi Yuri,

I was just saying to my colleague Justin that I wondered whether you were
a plant (to ask this question). He has just been working on a tool for
this very purpose, so that we can analyse repositories that are running
out of disk space to see if they genuinely need more space or if things
could be tidied up to free up sufficient space. The tool is a bit rough
around the edges but he is happy to make it available as a 3.4
ingredient in an EPrints GitHub repository, when he has had a chance to
tidy it up (over the next few days). If you are still on 3.3 it may be
possible to map the various files into directories in your archive and
enable it as a plugin, but that is not something either Justin or I have
tried to do.

The tool uses a cron job to produce monthly disk reports rather than a
live status. If there is wider interest in the tool we could look at
making it customisable to allow a greater reporting frequency.
Unfortunately, it is not just a case of running the cron job more
frequently, although the adaptation required to the tool as a whole
should be fairly minor.

I have deployed this rough version on tryme.demo.eprints-hosting.org if
you want to take a look. The disk reports are only available under the
Admin menu, so you would need to give me the username of the account you
create on tryme, so that I can upgrade it to a repository admin account.
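
In the meantime, if you want something a little more structured than du,
below is a minimal sketch in Python. To be clear, this is not Justin's
tool and does not use any EPrints API. It just walks a directory tree,
totals the space used and counts how many files are at least 10MB, 20MB
and so on. The function names (file_sizes, size_report) and the default
thresholds are made up for illustration.

```python
import os
from collections import Counter

MB = 1024 * 1024


def file_sizes(root):
    """Yield (path, size_in_bytes) for every regular file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so the same data is not counted twice.
            if os.path.isfile(path) and not os.path.islink(path):
                yield path, os.path.getsize(path)


def size_report(root, thresholds_mb=(10, 20, 50)):
    """Return total bytes used and, per threshold, how many files are at least that big.

    The thresholds are illustrative defaults, not anything EPrints defines.
    """
    total = 0
    at_least = Counter()
    for _path, size in file_sizes(root):
        total += size
        for threshold in thresholds_mb:
            if size >= threshold * MB:
                at_least[threshold] += 1
    return {
        "total_bytes": total,
        "at_least_mb": {t: at_least[t] for t in thresholds_mb},
    }
```

You would call size_report() on your archive's documents directory (I am
assuming the usual layout, something like
/opt/eprints3/archives/ARCHIVEID/documents, so adjust to suit). Note that
this counts individual files rather than records, and knows nothing about
item type or whether a record is in the live archive; for that you would
need to join against the repository database.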

Regards

David Newman

On 22/06/2021 07:52, Yuri via Eprints-tech wrote:
>
> Hi!
>
>    what is the best way to get archive statistics, such as how many records
> are in the archive, or some indication of their size (for example, how
> many objects use at least 10MB of space, 20MB and so on), how much
> total space the archive uses (for example, only records with archive
> status), maybe grouped by type (for example, theses use 10GB, articles
> use 40GB and so on)?
>
> I've done rough statistics using du and some unix tools but I would like
> to refine them better.