[EP-tech] archive statistics



Sorry, that was not off list as I intended. Nothing confidential but 
only really interesting if you like file listings or development sagas.

On 22/06/2021 10:16, David R Newman via Eprints-tech wrote:
> Hi Yuri (off list),
>
> Here is the listing for the ingredient. As you can see below, although
> there are a few files, you should be able to map these into your local
> archive or your current lib or site_lib directories of EPrints (i.e. one
> path or another, not some in lib, some in site_lib and some in your
> archive). I think we want to avoid making it a Bazaar plugin, as then
> it becomes something that requires ongoing maintenance and potential
> releases of new versions, which is a bit excessive for something just
> quickly whipped up as a useful analysis tool really only intended for
> the original developer's use.
>
> Regards
>
> David Newman
>
> total 24
> drwxrwxr-x 2 eprints eprints 4096 Jun 12 11:28 bin
> drwxrwxr-x 3 eprints eprints 4096 Jun 12 11:28 cgi
> drwxrwxr-x 3 eprints eprints 4096 Jun 12 11:28 lang
> -rw-rw-r-- 1 eprints eprints 3715 Jun 12 11:28 src.index.html
> drwxrwxr-x 3 eprints eprints 4096 Jun 12 11:28 plugins
> -rw-rw-r-- 1 eprints eprints  995 Jun 12 11:28 readme.txt
> [eprints@demo disk_report]$ ls -ltR
> .:
> total 24
> -rw-rw-r-- 1 eprints eprints  995 Jun 12 11:28 readme.txt
> drwxrwxr-x 3 eprints eprints 4096 Jun 12 11:28 plugins
> -rw-rw-r-- 1 eprints eprints 3715 Jun 12 11:28 src.index.html
> drwxrwxr-x 3 eprints eprints 4096 Jun 12 11:28 lang
> drwxrwxr-x 3 eprints eprints 4096 Jun 12 11:28 cgi
> drwxrwxr-x 2 eprints eprints 4096 Jun 12 11:28 bin
>
> ./plugins:
> total 4
> drwxrwxr-x 3 eprints eprints 4096 Jun 12 11:28 EPrints
>
> ./plugins/EPrints:
> total 4
> drwxrwxr-x 3 eprints eprints 4096 Jun 12 11:28 Plugin
>
> ./plugins/EPrints/Plugin:
> total 4
> drwxrwxr-x 3 eprints eprints 4096 Jun 12 11:28 Screen
>
> ./plugins/EPrints/Plugin/Screen:
> total 4
> drwxrwxr-x 2 eprints eprints 4096 Jun 12 11:28 Staff
>
> ./plugins/EPrints/Plugin/Screen/Staff:
> total 4
> -rw-rw-r-- 1 eprints eprints 1036 Jun 12 11:28 EPrintDiskReport.pm
>
> ./lang:
> total 4
> drwxrwxr-x 3 eprints eprints 4096 Jun 12 11:28 en
>
> ./lang/en:
> total 4
> drwxrwxr-x 2 eprints eprints 4096 Jun 12 11:28 phrases
>
> ./lang/en/phrases:
> total 4
> -rw-rw-r-- 1 eprints eprints 445 Jun 12 11:28 disk_report.xml
>
> ./cgi:
> total 4
> drwxrwxr-x 2 eprints eprints 4096 Jun 12 11:28 disk_report
>
> ./cgi/disk_report:
> total 12
> -rw-rw-r-- 1 eprints eprints 655 Jun 12 11:28 data.json
> -rw-rw-r-- 1 eprints eprints 656 Jun 12 11:28 show
> -rw-rw-r-- 1 eprints eprints 743 Jun 12 11:28 data.csv
>
> ./bin:
> total 20
> -rwxrwxr-x 1 eprints eprints  170 Jun 12 11:28 get_fs_data
> -rwxrwxr-x 1 eprints eprints  580 Jun 12 11:28 new_report
> -rwxrwxr-x 1 eprints eprints  454 Jun 12 11:28 get_db_data
> -rwxrwxr-x 1 eprints eprints  827 Jun 12 11:28 get_db_fdata
> -rwxrwxr-x 1 eprints eprints 2134 Jun 12 11:28 combine
>
>
>
> On 22/06/2021 10:04, Yuri via Eprints-tech wrote:
>> CAUTION: This e-mail originated outside the University of Southampton.
>>
>> wow, great work!
>>
>> I'm still on 3.3 and I just need this info for a migration, to get some
>> data to put in a report.
>>
>> On 22/06/21 10:49, David R Newman via Eprints-tech wrote:
>>> Hi Yuri,
>>>
>>> I was just saying to my colleague Justin, I wonder if you were a plant
>>> (to ask this question).  He has just been working on a tool for this
>>> very purpose, so that we can analyse repositories that are running out
>>> of disk space to see if they genuinely need more space or if things
>>> could be tidied up to free up sufficient space.  The tool is a bit rough
>>> around the edges but he is happy to make it available as a 3.4
>>> ingredient in an EPrints GitHub repository, when he has had a chance to
>>> tidy it up (over the next few days).   If you are still on 3.3 it may be
>>> possible to map the various files into directories in your archive and
>>> enable as a plugin but that is not something either Justin or I have
>>> tried to do.
>>>
>>> The tool uses a cronjob to produce monthly disk reports rather than a
>>> live status.  If there is a wider interest in the tool we could look to
>>> making the tool customisable to allow a greater reporting frequency.
>>> Unfortunately, it is not just a case of running the cron job more
>>> frequently, although the adaptation required to the tool as a whole
>>> should be fairly minor.
>>>
>>> I have deployed this rough version on tryme.demo.eprints-hosting.org if
>>> you want to take a look.  The disk reports are only available under the
>>> Admin menu, so you would need to give me the username you used for the
>>> account you create on tryme, so I can up this account to a repository
>>> admin one.
>>>
>>> Regards
>>>
>>> David Newman
>>>
>>> On 22/06/2021 07:52, Yuri via Eprints-tech wrote:
>>>> Hi!
>>>>
>>>>       what is the best way to get archive statistics, like how many records
>>>> are in the archive, or to get some hint of their size (for example, to find
>>>> how many objects use at least 10MB of space, 20MB and so on), how much
>>>> total space the archive uses (for example, only records in the archive status),
>>>> maybe grouping them by type (for example, theses use 10GB, articles use
>>>> 40GB and so on)?
>>>>
>>>> I've done rough statistics using du and some unix tools but I would like
>>>> to refine them better.
>>>>
>>>>
>>>> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
>>>> *** Archive: http://www.eprints.org/tech.php/
>>>> *** EPrints community wiki: http://wiki.eprints.org/
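
For the filesystem half of the original question (total space, and how many
records use at least 10MB, 20MB and so on), a rough sketch in Python follows.
It is not the disk report tool discussed above and uses no EPrints API. It
assumes a hypothetical layout in which each immediate subdirectory of a chosen
root holds the files of one record; the function names are made up for
illustration. It sums apparent file sizes, so the figures may differ from what
du reports, and it cannot group by type or status, since that information is
in the database rather than on disk.

```python
import os

MB = 1024 * 1024


def tree_size(path):
    """Total apparent size in bytes of all regular files under path."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(full) and not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def sizes_by_subdir(root):
    """Map each immediate subdirectory of root to its total size in bytes.

    Assumption: one subdirectory per record (hypothetical layout).
    """
    return {
        entry.name: tree_size(entry.path)
        for entry in os.scandir(root)
        if entry.is_dir(follow_symlinks=False)
    }


def count_at_least(sizes, thresholds):
    """For each threshold, count how many sizes are >= that threshold."""
    sizes = list(sizes)
    return {t: sum(1 for s in sizes if s >= t) for t in thresholds}


def report(root, thresholds=(10 * MB, 20 * MB)):
    """Return a short plain-text summary for the directory tree at root."""
    per_dir = sizes_by_subdir(root)
    lines = [f"total bytes: {sum(per_dir.values())}"]
    for t, n in count_at_least(per_dir.values(), thresholds).items():
        lines.append(f"at least {t // MB} MB: {n}")
    return "\n".join(lines)
```

Calling print(report(root)) with root set to the directory you want to measure
prints the total and one line per threshold. The thresholds are plain byte
counts, so other cut-offs can be passed in as needed.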
