Tech List


eprints_tech messages

[EP-tech] storage capacity

From: "Alison Sutton" <a.m.sutton AT reading.ac.uk>
Date: Tue, 24 Jun 2008 12:40:04 +0100



*** 
http://www.eprints.org/tech.php/id/%3CEMEW-k5NCe459ee3a4d236017d3c016a4c14c30db5c-006201c8d5ef$0b479600$1e65e186%40rdghome.ad.rdg.ac.uk%3E
*** EPrints community wiki - http://wiki.eprints.org/

Hi 

For the purposes of a proposal for a repository at Reading Uni. we need to
estimate the storage capacity needed for items to be deposited in EPrints. 

Please can anyone advise how best to calculate this? I've been asked for
typical storage for different categories of full text publications - journal
articles, conference proceedings, books, book chapters and PhD theses.

In less specific terms, some examples of storage capacity per large numbers
of research publications would also be useful.

Thanks
Alison


-------------------------------------------------------------

Alison Sutton, Librarian	Tel: +44(0)118 378 7984
Department of Meteorology	Fax: +44(0)118 378 8905
University of Reading		www.met.rdg.ac.uk/Library
Earley Gate, PO Box 243		email: a.m.sutton AT rdg.ac.uk
Reading RG6 6BB

My working hours are Tuesdays to Fridays, 09:30 - 15:00.

------------------------------------------------------------- 


Re: [EP-tech] storage capacity

From: "Richard M. Davis" <r.davis AT ulcc.ac.uk>
Date: Tue, 24 Jun 2008 14:12:06 +0100


Alison Sutton wrote:
> Please can anyone advise how best to calculate this? I've been asked for
> typical storage for different categories of full text publications - journal
> articles, conference proceedings, books, book chapters and PhD theses.

Hi Alison

This question is a close relative of that one about the string ;)

I think it really is just guesswork, as there are so many variables.

Thinking only of single PDF type submissions: a short document might 
still be a large file if it is full of images, formulas or heavy 
formatting; a long document might be a small file if it isn't. The size 
range will probably be from 100KB to 10MB.

IMO you might as well just estimate 1 file = 1MB, and therefore that 
100GB of disk space will hold 100,000 files. What you actually get will be 
somewhere between one-tenth and ten times that: a negligible degree of 
imprecision!
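
The rule of thumb above can be written out as a quick calculation. This is only a sketch: the function name and the decimal units (1GB = 1,000,000KB) are assumptions made for illustration, and the 100KB and 10MB bounds are simply the size range quoted in the message.

```python
def files_that_fit(disk_gb, avg_file_kb=1000):
    """Estimate how many files fit on a disk.

    Assumes decimal units (1GB = 1,000,000KB) and ignores
    filesystem, database and backup overhead.
    """
    return disk_gb * 1_000_000 // avg_file_kb

# Rule of thumb from the message: 1 file = 1MB (1000KB)
typical = files_that_fit(100)

# Size range quoted in the message: 100KB to 10MB per file
if_files_are_small = files_that_fit(100, avg_file_kb=100)
if_files_are_large = files_that_fit(100, avg_file_kb=10_000)

print(if_files_are_large, typical, if_files_are_small)
```

The two extremes are the "one-tenth and ten times" bounds mentioned above; a real collection would fall somewhere in between.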

But somewhere with a well-established repository, like Soton ECS, might 
be able to suggest a sounder formula based on a statistical analysis of 
their holdings, perhaps broken down by type (articles, theses, etc.): if 
so, I'd love to see it too.

Hope this helps

Richard


Re: [EP-tech] storage capacity

From: Chris Keene <C.J.Keene AT sussex.ac.uk>
Date: Tue, 24 Jun 2008 14:58:24 +0100



Alison,

Like Richard says, piece of string :)


We have ~1000 records and about 450 full-text docs (mainly articles), and 
the whole eprints dir comes to 897MB. However, when we migrate, or do 
something scary, we like to take a copy of everything (in addition to the 
usual backups) in case things go wrong, which of course instantly 
doubles it.
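
As a rough illustration of those figures: dividing the directory size by the number of full-text documents gives an average footprint per document, and one safety copy doubles the total. The sketch below assumes the 897 figure is in megabytes and attributes the whole directory to the 450 full-text documents, which overstates the per-document size a little because metadata-only records and anything else in that directory are included. The variable names are illustrative.

```python
eprints_dir_mb = 897   # size of the whole eprints directory (assumed to be MB)
full_text_docs = 450   # records that have a full-text document attached

# Average footprint per full-text document, overheads included
avg_mb_per_doc = eprints_dir_mb / full_text_docs

# One extra full copy taken before a migration doubles the requirement
with_safety_copy_mb = eprints_dir_mb * 2

print(f"{avg_mb_per_doc:.2f} MB per full-text document")
print(f"{with_safety_copy_mb} MB with one extra full copy")
```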

Other things to consider: the amount of storage per record will probably 
increase in the future, e.g. through new features such as tracking changes 
to records, or keeping the original file format along with the PDF file 
(for preservation) for each item.


Basically if the method you use produces a server spec with less than 
1TB then choose another method :)


Cheers
Chris


-- 
Chris Keene                                     C.J.Keene AT sussex.ac.uk
Technical Development Manager                   Tel (01273) 877950
University of Sussex Library
http://www.sussex.ac.uk/library/


Re: [EP-tech] storage capacity

From: "Roman Chyla" <roman.chyla AT gmail.com>
Date: Tue, 24 Jun 2008 16:38:25 +0200


Hi,
We have loaded some 2500 full texts (journal articles), and the tgz-packed
archive is 500MB, plus 100MB for the database itself with everything in it.
So the actual numbers must be higher, since the archive is compressed.
(Note: the full texts are just PDFs with text, not pictures.)
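
The figures in this message can be scaled to other collection sizes as a lower bound. This is a hedged sketch, not a sizing method from the thread: the function name and the linear scaling are assumptions, and because the 500MB refers to a compressed archive, the real on-disk size will be higher than what this returns.

```python
def lower_bound_mb(items, archive_mb=500, database_mb=100, sample_items=2500):
    """Scale the reported sample (2500 text-only PDFs) linearly to `items`.

    Returns a lower bound in MB: the archive figure is for a compressed
    tgz, so uncompressed storage will be larger.
    """
    return items * (archive_mb + database_mb) / sample_items

print(lower_bound_mb(2500))     # the reported sample itself
print(lower_bound_mb(100_000))  # hypothetical larger collection
```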

Could somebody with a really huge collection please share their
experience with the speed of full-text indexing/searching?

roman



