EPrints Technical Mailing List Archive

Message: #07044


< Previous (by date) | Next (by date) > | < Previous (in thread) | Next (in thread) > | Messages - Most Recent First | Threads - Most Recent First

Re: [EP-tech] EPrints Database entity relationship diagram


Hi. Long story short the database represents a set of datasets. eg. EPrint, Document, User. Each of these has a primary table containing the main record and any singular fields, eg title of an eprint. Then there is one table for every "multiple" field, eg. subjects. Compound multiple fields have one table for each field due to an implementation detail. eg creators is really stored as two multiple fields. creators_id and creators_name

In addition, each dataset has an ordervalues and an index table. The index table contains the words to match in full text searches. The ordervalues table contains a cached version of each field flattened into a string to use for sorting by that field.

Some eprint fields link to other objects, eg. eprint has a userid of the depositing user, document has the eprintid to which the document belongs (there's quite a few of these)

Finally there's cache tables which are used so that it doesn't repeat a search when you go to page two of the results.

Some of the database features aren't taken advantage of because it would tie EPrints to one backend and some sites want to use other SQL databases.

Deep in the admin pages there's this function https://eprints.soton.ac.uk/cgi/users/home?screen=DatabaseSchema which gives you a general idea of what database tables are. Some tables may be created by plugins that EPrints core doesn't know about.


On 14/12/2017 16:45, Alan.Stiles wrote:
I've just tried creating a model of our instance from the database tool we use.  It took 20 pages just to draw the database tables, though 5 pages of that were probably cache tables.  That's without drawing any links between the tables, or leaving enough space to discern which line goes from where to where...
We have something like 60 tables that all start eprints_* e.g. creators_name, creators_id, editors_name...  I really don't want to try drawing that ERD alone, never mind including the user table or the REF2014 plugin tables!

I once got asked for similar by our central IT team (possibly "Do you have a list of fields in the database?") Once I started explaining the volume of stuff they were asking about they decided they had a much more specific query to ask!

Alan

-----Original Message-----
From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of David R Newman
Sent: 14 December 2017 16:10
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] EPrints Database entity relationship diagram

Hi Andrew,

I am not aware of one and they would vary between repositories sufficiently significantly that you would need a tool to produce one rather than having one ER diagram for all repositories.  You may be able to generate something comparable from the XML schema at:

https://eprints.lincoln.ac.uk/cgi/schema

It may also be possible to run a tool against the database directly.
  Something like http://search.cpan.org/dist/SQL-Translator/script/sqlt-
diagram will probably do the job.  However, I do not know how well the relationships between tables (i.e. foreign keys) have been defined.  So  you may just see a load of unassociated tables.  As the linking of tables is managed for the Perl-coded EPrints API.

Regards

David Newman

On Thu, 2017-12-14 at 15:48 +0000, Andrew Beeken wrote:
Hi all!

Is there an entity relationship diagram available for the EPrints
database? I’ve been asked by our databases team if I can provide one
and I don’t particularly relish the idea of drawing one up myself!

Thanks in advance!
Andrew



The University of Lincoln, located in the heart of the city of
Lincoln, has established an international reputation based on high
student satisfaction, excellent graduate employment and world-class
research.

The information in this e-mail and any attachments may be
confidential. If you have received this email in error please notify
the sender immediately and remove it from your system. Do not disclose
the contents to another person or take copies.

Email is not secure and may contain viruses. The University of Lincoln
makes every effort to ensure email is sent without viruses, but cannot
guarantee this and recommends recipients take appropriate precautions.

The University may monitor email traffic data and content in
accordance with its policies and English law. Further information can
be found at: http://www.lincoln.ac.uk/legal.
*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-
tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
*** EPrints developers Forum: http://forum.eprints.org/
*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
*** EPrints developers Forum: http://forum.eprints.org/
-- The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England & Wales and a charity registered in Scotland (SC 038302). The Open University is authorised and regulated by the Financial Conduct Authority in relation to its secondary activity of credit broking.

*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
*** EPrints developers Forum: http://forum.eprints.org/

--
Christopher Gutteridge -- http://users.ecs.soton.ac.uk/cjg

University of Southampton Open Data Service: http://data.southampton.ac.uk/
You should read our Web & Data Innovation blog: http://blogs.ecs.soton.ac.uk/webteam/