EPrints Technical Mailing List Archive

Message: #05881



Re: [EP-tech] Antwort: Re: Digital Preservation in EPrints


I asked the National Archives about the PRONOM risk scores for formats. The code in the EPrints Preservation Plugin that uses them is documented, but it is commented out.

Although PRONOM has the potential to hold risk scores, they are all blank now, and adding them is not something the National Archives is looking into.

 

Tomasz

 

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Tomasz Neugebauer
Sent: August-17-16 5:32 PM
To: eprints-tech@ecs.soton.ac.uk
Cc: Francisco Berrizbeitia <francisco.berrizbeitia@concordia.ca>
Subject: Re: [EP-tech] Antwort: Re: Digital Preservation in EPrints

 

I have been going through the installation of the DROID and Preservation Toolkit plugins over the last few days. 

It was difficult to figure out, so I thought I would share a summary of what I learned about these plugins, and how I got them to work:

 

DROID

Bazaar: http://bazaar.eprints.org/143/

GitHub: https://github.com/eprintsug/droid

Prerequisites: Java 1.6 or higher
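
To check the Java prerequisite before installing, something like the following sketch may help. The function names java_major and installed_java_version are made up for this illustration. The sketch assumes that `java -version` prints a quoted version string such as "1.6.0_30" or "11.0.2" to stderr.

```shell
# java_major VERSION_STRING
# Maps a Java version string to its major version:
# "1.6.0_30" -> 6, "1.8.0_292" -> 8, "11.0.2" -> 11.
java_major() {
    local v="$1"
    case "$v" in
        1.*) v="${v#1.}" ;;   # legacy "1.x" scheme: the major version is the second field
    esac
    # Drop everything from the first ".", "_" or "-" onwards.
    echo "${v%%[._-]*}"
}

# installed_java_version
# Extracts the quoted version string from `java -version` (which writes to stderr).
installed_java_version() {
    java -version 2>&1 | sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1
}

# Example (not run automatically; requires java on the PATH):
#   major=$(java_major "$(installed_java_version)")
#   [ "$major" -ge 6 ] && echo "Java $major meets the Java 1.6 minimum"
```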

 

What it does / how I got it to work:

 

On activation, it is supposed to download the DROID 4 tar file from here:

http://freefr.dl.sourceforge.net/project/droid/droid/4.0.0/droid-4.0.0-linux.tar.gz

Then untar it into /lib/bin/DROID

All of this failed without any error message on my EPrints 3.3.12

The bazaar package said it installed OK, but it didn't report the fact that it was unable to complete the required steps.

There is a message on the list about File::Move vs File::Copy::Recursive::rmove (http://www.eprints.org/tech.php/thread-16264.html ), but I couldn’t get this to work.  Instead, I manually downloaded the tar file and untarred it from the command line into the /lib/bin/DROID/ folder.
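
The manual steps could be scripted roughly as follows. This is only a sketch: unpack_droid is a hypothetical helper name, and it assumes that droid.jar sits at the top level of the tarball, which is not verified here. The download itself is left as a commented example so that nothing is fetched or written unless you run it deliberately.

```shell
# URL taken from the message above.
DROID_URL="http://freefr.dl.sourceforge.net/project/droid/droid/4.0.0/droid-4.0.0-linux.tar.gz"

# unpack_droid TARBALL TARGET_DIR
# Unpacks an already-downloaded tarball into TARGET_DIR and fails loudly
# if droid.jar is not where the later commands expect it.
unpack_droid() {
    local tarball="$1" target="$2"
    mkdir -p "$target" || return 1
    tar -xzf "$tarball" -C "$target" || return 1
    if [ ! -f "$target/droid.jar" ]; then
        # Assumption: the archive has no wrapping top-level folder.
        echo "droid.jar not found directly under $target; check the archive layout" >&2
        return 1
    fi
    echo "DROID unpacked into $target"
}

# Example (not run automatically):
#   wget -O /tmp/droid-4.0.0-linux.tar.gz "$DROID_URL"
#   unpack_droid /tmp/droid-4.0.0-linux.tar.gz /lib/bin/DROID
```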

The plugin also adds some cron events for updating the DROID_SignatureFile.xml and running the scan - I think this part is working.  I was also able to update the signature file using the command line:

java -jar /lib/bin/DROID/droid.jar -d /lib/bin/DROID/DROID_SignatureFile.xml
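
Because the Bazaar install reported success even though the download and untar steps had failed, a quick check after installation may be worthwhile. The sketch below uses a hypothetical helper, check_droid_install, which only tests for the two files named in the command above and for a java executable on the PATH. It does not check that DROID actually runs.

```shell
# check_droid_install DROID_DIR
# Reports which expected pieces are present; returns non-zero if any are missing.
check_droid_install() {
    local dir="$1" missing=0 f
    for f in droid.jar DROID_SignatureFile.xml; do
        if [ -f "$dir/$f" ]; then
            echo "found: $dir/$f"
        else
            echo "MISSING: $dir/$f" >&2
            missing=1
        fi
    done
    if command -v java >/dev/null 2>&1; then
        echo "found: java at $(command -v java)"
    else
        echo "MISSING: java is not on the PATH" >&2
        missing=1
    fi
    return "$missing"
}

# Example (not run automatically):
#   check_droid_install /lib/bin/DROID || echo "DROID install is incomplete"
```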

 

=======================

 

PRESERVATION Toolkit

Bazaar: http://bazaar.eprints.org/142/

GitHub: https://github.com/eprintsug/preservation_toolkit

Prerequisites: DROID

Some documentation: http://www.eprints.org/software/training/3.2/admin/filerisks_tutorial.php  

 

What it does / how I got it to work:

 

It is supposed to provide Editors with a Format/Risks button that would list the count of documents and their corresponding format types in their repository.  

After the plugin install, the button didn’t show up on my repository, because the permission required by can_be_viewed on line 45 of FormatsRisks.pm didn’t exist in my EPrints. I changed line 45 of FormatsRisks.pm to:

return $self->allow( "config/view" );

That got me a button.  When I first clicked on it, it said that I had no objects in the repository, and a new button appeared: “Request File Type Recount”.  Either by pushing this button or on plugin activation (I’m not sure which), a cron event is added that goes through the repository and produces a results table with two categories:

1) High Risk Objects: these are all the UNKNOWN files (DROID found no classification match)
2) Format Breakdown: a list of format types and how many there are of each

It would be great if it provided a button to list which documents are high risk. There is mention of this in the docs (a “plus” button), but I didn’t see this working.  Has anyone figured out how to get the “plus” button or something like it, so that I can quickly find out which documents belong to the “high risk” category?
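
Line numbers may differ between plugin versions, so it can help to locate the can_be_viewed method by name before editing it. The following is a small sketch with a hypothetical helper, show_sub. It assumes the Perl sub ends with a closing brace at the start of a line, which is not guaranteed for every file.

```shell
# show_sub FILE SUBNAME
# Prints the lines of the named Perl sub, each prefixed with its line number.
show_sub() {
    local file="$1" name="$2"
    awk -v name="$name" '
        # Start printing at "sub NAME" (NAME must not be a prefix of a longer name).
        $0 ~ "^[ \t]*sub[ \t]+" name "([^A-Za-z0-9_]|$)" { printing = 1 }
        printing { printf "%d: %s\n", NR, $0 }
        # Stop after the first closing brace in column one.
        printing && /^}/ { printing = 0 }
    ' "$file"
}

# Example (not run automatically; the path to the file depends on your install):
#   cp FormatsRisks.pm FormatsRisks.pm.bak    # keep a backup before editing
#   show_sub FormatsRisks.pm can_be_viewed
```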

In the documentation and the code, there is mention of classification into “low”, “medium” and “high” risk, but this is not working, for a number of reasons.

First, the “update_risk_scores()” call on line 23 of Update_Pronom_File_Counts is commented out (as is the whole function).  This is the function that uses SOAP::Lite to query PRONOM at the National Archives for the risk scores associated with each format.  Since it is commented out in the plugin, I see no reason to install SOAP::Lite.

Second, and this is the part I found most confusing: it looks to me like PRONOM still doesn’t have risk scores associated with any format types in its database (is that correct?), so it may be pointless to try to activate this part of the plugin.  PRONOM allows you to query for risk scores (see: http://www.nationalarchives.gov.uk/PRONOM/Format/proFormatSearch.aspx?status=new ), but if you search, you will see that all formats have a blank risk score.  The documentation for the plugin talks about an “unstable” risk score retrieval that was set up for testing at EPrints and used to generate screenshots for the docs/presentations.

 

My apologies for the long message; if you have read all of this, and want to correct something  or add some information, it would be very much appreciated. 

 

Best wishes,

 

Tomasz

 

 

 

________________________________________________

Tomasz Neugebauer
Digital Projects & Systems Development Librarian / Bibliothécaire des Projets Numériques & Développement de Systèmes
Library / Bibliothèque
Concordia University / Université Concordia

Tel. / Tél. 514-848-2424 ext. / poste 7738
Email / courriel: tomasz.neugebauer@concordia.ca

Mailing address / adresse postale: 1455 De Maisonneuve Blvd. W., LB-540-03, Montreal, Quebec H3G 1M8
Street address / adresse municipale: 1400 De Maisonneuve Blvd. W., LB-540-03, Montreal, Quebec H3G 1M8

library.concordia.ca
concordia.ca 
Twitter: https://twitter.com/photomediathink



 

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of martin.braendle@id.uzh.ch
Sent: May-31-16 7:49 AM
To: eprints-tech@ecs.soton.ac.uk
Subject: [EP-tech] Antwort: Re: Digital Preservation in EPrints

 

Hi Tomasz,

The command line version of DROID 6.x does not support the FileCollection XML report that DROID 4 created and that the Preservation Toolkit Bazaar package uses. See also the discussion at https://groups.google.com/forum/#!topic/droid-list/odOGT7ccn2I

I have the tar file for DROID 4 somewhere on my disk. Please contact me off-list if you want to have it. It runs with Java 1.6 or higher; we run it with Java 1.8.

DROID 4 is indeed outdated. We noted that even with the most recent PRONOM signature files, it does not recognize the format of about 4% of our PDF files, while spot checks revealed that DROID 6 does recognize the format.

It is still on my todo list (in the course of the SUK P-2 project Digital Life Cycle Management) to make the preservation toolkit compatible with DROID 6.

Best regards,

Martin

--
Dr. Martin Brändle
Zentrale Informatik
Universität Zürich
Stampfenbachstr. 73
CH-8006 Zürich

mail: martin.braendle@id.uzh.ch
phone: +41 44 63 56705
fax: +41 44 63 54505
http://www.zi.uzh.ch


From: Adam Field <Adam.Field@jisc.ac.uk>
To: "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>
Date: 31/05/2016 12:32
Subject: Re: [EP-tech] Digital Preservation in EPrints
Sent by: eprints-tech-bounces@ecs.soton.ac.uk





The preservation toolkit is really quite old now.  If it’s important, perhaps there’s some community effort that can be directed at it.  I’m happy to assist as much as my current job allows.
 
 


Adam Field
SHERPA services analyst developer

 
 
From: <eprints-tech-bounces@ecs.soton.ac.uk> on behalf of Tomasz Neugebauer <Tomasz.Neugebauer@concordia.ca>
Reply-To: "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>
Date: Thursday, 26 May 2016 21:16
To: "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>
Subject: [EP-tech] Digital Preservation in EPrints

 
To use the Digital Preservation Toolkit (http://bazaar.eprints.org/142/), DROID (http://bazaar.eprints.org/143/) is required. DROID runs on Java.
The DROID bazaar plugin mentions “DROID v.4”.
Meanwhile, the current version of DROID is 6.2.1 (http://www.nationalarchives.gov.uk/information-management/manage-information/preserving-digital-records/droid/).
Given that we are running EPrints 3.3.12, what is the recommended setup for this?
Should we try to find DROID v4 or can we run the latest version of DROID?
If we need DROID 4, where do we get that?  Also, what version of Java does that require?
The current version of DROID requires a minimum of Java 6 Standard Edition (SE), and was built and tested on Java 1.6 update 30.
 
Tomasz
 
 


Jisc is a registered charity (number 1149740) and a company limited by guarantee which is registered in England under Company No. 5747339, VAT No. GB 197 0632 86. Jisc’s registered office is: One Castlepark, Tower Hill, Bristol, BS2 0JA. T 0203 697 5800.

Jisc Services Limited is a wholly owned Jisc subsidiary and a company limited by guarantee which is registered in England under company number 2881024, VAT number GB 197 0632 86. The registered office is: One Castle Park, Tower Hill, Bristol BS2 0JA. T 0203 697 5800.
*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
*** EPrints developers Forum: http://forum.eprints.org/