
[EP-tech] Fixity Check and EPrints - Digital Preservation



I believe that EPrints stores a checksum value for each uploaded file, but as far as I understand, there is no way to monitor whether the stored checksums still match the current files, and thus no way of checking for bit rot.

DSpace has the following: https://wiki.duraspace.org/display/DSDOC6x/Validating+CheckSums+of+Bitstreams

A periodic fixity check is part of the lowest level of support for digital preservation, i.e., "bit-level". Here are some examples of digital preservation policies, all of which have some variation on this as a requirement: "regularly audit checksums to ensure that no files have corrupted or changed in any way. This practice ensures the ability to provide an exact copy of original files over time":

*         https://www.sfu.ca/content/dam/sfu/archives/DigitalPreservation/FormatPolicyRegistry.pdf "Regularly perform fixity checks on AIPs"

*         https://digital.library.yorku.ca/documentation/fixity-procedures "York University Library are committed to maintaining the integrity of objects in its care. This includes creating checksums for all archival format objects -- plus associated datastreams -- ingested into the repository, and regular fixity checking of those objects"

*         https://researchworks.lib.washington.edu/policy-preservation.html "Maintains the authenticity of the bitstream through integrity checking"

I understand that EPrints is primarily an open access platform, but I think that we should be able to provide at least the lowest "bit-level" digital preservation support with it, and without a Fixity check, I don't think we can ensure that no files are corrupted or changed over time.

Preservation Metadata for Institutional Repositories <http://preserv.eprints.org/papers/presmeta/pm-paper-draft.html>, a 2007 report looking at EPrints and digital preservation, states the following about fixity checking: "Where is fixity check first performed? Not within EPrints currently, but a script that crawls the archive comparing files with checksums is possible". Ten years on, I am wondering whether and how institutions running EPrints are implementing their fixity checks. Are you using an external tool like this one: https://www.avpreserve.com/tools/fixity/ ? Are you using your own custom script? Did you develop something that is integrated with the EPrints Admin interface?
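
To make concrete what I mean by "a script that crawls the archive comparing files with checksums", here is a minimal Python sketch. It is not EPrints code and does not use any EPrints API. It assumes the stored checksums have already been exported into a mapping from relative file path to hex digest (a hypothetical input, since how to export them is part of my question), and that the hash algorithm passed in matches whatever the repository recorded. The names file_digest and check_fixity are my own.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, read in chunks to bound memory use."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def check_fixity(root, expected, algorithm="sha256"):
    """Compare stored checksums with freshly computed ones.

    `expected` maps paths relative to `root` to stored hex digests
    (hypothetical input: exporting it from the repository is not shown).
    Returns a dict with lists of 'ok', 'changed' and 'missing' paths.
    """
    report = {"ok": [], "changed": [], "missing": []}
    for rel_path, stored in sorted(expected.items()):
        path = Path(root) / rel_path
        if not path.is_file():
            # The file recorded in the manifest is no longer on disk.
            report["missing"].append(rel_path)
        elif file_digest(path, algorithm) == stored.lower():
            report["ok"].append(rel_path)
        else:
            # Content differs from what was recorded at deposit time.
            report["changed"].append(rel_path)
    return report
```

Run periodically (for example from cron), something like this would flag any file whose content no longer matches its recorded checksum. It only detects changes; restoring a good copy would still have to come from backups.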


Best wishes,
Tomasz




________________________________________________
Tomasz Neugebauer
Digital Projects & Systems Development Librarian / Bibliothécaire des Projets Numériques & Développement de Systèmes
Library / Bibliothèque
Concordia University / Université Concordia
Tel. / Tél. 514-848-2424 ext. / poste 7738
Email / courriel: tomasz.neugebauer at concordia.ca<mailto:tomasz.neugebauer at concordia.ca>
Mailing address / adresse postale: 1455 De Maisonneuve Blvd. W., LB-540-03, Montreal, Quebec H3G 1M8
Street address / adresse municipale: 1400 De Maisonneuve Blvd. W., LB-540-03, Montreal, Quebec H3G 1M8
http://library.concordia.ca<http://library.concordia.ca/>
http://www.concordia.ca/faculty/tomasz-neugebauer.html

