EPrints Technical Mailing List Archive

Message: #09136


Re: [EP-tech] Main/filename not set in staging area

CAUTION: This e-mail originated outside the University of Southampton.
Hi John and David,

Thank you both for the advice and willingness to help.

It would appear the problem was "me". I fixed one problem where filenames weren't showing as we expected on one page, but that was not related to "main" not being set. "Main" not being set is expected behaviour: my colleagues set it via the EPrints front-end by clicking a button in the file upload workflow that lets them "select the main file" and populates the missing fields.

I discovered that our Staging.pm doesn't actually do anything. It is our "URL.pm" that has been heavily edited to provide a staging area. It kicks off adding the items by using the Document method "add_directory", which specifically states that it does not set main.

It wasn't a total waste, though: after having a good look at how it works, I'm thinking of adding some new methods that better reflect how my team uses the staging area (e.g. uploading several files at once and setting main/pos etc).
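
To make the idea concrete, here is a rough sketch of the bookkeeping I have in mind, written in Python purely as an illustration. It is not EPrints code: `StagedDocument` and `build_documents` are made-up names, and "main is the first filename alphabetically" is just an assumed placeholder policy that a real method would replace with whatever my team actually wants.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StagedDocument:
    """Hypothetical stand-in for one document record; not an EPrints class."""
    pos: int
    files: List[str]
    main: Optional[str] = None


def build_documents(batches: List[List[str]]) -> List[StagedDocument]:
    """Assign a 1-based pos to each batch of files and pick a main file.

    Assumed policy: main is the first filename in sorted order,
    or None when the batch is empty.
    """
    docs = []
    for pos, filenames in enumerate(batches, start=1):
        ordered = sorted(filenames)
        main = ordered[0] if ordered else None
        docs.append(StagedDocument(pos=pos, files=ordered, main=main))
    return docs
```

The point is only that main and pos get decided in one place at upload time, rather than being left unset until someone clicks the button in the workflow.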


On Thu, Dec 8, 2022 at 2:28 PM John Salter <J.Salter@leeds.ac.uk> wrote:

Hi James,
Sounds like a curious issue…


What does your 'staging' actually do? Does it e.g. pull the files from a mounted filesystem (staging area)?


If possible, sharing the 'Staging.pm' code might help us diagnose.

Just checking - if the filesize is less than 4GB, does it all work as expected, but when it's over 4GB it's broken?




From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of David R Newman via Eprints-tech
Sent: 08 December 2022 14:05
To: eprints-tech@ecs.soton.ac.uk; James Kerwin <jkerwin2101@gmail.com>
Subject: Re: [EP-tech] Main/filename not set in staging area


Hi James,

Having a look at File.pm and URL.pm in 3.4.4, the only change for File.pm since the initial 3.4 import that relates to how files are uploaded is some code that strips out trailing and leading spaces before the file is saved to disk and its filename saved to the database. There is no such change for URL.pm.

One thing that has been altered about file uploads to help deal with problematic filenames is in perl_lib/EPrints/Plugin/Storage/Local.pm.  This change allowed files to be renamed to <fileid>.bin on disk and EPrints would check this as a secondary filename if it could not find the file under its recorded filename in the database.  The mismatch may have been due to a special character being transposed differently in the database to the filesystem, the previous leading/trailing spaces issue or some other problem.  There is a setting ($c->{generic_filenames}) which is disabled by default but if enabled would save files to disk as <fileid>.bin in future rather than the original filename.  This was first implemented for EPrints 3.4.3.
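
As a rough illustration of that secondary-filename check (a Python sketch only, not the actual Perl in Storage/Local.pm; the function name `resolve_stored_path` and its parameters are assumptions made for this example):

```python
import os
from typing import Optional


def resolve_stored_path(directory: str, recorded_filename: str, fileid: int) -> Optional[str]:
    """Return the on-disk path for a file, or None if it cannot be found.

    Tries the filename recorded in the database first, then falls back
    to the generic <fileid>.bin name in the same directory.
    """
    primary = os.path.join(directory, recorded_filename)
    if os.path.isfile(primary):
        return primary

    fallback = os.path.join(directory, f"{fileid}.bin")
    if os.path.isfile(fallback):
        return fallback

    return None
```

So a file whose recorded name no longer matches what is on disk can still be found, provided it was renamed to its generic name.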

None of what I describe above seems like it would lead to the specific issue you are reporting.  I am assuming that Staging.pm was based on File.pm at some point in the past.  If this diverged pre-3.4 then there may have been other changes between 3.3 and 3.4.


David Newman

On 08/12/2022 12:30, James Kerwin via Eprints-tech wrote:


Hi All,


Once again I've done something to our poor Data Repository. We're on EPrints 3.4.4.


If we upload files the "normal" way they go on with no problem.


If we do an upload of files over 4GB in our "Staging Area" then "Main" is not set in the Documents database table. I'm almost certain that this is something specific to my repository as we have some slightly different files in:




compared to the GitHub repository for EPrints 3.4.4:



We have File.pm, URL.pm and Staging.pm. Some changes have been made to these files to enable the staging area. I've no idea how customised this is compared to other repositories. Other values appear to be getting set and the "filename" is being set in the Files table.


Before I go through a semi-destructive process of trying to set this value, is there something very obvious that I'm missing?




*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/