EPrints Technical Mailing List Archive

See the EPrints wiki for instructions on how to join this mailing list and related information.

Message: #09649



Re: [EP-tech] 0 byte file uploads


CAUTION: This e-mail originated outside the University of Southampton.

Dear all,

 

thanks for your pointers (John's hypothesis about cloud storage is an interesting one). I'll follow them up and implement the suggestions.

Another hypothesis that came up during our Scrum meeting was that the upload problems occur more frequently since our university switched to another VPN software.

We run EPrints 3.3.16. We disabled upload by URL years ago because of known problems.

Kind regards,

 

Martin

 

--

Dr. Martin Brändle
Zentrale Informatik
Universität Zürich
Pfingstweidstrasse 60B
CH-8005 Zürich

 

 

From: Liam Green-Hughes <L.E.Green-Hughes@kent.ac.uk>
Date: Wednesday, 28 February 2024 at 10:59
To: eprints-tech@ecs.soton.ac.uk <eprints-tech@ecs.soton.ac.uk>, Martin Brändle <martin.braendle@uzh.ch>
Subject: Re: [EP-tech] 0 byte file uploads

Hi all,

 

We have seen this issue too. I believe it is caused by a file entry being created in the database while nothing is copied to the actual document filesystem. If you create a file of zero length (e.g. with the touch command) and try to upload it to an EPrints instance, you should be able to see this in action (from memory). In our repository I added a warning message to document_validate.pl. I'm not sure how people end up with zero-length files; it could be something to do with PDF file generation, or EPrints getting upset at invalid characters in filenames (it doesn't like apostrophes much).
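For anyone wanting a test file for the reproduction described above, here is a minimal sketch in Python (not EPrints' own Perl). The function names and the file name `empty.pdf` are made up for illustration; the file would still have to be uploaded by hand through the normal deposit workflow.

```python
import tempfile
from pathlib import Path


def make_empty_file(directory: Path, name: str = "empty.pdf") -> Path:
    """Create a zero-length file, as `touch` does for a path that does not exist yet."""
    path = directory / name
    path.touch()
    return path


def is_zero_length(path: Path) -> bool:
    """Return True if the path is a file that contains no bytes."""
    return path.is_file() and path.stat().st_size == 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        test_file = make_empty_file(Path(tmp))
        print(test_file.name, is_zero_length(test_file))
```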

 

Thanks

Liam

 

 


From: eprints-tech-request@ecs.soton.ac.uk <eprints-tech-request@ecs.soton.ac.uk> on behalf of John Salter <J.Salter@leeds.ac.uk>
Sent: 28 February 2024 09:20
To: eprints-tech@ecs.soton.ac.uk <eprints-tech@ecs.soton.ac.uk>; Martin Brändle <martin.braendle@uzh.ch>
Subject: RE: [EP-tech] 0 byte file uploads

 


Hi Martin, David,

I have observed this too in 3.3.16 – and written a cronjob to alert me to new cases in case there was a pattern.
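A scan of this kind could look roughly like the sketch below. It is written in Python purely for illustration and is not the actual cronjob; it assumes the document files live under a single directory tree, and `find_zero_byte_files` is a hypothetical name. Run from cron, it prints any zero-byte file modified in the last 24 hours.

```python
import os
import sys
import time
from pathlib import Path
from typing import Iterator, Optional


def find_zero_byte_files(root: Path, newer_than: Optional[float] = None) -> Iterator[Path]:
    """Yield files under `root` whose size is 0 bytes.

    If `newer_than` (seconds since the epoch) is given, files last
    modified before that time are skipped.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
            if info.st_size != 0:
                continue
            if newer_than is not None and info.st_mtime < newer_than:
                continue
            yield path


if __name__ == "__main__":
    # Assumption: the storage root is passed as the first argument.
    root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    one_day_ago = time.time() - 24 * 60 * 60
    for path in find_zero_byte_files(root, newer_than=one_day_ago):
        print(path)
```

Since cron usually mails any output to the job's owner, printing only the affected paths keeps the job silent when there is nothing to report.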

 

My hunch was that it related to people uploading from cloud storage – where a file appears as though it’s local to the user’s computer, but the files aren’t actually cached locally. As yet, I haven’t managed to get a proper failing test case.

 

I have put some warnings in place as part of my document_validate to catch these – although it sounds like these will not be needed when we upgrade.

 

Cheers,

John

 

From: eprints-tech-request@ecs.soton.ac.uk <eprints-tech-request@ecs.soton.ac.uk> On Behalf Of David R Newman
Sent: Wednesday, February 28, 2024 9:06 AM
To: eprints-tech@ecs.soton.ac.uk; Martin Brändle <martin.braendle@uzh.ch>
Subject: Re: [EP-tech] 0 byte file uploads

 

Hi Martin,

I am aware of this issue. In many cases we think it is down to how the JavaScript works in the uploader, mainly, we believe, with drag-n-drop. My colleague has rewritten this, as modern web browsers no longer need the JavaScript currently used, and we intend to add it to the next major release of EPrints.

I did implement something to warn if there is a file that reports as zero bytes (i.e. the document file's filesize is 0). It is in the second commit for:

https://github.com/eprints/eprints3.4/issues/189 (changeset: https://github.com/eprints/eprints3.4/commit/f03b80da02b319d59705144ecccdc933b91c99e5)

This GitHub issue was admittedly originally focussed on what I believed was another reason behind zero-byte files: a user trying to upload from a URL they had access to but the EPrints repository did not (e.g. a private IP, or a site that required a password or similar authentication). However, the second commit was solely focussed on putting up a warning message after the upload if it failed to complete successfully. This was implemented in EPrints 3.4.4. Which version of EPrints are you running? It would be useful to know if it works as expected for you, as this is such an intermittent issue that it has been difficult to test. However, it should warn if the filesize of one of a document's files is 0. Unfortunately, it may not do this as soon as the upload fails, but at the very least the warning should appear in the same place as non-field-specific warnings (e.g. a bespoke validation that requires field A or field B to be set), so it should be picked up before the user clicks deposit, or otherwise during the review process.
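The logic of such a warning can be sketched as follows. This is an illustration in Python, not the code from the commit above and not EPrints' Perl API; the dictionary keys `filename` and `filesize` and the function name are assumptions made for the example.

```python
from typing import Dict, List


def zero_byte_warnings(files: List[Dict[str, object]]) -> List[str]:
    """Return one warning per file whose recorded filesize is exactly 0."""
    warnings = []
    for entry in files:
        if entry.get("filesize") == 0:
            warnings.append(
                f"File '{entry.get('filename')}' is 0 bytes long; "
                "the upload may not have completed."
            )
    return warnings


if __name__ == "__main__":
    document_files = [
        {"filename": "thesis.pdf", "filesize": 482113},
        {"filename": "appendix.pdf", "filesize": 0},
    ]
    for message in zero_byte_warnings(document_files):
        print(message)
```

Returning warnings rather than raising an error mirrors the behaviour described above: the depositor is alerted but not blocked at the moment of upload.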

Regards

David Newman

On 28/02/2024 8:40 am, Martin Brändle wrote:

Dear all,

 

in our repository, we have found a few PDFs that are 0 bytes long (actually, it’s a 0.05 per mille problem).

We are not sure how this has happened – we don’t think that there are problems with the drive (it’s mirrored), rather we think that the problem originates from the user’s side, e.g. that something happened at upload or the file was already faulty on the user’s drive.

 

Indeed, as we have tested, it is possible to upload a 0-byte file to EPrints without any problem.

 

However, I think this should be checked by the file uploader and a warning should be issued to the user. This seems not to be implemented yet.

 

Kind regards,

 

Martin

 

--

Dr. Martin Brändle
Zentrale Informatik
Universität Zürich
Pfingstweidstrasse 60B
CH-8005 Zürich

 

 

*** EPrints community wiki: https://wiki.eprints.org/