EPrints Technical Mailing List Archive

Message: #06995



Re: [EP-tech] Antwort: Re: validation on upload field


In GitHub, there is this:

https://github.com/eprints/eprints/blob/3.3/lib/defaultcfg/cfg.d/optional_filename_sanitise.pl

which works alongside an addition to System.pm:

https://github.com/eprints/eprints/blob/69f4c9e581df137b970ce0ab4e08572976162411/perl_lib/EPrints/System.pm#L551-L559

 

That might be useful to know about?

 

Cheers,

John

 

From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of martin.braendle@id.uzh.ch
Sent: 30 November 2017 13:56
To: eprints-tech@ecs.soton.ac.uk
Subject: [EP-tech] Antwort: Re: validation on upload field

 

Hi Alfredo,

Another way, instead of validating, is to transliterate the filenames to ASCII. We extended the sanitise subroutine in perl_lib/EPrints/System.pm like this:

Index: System.pm
===================================================================
--- System.pm (revision 1405)
+++ System.pm (revision 1406)
@@ -25,6 +25,7 @@
 use strict;
 use File::Copy;
 use Digest::MD5;
+use Text::Unidecode;
 
 =item $sys = EPrints::System->new();
 
@@ -540,6 +541,10 @@
  $filepath = Encode::decode_utf8( $filepath )
  if !utf8::is_utf8( $filepath );
 
+ # UZH CHANGE ZORA-542 2016/12/21/mb
+ $filepath = unidecode( $filepath );
+ $filepath =~ s![\x20]!_!g;
+
  # control characters + Win32 restricted
  $filepath =~ s![\x00-\x0f\x7f<>:"\\|?*]!_!g;
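
For anyone who wants to see the effect of the patch outside EPrints, here is a minimal standalone sketch. It assumes Text::Unidecode is installed from CPAN; the helper name transliterate_filename is hypothetical and only mirrors the patched lines, it is not the EPrints sanitise subroutine itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                # this source contains UTF-8 string literals
use Text::Unidecode;     # CPAN module, not part of core Perl

# Hypothetical helper that mirrors the patched steps shown in the diff.
sub transliterate_filename
{
	my( $filepath ) = @_;

	# transliterate to plain ASCII
	$filepath = unidecode( $filepath );
	# replace spaces with underscores
	$filepath =~ s![\x20]!_!g;
	# control characters + Win32 restricted, as in the stock code
	$filepath =~ s![\x00-\x0f\x7f<>:"\\|?*]!_!g;

	return $filepath;
}

print transliterate_filename( "Zürich Bericht.pdf" ), "\n";
```

Note that Text::Unidecode is not language-aware: a German "ü" is reduced to "u" rather than "ue", which may or may not be what a repository wants.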



Best regards,

Martin

--
Dr. Martin Brändle
Zentrale Informatik
Universität Zürich
Stampfenbachstr. 73
CH-8006 Zürich

mail: martin.braendle@id.uzh.ch
phone: +41 44 63 56705
fax: +41 44 63 54505
http://www.zi.uzh.ch


From: th.lauke@arcor.de
To: eprints-tech@ecs.soton.ac.uk
Date: 30.11.2017 13:52
Subject: Re: [EP-tech] validation on upload field
Sent by: eprints-tech-bounces@ecs.soton.ac.uk





Hi Alfredo,

we solved a similar feature request with a document_validate.pl that is either repository-specific (i.e. in Eprints/archives/repoID/cfg/cfg.d/) or server-specific (i.e. in Eprints/site_lib/cfg.d/):

$c->{validate_document} = sub
{
       my( $document, $repository, $for_archive ) = @_;

       my @problems = ();

       my $xml = $repository->xml();

       # default checks
# :
       # site-specific checks

       # check for a proper filename: our Tivoli backup only ingests ASCII filenames without blanks
       # print STDERR "main: ", $document->value( "main" )," escaped: ",URI::Escape::uri_escape_utf8($document->value( "main" ), "^A-Za-z0-9\-\._~\/");
       my $doc_name_uri = URI::Escape::uri_escape_utf8($document->value( "main" ), "^A-Za-z0-9\-\._~\/");
       if( $document->value( "main" ) ne $doc_name_uri )
       {
               my $fieldname = $repository->make_element( "span", class=>"ep_problem_field:documents" );
               $fieldname->appendChild( $document->dataset->render_name( $repository ) );

               my $prob = $repository->make_doc_fragment;
               $prob->appendChild( $repository->html_phrase( "validate:bad_filename", fieldname=>$fieldname ) );
               $prob->appendChild( $repository->make_text( $doc_name_uri ) );

               $prob->appendChild( $repository->html_phrase( "validate:original_filename") );
               $prob->appendChild( $repository->make_text( $document->value( "main" ) ) );

               push @problems, $prob;
       }


       return( @problems );
};

After adding the two new phrases
<epp:phrase id="validate:bad_filename">Please replace non-ASCII characters (e.g. 'äöü') or blanks in the name of uploaded <epc:pin name="fieldname" /> appropriately to simplify future handling!<br/>Following filename prepared for repository<br/></epp:phrase>
<epp:phrase id="validate:original_filename"><br/>is different to original one :(<br/></epp:phrase>
to an appropriate .../lang/en/phrases/... file, it should work :-)
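
The filename test at the heart of the validation can also be tried on its own, outside EPrints. The following is only a sketch: it assumes URI::Escape (from the URI distribution) is installed, and the helper name filename_is_clean is hypothetical. It uses the same safe-character set as the check above: a name is accepted only if escaping leaves it unchanged.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                # this source contains UTF-8 string literals
use URI::Escape ();

binmode( STDOUT, ":encoding(UTF-8)" );

# Hypothetical helper: true if the name contains only characters from the
# safe set, i.e. if uri_escape_utf8 returns it unchanged.
sub filename_is_clean
{
	my( $name ) = @_;

	my $escaped = URI::Escape::uri_escape_utf8( $name, "^A-Za-z0-9\-\._~\/" );

	return $name eq $escaped;
}

for my $name ( "report_2017.pdf", "my report.pdf", "Bär.pdf" )
{
	printf "%-16s %s\n", $name, filename_is_clean( $name ) ? "ok" : "rejected";
}
```

Blanks and non-ASCII characters are percent-escaped, so such names differ from their escaped form and are reported as problems, while "/" is left alone so that files in subdirectories still pass.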

Hth
Thomas

*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
*** EPrints developers Forum: http://forum.eprints.org/