EPrints Technical Mailing List Archive

Message: #08383



Re: [EP-tech] Antwort: Re: Antwort: Re: perl module update introduced some trouble with entities


The entities.dtd was a bit of a fiddle to make it possible to use named entities such as &pound; and &eacute;, to make people's lives a little easier when writing the templates, which are XHTML.

The obvious other approach would be to preprocess them with something like this

s/&([A-Za-z][A-Za-z0-9]*);/expandentity($1)/ge

but that would break with the wisdom that you should never parse XML with a regular expression.
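
For illustration only, here is a minimal sketch of that preprocessing idea in Python rather than Perl. It is not what EPrints does. The function names are hypothetical, and it assumes the HTML 4 entity table from Python's standard library is an acceptable stand-in for entities.dtd. Unknown names and the five entities XML predefines are left untouched.

```python
import re
from html.entities import name2codepoint

# The five entities every XML parser already knows; never rewrite these.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# A named entity reference: '&', a name, ';'.
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def expand_entity(match):
    """Turn one named entity into a decimal character reference."""
    name = match.group(1)
    if name in XML_PREDEFINED or name not in name2codepoint:
        return match.group(0)  # leave predefined or unknown names as they are
    return "&#%d;" % name2codepoint[name]


def expand_named_entities(text):
    """Rewrite every known HTML named entity in text as &#NNN;."""
    return ENTITY_RE.sub(expand_entity, text)


print(expand_named_entities("Z&uuml;rich &amp; &pound;5"))
```

Because this is a blind textual substitution, it would also rewrite matches inside comments and CDATA sections, which is exactly the kind of problem the regular-expression warning is about.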


On 01/12/2020 14:37, David R Newman via Eprints-tech wrote:

Hi all,

I have been blind.  EPrints (at least the latest 3.4) already has an entities.dtd in lib/, and it is already referenced in most of the standard XML template, phrase and similar files.  I think the problem is that it is not linked in properly in most if not all cases.  So I will investigate how that can be done better, to avoid undefined entity errors.

Regards

David Newman

On 01/12/2020 14:26, martin.braendle@uzh.ch wrote:
CAUTION: This e-mail originated outside the University of Southampton.

The entities file we have here has the following preamble

<!-- Portions (C) International Organization for Standardization 1986
     Permission to copy in any form is granted for use with
     conforming SGML systems and applications as defined in
     ISO 8879, provided this notice is included in all copies.
-->

and contains more than 500 lines.

It most probably stems from here: https://www.w3.org/TR/REC-html40-971218/sgml/entities.html

Kind regards,

Martin

--
Dr. Martin Brändle
Zentrale Informatik
Universität Zürich
Stampfenbachstr. 73
CH-8006 Zürich

From: "David R Newman" <drn@ecs.soton.ac.uk>
To: eprints-tech@ecs.soton.ac.uk
Cc: th.lauke@arcor.de, martin.braendle@uzh.ch
Date: 01/12/2020 15:19
Subject: Re: Antwort: Re: [EP-tech] perl module update introduced some trouble with entities





Hi all,

EPrints 3.4 has had the patch applied for issues with newer versions of LibXML, and EPrints 3.4.2 onwards should have this particular issue resolved.  Regarding special characters, I will look into producing (or hopefully finding) an entities.dtd covering all the special characters that EPrints repositories may want to use, and then update the standard template and phrase files to use it.  In fact, it is probably worth doing this for citation and workflow files as well.  I have created an issue for EPrints 3.4 to address this:

https://github.com/eprints/eprints3.4/issues/112

Regards

David Newman

On 01/12/2020 13:35, martin.braendle@uzh.ch wrote:

    Hi Thomas,

    There should be an entities.dtd file in [eprints_root]/lib/. Maybe it is missing, or entries are missing from it?


    Also, a phrase file should reference it in the


    <!DOCTYPE phrases SYSTEM "entities.dtd">


    declaration right at the beginning, after the XML declaration.


    Kind regards,


    Martin


    --
    Dr. Martin Brändle
    Zentrale Informatik
    Universität Zürich
    Stampfenbachstr. 73
    CH-8006 Zürich



    From: "David R Newman via Eprints-tech" <eprints-tech@ecs.soton.ac.uk>
    To: <eprints-tech@ecs.soton.ac.uk>, <th.lauke@arcor.de>
    Date: 01/12/2020 14:24
    Subject: Re: [EP-tech] perl module update introduced some trouble with entities
    Sent by: <eprints-tech-bounces@ecs.soton.ac.uk>





    Hi Thomas,

    Named HTML entities are not supported in XML; you need to use the decimal character reference for &auml;, which is &#228;.

    This is the same as needing to replace things like &copy; with their equivalent decimal character references. (&amp; is one of the five entities predefined in XML, so it can stay as it is.)
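
As a quick illustration of the point above, the following sketch uses Python's standard-library XML parser, which is an assumption made for the demo: EPrints itself parses with LibXML from Perl. The helper name and sample strings are hypothetical. With no DTD declaring it, the named entity is rejected, while the decimal reference and the predefined &amp; parse cleanly.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(snippet):
    """Return True if snippet is accepted by a plain XML parser (no DTD)."""
    try:
        ET.fromstring(snippet)
        return True
    except ET.ParseError:
        return False


named = "<phrase>Universit&auml;t</phrase>"            # HTML named entity
numeric = "<phrase>Universit&#228;t</phrase>"          # decimal character reference
predefined = "<phrase>Terms &amp; conditions</phrase>"  # predefined in XML

for label, snippet in [("named", named), ("numeric", numeric), ("predefined", predefined)]:
    print(label, parses_as_xml(snippet))
```

The named form only becomes valid once the document declares the entity, for example through a DOCTYPE that pulls in an entities.dtd.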

    Regards

    David Newman
     

    On 01/12/2020 12:00, th.lauke--- via Eprints-tech wrote:


    Hi all,

    Any hints on where to start digging for the cause of the following error?
    Failed to parse XML file: /usr/share/eprints/site_lib/lang/en/phrases/modified.xml: Entity: line 226: parser error : Entity 'auml' not defined

    This error occurs after updating some Perl modules ... :(

    Is the 'bad' module already known?
    Which is more effective: fixing the module (version) or the phrase file?

    Thanks in advance for any ideas
    Thomas

    *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
    *** Archive:
    http://www.eprints.org/tech.php/
    *** EPrints community wiki:
    http://wiki.eprints.org/




-- 
Christopher Gutteridge <totl@soton.ac.uk> 
You should read our team blog at http://blog.soton.ac.uk/webteam/