EPrints Technical Mailing List Archive

Message: #02735



[EP-tech] Re: UTF-8 issues on BibTeX import?


Reading strings?

Have you tried

  $count = utf8::upgrade($name);

See http://perldoc.perl.org/utf8.html

(I tried all sorts of things over the years... and I don't think I've been consistent)
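
A note on the suggestion above, as a minimal sketch rather than a confirmed fix for the import problem: utf8::upgrade only changes how Perl stores the string internally and leaves its characters as they were, whereas utf8::decode interprets the octets as UTF-8. The sample bytes and variable names below are assumptions made for illustration, not values taken from EPrints.

```perl
use strict;
use warnings;

# Assumed input: the octets of "Péron" as read from a UTF-8 encoded file
# without any decoding layer (6 octets, because "é" takes two).
my $name = "P\xC3\xA9ron";

# utf8::upgrade changes only the internal storage format. The string
# still holds the same 6 characters, one per original octet.
my $upgraded = $name;
my $octets   = utf8::upgrade($upgraded);
my $len_upgraded = length $upgraded;

# utf8::decode treats the octets as UTF-8 and turns them into characters,
# so the two-octet sequence becomes the single character U+00E9.
my $decoded = $name;
my $ok      = utf8::decode($decoded);
my $len_decoded = length $decoded;

printf "upgraded: %d chars, decoded: %d chars\n", $len_upgraded, $len_decoded;
```

If the imported names are arriving as undecoded octets, utf8::decode (or Encode::decode) is probably the call to experiment with first; whether that is the case here is not established in this thread.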

On 10/03/14 15:31, Andrew Beeken wrote:
Interesting!

Looking into this a bit further, the issue seems to be around the keys
that records take with them out of, say, a Scopus export. For example, a
record may be given a key of Péron20141; note the accent - this is the
part that's causing the issue and is probably understandable if the key is
conforming to specific standards. With this in mind, is there a workaround?
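
One possible workaround, sketched under the assumption that only the citation key (not the field values) upsets the importer: rewrite the keys to plain ASCII before handing the file to the import. The helper names ascii_key and fix_entry_keys are hypothetical and not part of EPrints; Unicode::Normalize is a core Perl module.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);

# Hypothetical helper: reduce a citation key to printable ASCII.
sub ascii_key {
    my ($key) = @_;
    my $flat = NFD($key);         # split "é" into "e" + combining accent
    $flat =~ s/\p{Mn}//g;         # drop the combining marks
    $flat =~ s/[^\x21-\x7E]//g;   # drop anything else outside printable ASCII
    return $flat;
}

# Hypothetical helper: rewrite only the key in entry headers such as
# "@article{Péron20141," and leave the field values untouched.
sub fix_entry_keys {
    my ($bibtex) = @_;
    $bibtex =~ s{^(\s*\@\w+\s*\{\s*)([^,\s]+)}
                {my ($head, $key) = ($1, $2); $head . ascii_key($key)}gme;
    return $bibtex;
}

# Assumed sample entry; \x{E9} is "é" as a decoded character.
my $entry = "\@article{P\x{E9}ron20141,\n  author = {P\x{E9}ron, A.},\n}\n";
my $fixed = fix_entry_keys($entry);

binmode(STDOUT, ":encoding(UTF-8)");
print $fixed;
```

The input is assumed to be decoded character data already. Stripping accents can make two different keys collide, so a real script would need to check for duplicates.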

On 10/03/2014 11:24, "Ian Stuart" <Ian.Stuart@ed.ac.uk> wrote:

On 10/03/14 11:02, Andrew Beeken wrote:
Me again!

Another issue that has been flagged up by our admin users is that a
BibTeX import will fall over when it encounters accented characters
in an author name. I've already flagged a problem with UTF-8 encoding
in output in another email and I'm wondering if there is a similar
fix here?

Something to consider (I fell over this) is that web servers have a
tendency to not actually send UTF-8, even when you ask them to....

I have a script that wouldn't render the name of some Dutch university
correctly..... but when I added in the name of a Chinese one, it was fine.

It was a blinkin' NIGHTMARE to figure out.... and in the end I bypassed
the EPrints output, and just "printed" directly, with the line

    binmode(STDOUT, ":utf8");

in my code.
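
To make the effect of that line visible, here is a small sketch that writes the same string to in-memory handles with and without an encoding layer and compares the bytes. The helper bytes_written and the sample string are assumptions for illustration; ":encoding(UTF-8)" is used as the stricter sibling of ":utf8".

```perl
use strict;
use warnings;

# Hypothetical helper: print $text to an in-memory handle opened with the
# given layer ('' for none) and return the bytes that were written.
sub bytes_written {
    my ($text, $layer) = @_;
    my $buffer = '';
    open(my $fh, ">$layer", \$buffer)
        or die "cannot open in-memory handle: $!";
    print {$fh} $text;
    close($fh);
    return $buffer;
}

# Illustrative character string whose only non-ASCII character is U+00E9.
my $name = "Universit\x{E9}";

my $plain = bytes_written($name, '');                   # "é" as one byte, 0xE9
my $utf8  = bytes_written($name, ':encoding(UTF-8)');   # "é" as 0xC3 0xA9

printf "no layer: %d bytes, UTF-8 layer: %d bytes\n",
    length $plain, length $utf8;

# The same layer applied to STDOUT, as in the message above.
binmode(STDOUT, ":encoding(UTF-8)");
print "$name\n";
```

Without a layer, Perl writes characters below 0x100 as single bytes, but a string that also contains a character above 0xFF is written out as UTF-8 (with a "Wide character" warning). That may be why adding a Chinese name made the other name render correctly, though the thread does not confirm it.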


--

Ian Stuart.
Developer: ORI, RJ-Broker, and OpenDepot.org
Bibliographics and Multimedia Service Delivery team,
EDINA,
The University of Edinburgh.

http://edina.ac.uk/
