
[EP-tech] Re: RIS plugin problems (utf8 and journal title)




On 13/11/15 13:29, George Mamalakis wrote:
> OK! Which means that it is only available for the roman alphabet? That's
> odd!

From the documentation:

The characters allowed in the reference ID fields can be in the set "0" 
through "9," or "A" through "Z."
The characters allowed in all other fields can be in the set from 
"space" (character 32) to character 255 in the ANSI Character Set. Note, 
however, that the asterisk (character 42) is not allowed in the author, 
keywords or periodical name fields.

Also note:

Each tag and its contents must be on a separate line, preceded by a 
carriage return/line feed (ANSI 13 10).
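
The character rules quoted above can be turned into a small check. The following is a minimal sketch, not code from the RIS plugin: the helper name `ris_field_problems` is hypothetical, and the mapping from "reference ID", "author", "keywords" and "periodical name" to concrete tags (ID, AU/A1/A2/A3, KW, JF/JO/JA/J1/J2) is an assumption, since the quoted text names the field kinds rather than the tags. It expects values that have already been decoded to Perl character strings.

```perl
use strict;
use warnings;

# Assumed tag groupings: the quoted documentation names the kinds of
# field (author, keywords, periodical name), not the tags themselves.
my %NO_ASTERISK = map { $_ => 1 } qw(AU A1 A2 A3 KW JF JO JA J1 J2);

# Hypothetical helper: return a list of problems for one decoded
# tag/value pair, following the character rules quoted above.
sub ris_field_problems {
    my ( $tag, $value ) = @_;
    my @problems;
    if ( $tag eq 'ID' ) {
        # Reference ID: only "0"-"9" and "A"-"Z".
        push @problems, "ID may only contain 0-9 and A-Z"
            if $value =~ /[^0-9A-Z]/;
    }
    else {
        # All other fields: characters 32 to 255 only.
        push @problems, "character outside the range 32-255"
            if $value =~ /[^\x{20}-\x{FF}]/;
        # Asterisk (character 42) is excluded from some fields.
        push @problems, "asterisk not allowed in $tag"
            if $NO_ASTERISK{$tag} && $value =~ /\*/;
    }
    return @problems;
}
```

Under these assumptions an author value such as "Caf\x{E9}" passes, because character 233 lies inside the 32 to 255 range, while a Greek name fails, because its code points are above 255. So the documented range is wider than plain ASCII but still excludes non-Latin scripts.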


>
> Anyway, since most sites allow RIS exports for utf8 encoded characters,
> it wouldn't harm if the import plugin supported them as well.
>
> Thanks for the info though!
>
> On 13/11/2015 02:13 PM, Ian Stuart wrote:
>> RIS format is, by specification, ASCII not UTF8
>>
>>
>> On 13/11/15 10:12, George Mamalakis wrote:
>>> Hello everybody,
>>>
>>> I tried to use the RIS import plugin from:
>>> http://files.eprints.org/741/. The plugin wouldn't accept the
>>> publication field from Google scholar exported entries, nor would it
>>> allow UTF8 encoded strings to be imported (both problems have been
>>> spotted from the web import functionality). So, I tried to resolve them
>>> myself, and I found the following corrections that seem to solve the
>>> problems.
>>>
>>>
>>> diff -r d5f969263300 perl_lib/EPrints/Plugin/Import/RIS.pm
>>> --- a/perl_lib/EPrints/Plugin/Import/RIS.pm     Fri Nov 06 11:22:06 2015 +0200
>>> +++ b/perl_lib/EPrints/Plugin/Import/RIS.pm     Fri Nov 13 11:57:54 2015 +0200
>>> @@ -34,7 +34,6 @@
>>>          my( $plugin, %opts ) = @_;
>>>          my @ids;
>>>          my $fh = $opts{fh}; # File handle
>>> +  binmode( $fh, ":utf8" );
>>>          my @file = <$fh>;
>>>          my ( %record, @records ) = ();
>>>          my $lastkey = undef;
>>> @@ -237,9 +236,6 @@
>>>          # Publication title
>>>          &_join_multiple_field_data($epdata, $entry, ['T2', 'JF'], 'publication', ', ');
>>> +  &_join_multiple_field_data($epdata, $entry, ['T2', 'JO'], 'publication', ', ');
>>>          # Series title
>>>          &_join_field_data($epdata, $entry, 'T3', 'series', ', ');
>>>
>>> What I did was change the binmode of the file handle (borrowed from the
>>> BibTeX import plugin) so that it accepts UTF-8 encoded strings, and add
>>> one more entry for the publication field (the journal title, if I'm not
>>> mistaken) so that it is also read from JO, not only from JF (JO is the
>>> tag that Scholar returns).
>>>
>>> I am sending these changes to:
>>>
>>> a) help anyone having the same problems with the specific plugin,
>>> b) ask if these corrections are correct :), and
>>> c) also to ask what is the proper procedure of reporting these "bugs" so
>>> they'll be corrected permanently (eg. contact the maintainer directly,
>>> indirectly, what?).
>>>
>>> Thanks all for your answers in advance,
>>>
>>> George.
>>>
>>>
>
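
On the encoding half of the patch quoted above: Perl's `:utf8` layer marks input as UTF-8 without validating it, whereas `:encoding(UTF-8)` checks it. A more cautious variant decodes each raw line strictly and falls back to a single-byte encoding when the bytes are not valid UTF-8. This is a minimal sketch outside the plugin. The helper name `decode_ris_line` is hypothetical, and treating ISO-8859-1 as the fallback for "ANSI" input is an assumption.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper: decode one raw RIS line (bytes) to characters.
# Strict UTF-8 is tried first; ISO-8859-1 is the assumed fallback for
# files written in a single-byte "ANSI" encoding.
sub decode_ris_line {
    my ($bytes) = @_;

    # decode() with a CHECK value may modify its argument, so work on a copy.
    my $copy = $bytes;
    my $text = eval { decode( 'UTF-8', $copy, FB_CROAK ) };
    return $text if defined $text;

    # Not valid UTF-8: every byte maps directly to code points 0-255.
    return decode( 'ISO-8859-1', $bytes );
}
```

With this approach the file handle would be read as raw bytes and each line passed through the helper, so a file that follows the documented character set and a UTF-8 export both end up as character strings.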
>
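
On the journal-title half of the patch: the behaviour of the plugin's `_join_multiple_field_data` is not shown in this thread, so the following is only a self-contained sketch of one alternative, which is to take the first tag among T2, JF and JO that has a value. The helper `first_present` is hypothetical, and the entry layout (tag mapped to an array reference of values) is an assumption made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the plugin: return the first
# non-blank value found under the given tags, in the order listed.
# Assumes $entry maps each tag to an array reference of values.
sub first_present {
    my ( $entry, $tags ) = @_;
    for my $tag (@$tags) {
        for my $value ( @{ $entry->{$tag} || [] } ) {
            return $value if defined $value && $value =~ /\S/;
        }
    }
    return;    # nothing usable found
}

# Example record shaped like a Google Scholar export, which uses JO.
my %entry = ( TY => ['JOUR'], JO => ['Journal of Examples'] );
my $publication = first_present( \%entry, [ 'T2', 'JF', 'JO' ] );
print "$publication\n";
```

Listing all three tags in one call avoids handling T2 twice, which the two separate calls in the quoted diff might do depending on how the plugin's helper treats a field that is already set.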

-- 

Ian Stuart.
Developer: ORI, RJ-Broker, and OpenDepot.org
Bibliographics and Multimedia Service Delivery team,
EDINA,
The University of Edinburgh.

http://edina.ac.uk/
