
[EP-tech] Re: IRStats2 - Family names of creators not displaying with appropriate capital letters



Hi,

Actually I'm getting different results on my machine (using the 
name-parse module), but it's even worse than what you showed me.


So I've extracted some code from the "lingua-name-case" module and this 
seems to work well on your examples.

The patch is here: 
https://github.com/eprints/irstats2/commit/ab76be0f2c72d753c76f5f9f772ec4612c1c7937

The complete file (Stats/Sets.pm) is here: 
https://raw.githubusercontent.com/eprints/irstats2/master/cfg/plugins/EPrints/Plugin/Stats/Sets.pm

Could you give this a try on your data and let me know if it fixed your 
issues?

Thanks,
Seb.
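
For illustration only, here is a minimal sketch of the two problems
discussed in this thread: capitalisation that treats an accented letter
as a word boundary, and particles such as "de" and "la" that should stay
lowercase. It is written in Python, not Perl, and it is not the code
from the commit above. The function names and the PARTICLES list are
assumptions made for this example.

```python
import re

# Hypothetical particle list, assumed for this example; real data may
# need more entries (or fewer).
PARTICLES = {"de", "del", "la", "las", "los", "y"}


def ascii_name_case(name: str) -> str:
    """Reproduce the bug: only A-Z/a-z count as letters, so every
    accented character splits the word and the next letter is upcased."""
    return re.sub(r"[A-Za-z]+", lambda m: m.group(0).capitalize(), name)


def name_case(name: str) -> str:
    """Capitalise each word, treating accented letters as part of the
    word and leaving the particles above in lowercase."""
    def fix(match):
        word = match.group(0).lower()
        return word if word in PARTICLES else word.capitalize()

    # [^\W\d_]+ matches runs of Unicode letters (word characters minus
    # digits and underscore), so "á" no longer ends a word.
    return re.sub(r"[^\W\d_]+", fix, name)


print(ascii_name_case("González carella, María inés"))
print(name_case("González carella, María inés"))
print(name_case("De la rosa, Julia maría"))
```

Note that this sketch lowercases a particle even at the start of a
family name ("de la Rosa"), which matches the form requested below but
may not suit every repository.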




On 28/05/14 15:30, Centro de Documentación wrote:
> Hi Seb,
>
> Thanks for your help.
>
> I've tested graingert's suggestion...
>
> It works well in general, but can't deal with this situation:
>
> (before -without any patch-) Alvaro de lazaro, Martín => (after)
> Alvaro De Lazaro, Martín (-correct form- Alvaro de Lazaro, Martín)
> Gennero de romero, Andrea inés => Gennero De Romero, Andrea Inés
> (Gennero de Romero, Andrea Inés)
> De la rosa, Julia maría => De La Rosa, Julia María  (de la Rosa, Julia María)
>
> Well, I'll see what I can do
>
> Thanks again,
>
>
> On Tue, May 27, 2014 at 10:21 AM, Sebastien Francois
> <sf2 at ecs.soton.ac.uk> wrote:
>> I found a reference to this:
>> http://stackoverflow.com/questions/19396804/capitalizing-strings-which-contain-accented-characters
>>
>> But I don't have much time to look into this at the moment... It seems
>> to be about a regex internal to the module above, which fails to detect
>> proper word boundaries.
>>
>> graingert (on github) was suggesting using another perl module - perhaps
>> worth giving that one a try?
>>
>> Seb.
>>
>> On 23/05/14 18:16, Centro de Documentación wrote:
>>> Seb,
>>>
>>> It works, but it is not always accurate. I have a problem with capital
>>> letters after accent marks in family and given names.
>>>
>>> Before applying the patch "lingua"
>>> Alvarez cema, Juan alberto
>>>
>>> After (Ok)
>>> Alvarez Cema, Juan Alberto
>>>
>>>
>>> Before applying the patch
>>> González carella, María inés
>>>
>>> After (Wrong)
>>> GonzáLez Carella, MaríA InéS
>>>
>>> Should be
>>> González Carella, María Inés
>>>
>>> Any suggestion?
>>>
>>> Regards,
>>>
>>>
>> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
>> *** Archive: http://www.eprints.org/tech.php/
>> *** EPrints community wiki: http://wiki.eprints.org/
>> *** EPrints developers Forum: http://forum.eprints.org/