EPrints Technical Mailing List Archive
Message: #10238
Re: [EP-tech] Alphabetically sort names with special characters
- To: eprints-tech@ecs.soton.ac.uk
- Subject: Re: [EP-tech] Alphabetically sort names with special characters
- From: Andrew M <eprints-tech@unitedgames.co.uk>
- Date: Wed, 10 Sep 2025 10:08:44 +0100
CAUTION: This e-mail originated outside the University of Southampton.

You should simply be able to use the sort_values method when the list is presented. So I'm assuming, off the top of my head, it's gonna be a layout issue in one of the XML documents, and that's it. Whether that is ACTUALLY the case or not will require me to see the problem you're seeing on my own EPrints, fix it on my own EPrints, and then share what the fix was.

So let's first look at an example of where the problem is occurring. You said it was in the second level of browse views. Fantastic! I've found an example here:

https://arcomabstracts.com/view/year/2025.html

Okay. I will replicate the same issue on my EPrints, then look for a fix, and write again shortly. I suspect the layout XML is just cycling through the eprints and displaying them, without first sorting the eprints by author using sort_values ( https://wiki.eprints.org/w/API:EPrints/MetaField#sort_values )...and we'll soon see if that's the case.

That said, I have perhaps not read your email fully enough. Are you saying you think it's a matter of what you have, or have not, copied over into your custom browse view?
Do you have a copy of your custom browse view to share? I've just read up about gists here:

https://docs.github.com/en/get-started/writing-on-github/editing-and-sharing-content-with-gists/creating-gists

...and these seem a good way to share snippets of code as needed for our discussions.

Yours,
Andrew.

Quoting Will Hughes <w.p.hughes@reading.ac.uk>:
Andrew

Oops, typo in the email: "bewildered by you..." should have been "bewildered, but you...". Apologies for that!

OK, I am beginning to understand but I still struggle to really see through this. In the default views.pl, which provides the sorting I would expect, I see this:

    {
        id => "creators",
        allow_null => 0,
        hideempty => 1,
        menus => [
            {
                fields => [ "creators_name" ],
                new_column_at => [1, 1],
                mode => "sections",
                open_first_section => 1,
                group_range_function => "EPrints::Update::Views::cluster_ranges_30",
                grouping_function => "group_by_a_to_z_hideempty",
            },
        ],
        order => "-date/title",
        variations => [
            "type",
            "DEFAULT",
        ],
    },

But, in my custom browsing, by document type, I have this:

    {
        id=>"doctype", # Browse by type of document
        menus => [
            {
                fields => [ "type" ],
            },
        ],
        order => "creators_name/date",
        variations => [ "creators_name;first_letter", "type", "DEFAULT" ],
    };

So, the problem comes when my results are at a second level, rather than at a primary level of browse results. I struggle to figure out which part of the code I should be copying into my custom browse view.

Best wishes
Will

-----Original Message-----
From: Will Hughes
Sent: 10 September 2025 08:34
To: eprints-tech@unitedgames.co.uk
Subject: RE: [EP-tech] Alphabetically sort names with special characters

Andrew

Thank you so much for digging around and exploring this. I have been bewildered by you are helping me to make sense of it. Your question about where I come across the problem made me think.
It is interesting, as it mostly happens in Browse views, but now I see that sometimes the sort is as I desire, and sometimes not - so maybe it is merely a question of how the browse view is configured:

* Browse by author gives me pages (and the lists below them) "correctly" sorted, like this:
  A | Á-Å | B | C | Ç | D | E | F | G | H | I | İ | J | K | L | M | N | O | Ó-Ø | P | Q | R | S | Š-Ş | T | U | Ü | V | W | X | Y | Z
* Browse by year gives me pages (and lists below them) sorted like this:
  A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | Ç | Ö | Ş
* I also have a custom Browse by Document Type, which sorts like this:
  A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | Ç | Ó | Ö | Š

So, I am going to dig around in the customised views.pl files and compare them to vanilla versions - it may simply be a question of how the order is defined.

Best wishes
Will

-----Original Message-----
From: eprints-tech-request@ecs.soton.ac.uk <eprints-tech-request@ecs.soton.ac.uk> On Behalf Of Andrew M
Sent: 10 September 2025 08:06
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] Alphabetically sort names with special characters

Yes. It seems there is support for it already in MetaFields via the sort_values method.
https://wiki.eprints.org/w/API:EPrints/MetaField#sort_values

=======
=pod

=item $out_list = $field->sort_values( $in_list, $langid )

Sorts the in_list into order, based on the "order values" of the values in the in_list. Assumes that the values are not a list of multiple values. [ [], [], [] ], but rather a list of single values.

=cut
=======

Yours,
Andrew.

Quoting Andrew M <eprints-tech@unitedgames.co.uk>:

Quoting Andrew M <eprints-tech@unitedgames.co.uk>:

Since the script was getting butchered in email form, I've thrown it online here:

https://www.andrewjamesmehta.com/files/eprints/UnicodeSortExample.pm

However, the main part was:

    use English qw( -no_match_vars );    # provides @ARG as an alias for @_
    use Unicode::Collate;

    sub unicode_sort
    {
        my $self = shift;
        # Collation level 1 compares base letters only, so case and diacritics are ignored.
        my @configuration_to_ignore_case_and_diacritics = (level => 1);
        return Unicode::Collate->new(@configuration_to_ignore_case_and_diacritics)->sort(@ARG);
    }

As written about in the Perl Unicode cookbook:
https://perldoc.perl.org/perlunicook#%E2%84%9E-36:-Case-and-accent-insensitive-Unicode-sort

This is Perl, and not EPrints of course, so the next stage is to figure out where such improved sorts need to be used in EPrints, or if there is already an option in EPrints for them.

There was no need for the "our" before $a and $b in that code example. Apologies. Was messing around with different things and left that in.

Quoting Andrew M <eprints-tech@unitedgames.co.uk>:

Was intrigued by this, and had a moment of spare time, so wrote a short script that attempts three different sorts:

Default sort,
Default unicode case folding case-insensitive sort,

...and since the second made no difference, I hit the online cookbook...

https://perldoc.perl.org/perlunicook#%E2%84%9E-36:-Case-and-accent-insensitive-Unicode-sort

and learned about the default unicode case-and-accent-insensitive sort.

So now we know how to do the correct kind of sort in Perl....next we'd need to know where in the EPrints codebase to apply the fix.
Where are you seeing the wrong order appearing? In what context do you wish the order to be changed?

Of course there may also be a simple EPrints option that switches to more correct ordering, so I probably should have checked the EPrints wiki before looking up the Perl solution.

Attempting to copy and paste the short experimental script I just wrote - hope it doesn't get butchered in email form:

====================

Quoting Will Hughes <w.p.hughes@reading.ac.uk>:

Hi

Hopefully a quick question with an easy answer: How do we get alphabetic sorting to list accented characters at an appropriate point in an alphabetic list? The default behaviour seems to use Unicode values or something, as accented characters appear at the end of the alphabet. For example, when I see this kind of sequence from EPrints:

* Church, B
* Lee, K
* Ågren, R
* Çınar, D

I feel that it should (probably) be:

* Ågren, R
* Church, B
* Çınar, D
* Lee, K

Is there a simple setting to implement sorting in a way that respects accented characters? (and will these characters reproduce accurately after emailing! Image attached just in case)

Best wishes
Will

Will Hughes
Emeritus Professor of Construction Management and Economics
School of the Built Environment
University of Reading, PO Box 219, Whiteknights
Reading, RG6 6DF, UK
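
For anyone who wants to try the three sorts discussed in this thread, here is a minimal standalone sketch applied to the four names from the original question. It is not the script linked above and it does not touch EPrints at all; the variable names are made up for the illustration, and it assumes Perl 5.16 or later (for fc) with the core Unicode::Collate module available.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                  # the name literals below contain non-ASCII characters
use feature 'fc';          # Unicode case folding, available from Perl 5.16
use Unicode::Collate;      # core module implementing the Unicode Collation Algorithm

binmode( STDOUT, ':encoding(UTF-8)' );

# The four names from the original question, in the order EPrints listed them.
my @names = ( 'Church, B', 'Lee, K', 'Ågren, R', 'Çınar, D' );

# 1. Default sort: compares code points, so Å (U+00C5) and Ç (U+00C7) land after Z.
my @by_codepoint = sort @names;

# 2. Case-folded sort: ignores case only, so the accented initials still land last.
my @by_casefold = sort { fc($a) cmp fc($b) } @names;

# 3. Collation at level 1: compares base letters only, ignoring case and diacritics.
my $collator     = Unicode::Collate->new( level => 1 );
my @by_collation = $collator->sort(@names);

print join( ' | ', @by_codepoint ), "\n";
print join( ' | ', @by_casefold ),  "\n";
print join( ' | ', @by_collation ), "\n";
```

The first two lines of output should keep the order from the question (Church, Lee, Ågren, Çınar), while the third should come out as Ågren, Church, Çınar, Lee. Whether EPrints browse views can be made to use this kind of collation is the open question in the thread; the sketch only shows the Perl side.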