
[EP-tech] Faceted Search (EPrints goes ElasticSearch)



CAUTION: This e-mail originated outside the University of Southampton.
Hi,

Is there any progress on this project? I am still waiting and hope it will be released soon.

Thank you,

Regards,
Agung PW

On Mon, Jan 25, 2021 at 9:23 PM <jens.witzel at uzh.ch> wrote:

Dear Ajunk (Agung), dear Tech-Group,

Thanks a lot! You're invited to take a look, try it out, and give us brief feedback using the feedback form available at https://www.zora.uzh.ch/fs.html

As Martin wrote, we are still in beta, so it will take some time to fix a couple of issues. We plan to publish our work on git later this year. Please note: we provide the code with lots of comments, but as-is and without any support service on our part.

Let us add a few words about how it works behind the scenes:

In a nutshell, we use three components:
- a new indexing process (4000 lines of code), with client (trigger-driven) and admin (full, partial) parts, to set up and build the new index on our ES infrastructure.
- a proxy (cgi/plugin, 600 lines of code) hosted on our EPrints repo to manage the credentials and add some repo-specific things such as phrases, language behaviour, etc.
- the React GUI, taken from the Elastic git project and customized for our requirements.

It was hard work to do all the field mappings from EPrints to Elastic and to handle all the small user needs around them. But all in all we're very happy to offer a modern and very fast search tool including facets, autosuggestion, highlighting and snippets. Did I mention that it's pretty fast on 150,000 publications? ;-)
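
To give a rough idea of what such a field mapping involves, here is a minimal sketch in Python. It is not the ZORA indexing code, which is not shown in this thread. The function name `eprint_to_es_doc`, the shape of the input record, and the target field names are all assumptions made for illustration.

```python
def eprint_to_es_doc(eprint):
    """Flatten an EPrints-style record (a dict) into a flat search document.

    Hypothetical sketch: the input shape and the output field names are
    assumptions, not the mapping actually used on ZORA.
    """
    # Join given and family name into one display string per creator.
    creators = [
        " ".join(p for p in (c["name"].get("given"), c["name"].get("family")) if p)
        for c in eprint.get("creators", [])
    ]
    doc = {
        "id": eprint["eprintid"],
        "title": eprint.get("title", ""),
        "creators": creators,
        # Keep only the year so it can serve as a facet value.
        "year": int(str(eprint["date"])[:4]) if eprint.get("date") else None,
        "type": eprint.get("type"),
    }
    # Drop empty values so they do not end up in the index.
    return {k: v for k, v in doc.items() if v not in (None, "", [])}


if __name__ == "__main__":
    record = {
        "eprintid": 42,
        "title": "An example title",
        "creators": [{"name": {"given": "Ada", "family": "Lovelace"}}],
        "date": "2020-05-01",
        "type": "article",
    }
    print(eprint_to_es_doc(record))
```

A real mapping would also have to deal with compound and multilingual fields, which is probably where most of the effort goes.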

Kind regards,

ZORA-IT (Martin Brändle, Jens Witzel)

--
Jens Witzel
Zentrale Informatik
Universität Zürich
Stampfenbachstrasse 73
CH-8006 Zürich

mail:  jens.witzel at uzh.ch
phone: +41 44 63 56777
http://www.zi.uzh.ch/


From: "Ajunk Pracetio via Eprints-tech" <eprints-tech at ecs.soton.ac.uk>
To: "EDER Norbert via Eprints-tech" <eprints-tech at ecs.soton.ac.uk>, martin.braendle at uzh.ch
Date: 25.01.2021 14:07
Subject: Re: [EP-tech] Faceted Search (EPrints goes ElasticSearch)
Sent by: <eprints-tech-bounces at ecs.soton.ac.uk>

________________________________



Hi,

Wow, that's cool. I cannot wait to see this code on the eprintsug GitHub.

Thank you

Regards,
Agung Prasetyo W.

On Mon, Jan 25, 2021, 19:50 Martin Braendle via Eprints-tech <eprints-tech at ecs.soton.ac.uk> wrote:

Dear ep-tech members,

We have implemented faceted search on our EPrints repository ZORA at the University of Zurich, using ElasticSearch.

A public beta is now available for trying out and testing at

https://www.zora.uzh.ch/fs.html

Please provide us feedback on the feedback form available on this page.
All ZORA documents (about 150,000) are available for searching.

We decided to use ElasticSearch because of its performant and fail-safe infrastructure. It provides many functions that can be expected from
a modern search engine:

Facets: filters on selected criteria such as publication year, document type, open access status, journal, affiliations and many more
Autosuggest and autocomplete while typing
Context-dependent result snippets of abstract and fulltext
Hit highlighting
Responsive GUI
and many more things
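
For readers unfamiliar with how facets and hit highlighting are usually requested from Elasticsearch, the following Python sketch builds a search request body that combines a full-text query, filters for selected facet values, one `terms` aggregation per facet, and a `highlight` section. It is an illustration only, not the ZORA implementation. The function name `build_faceted_query` and the field names `title`, `abstract`, `year` and `type` are assumptions.

```python
import json


def build_faceted_query(text, selected=None, facet_fields=("year", "type"), size=10):
    """Build an Elasticsearch search body with facets and highlighting.

    Hypothetical sketch: all field names are assumptions for illustration.
    """
    selected = selected or {}
    # Each selected facet value becomes a non-scoring filter clause.
    filters = [{"term": {field: value}} for field, value in sorted(selected.items())]
    return {
        "size": size,
        "query": {
            "bool": {
                "must": [
                    {"multi_match": {"query": text, "fields": ["title", "abstract"]}}
                ],
                "filter": filters,
            }
        },
        # One terms aggregation per facet returns the value counts for the GUI.
        "aggs": {f: {"terms": {"field": f, "size": 20}} for f in facet_fields},
        # Ask for highlighted snippets from the abstract.
        "highlight": {"fields": {"abstract": {}}},
    }


if __name__ == "__main__":
    body = build_faceted_query("open access", {"year": 2020})
    print(json.dumps(body, indent=2))
```

The resulting dictionary would be sent as the JSON body of a search request. Placing facet selections under `filter` rather than `must` keeps them from influencing relevance scoring.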

Code will be made available later this year on GitHub (eprintsug).

Kind regards,

ZORA-IT (Martin Brändle, Jens Witzel)
*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
