[Ann] ICU facet allows sorting based on ICU collations

Hi,

Just a quick note: I want to draw your attention to a small feature I just
added to the ICU plugin. A new facet type, "icu", sorts string entries in
facets according to ICU collation rules.

German users know the challenge well: names displayed in facets should be
sorted correctly even when they contain umlauts ("Telefonbuchsortierung",
i.e. phone book ordering).
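
To illustrate the sorting problem itself, here is a minimal sketch that uses the JDK's java.text.Collator rather than the ICU plugin code. The class name, the method names, and the sample words are made up for illustration. It only shows how locale-aware collation moves umlauted names next to their base letters; it does not implement the phone book variant, which additionally treats "ä" like "ae".

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class, not part of the ICU plugin.
public class CollationSortSketch {

    // Sorts by raw UTF-16 code unit order (String.compareTo).
    // Umlauts have code points above 'Z', so they end up last.
    static List<String> sortByCodeUnit(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        Collections.sort(copy);
        return copy;
    }

    // Sorts with the JDK's locale-aware collator for the given locale.
    static List<String> sortWithCollator(List<String> names, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> copy = new ArrayList<>(names);
        copy.sort(collator);
        return copy;
    }

    public static void main(String[] args) {
        // \u00C4 is "Ä" and \u00D6 is "Ö"; escapes keep the source ASCII-only.
        List<String> names = Arrays.asList(
                "Zander", "\u00C4rzte", "Apfel", "\u00D6l", "Ofen");

        // Code unit order: Apfel, Ofen, Zander, Ärzte, Öl
        System.out.println(sortByCodeUnit(names));

        // German collation: Apfel, Ärzte, Ofen, Öl, Zander
        System.out.println(sortWithCollator(names, Locale.GERMAN));
    }
}
```

A facet that sorts its entries by plain string comparison would behave like the first call; a collation-based sort is what produces the second ordering.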

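Pull request here:
Adding ICU collation based sorting for facets by jprante · Pull Request #7 · elastic/elasticsearch-analysis-icu · GitHub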

Preliminary ZIP as a drop-in replacement for the ICU plugin:

https://github.com/downloads/jprante/elasticsearch-analysis-icu/elasticsearch-analysis-icu-1.8.0-SNAPSHOT.zip

Cheers, Jörg

--

Sounds interesting for French as well, thanks Jörg

-- Tanguy

--

Well, I had to rework the pull request. It wasn't working, but now it
should. Sorry for the inconvenience.

I also updated the download ZIP.

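For users who prefer a non-ICU, JDK-based environment, I issued pull
requests for java.text.Collator based sorting on doc lists and facets:

Adding collator-based sorting for terms facet entries by jprante · Pull Request #2337 · elastic/elasticsearch · GitHub
Adding Collation Key Analyzer by jprante · Pull Request #2338 · elastic/elasticsearch · GitHub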

Cheers, Jörg

--

Great work, Jörg! We are going to go through the effort of upgrading to Lucene 4.0, and then restructure the facet API code a bit (to simplify writing custom facets, for example). I would also say that, if it's just for ordering, we should allow pluggable "sort" logic for the different terms facets, which ICU can add.
