Autocomplete with case-sensitive aggregation and case-insensitive ordering - help needed


(Tibor Stefan) #1

I've built an autocomplete feature that works well as long as ordering doesn't matter. When it does, I have a problem. I need to fill the autocomplete (AC) field with author names sorted ascending in a case-insensitive manner, but the names must keep their original casing (some start with a lowercase letter, and some are entirely lowercase or uppercase) when displayed in the resulting dropdown. Authors are indexed in a separate index type without storing the source (the source lives in the main index type that holds the full article). Below are extracts from the settings and mappings. I'm using ES 5.4.

"analysis": {
  ...
  "normalizer": {
    "case_insensitive_sorting": {
      "type": "custom",
      "char_filter": [],
      "filter": ["lowercase"]
    }
  },
  "analyzer": {
    "autocomplete_analyzer": {
      "type": "custom",
      "tokenizer": "whitespace",
      "filter": [
        "lowercase",
        "autocomplete_filter"
      ]
    },
    ...

"mappings": {
  ...
  "autocomplete": {
    "_all": {
      "enabled": false
    },
    "_source": {
      "enabled": false
    },
    "properties": {
      "author_ac": {
        "type": "keyword",
        "store": true,
        "fields": {
          "ac": {
            "type": "text",
            "analyzer": "autocomplete_analyzer",
            "search_analyzer": "autocomplete_search",
            "store": false
          },
          "sorted": {
            "type": "keyword",
            "normalizer": "case_insensitive_sorting",
            "store": false
          }
        }
      },
      ...

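With the following query the author names keep their original casing, but the order is wrong (uppercase letters sort before lowercase ones). The returned buckets are listed right after the query.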
{
  "query": {
    "match": {
      "author_ac.ac": "ali"
    }
  },
  "size": 0,
  "aggs": {
    "ac_author": {
      "terms": {
        "size": 100,
        "field": "author_ac",
        "order": { "_term": "asc" }
      }
    }
  }
}
[
  {
    "key": "ALINE SILVA",
    "doc_count": 1
  },
  {
    "key": "Adriana Alicia Ortega",
    "doc_count": 1
  },
  {
    "key": "Ahmed AlIbrahim",
    "doc_count": 1
  }
]
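
The order above is consistent with plain byte-wise (code point) ordering of the keyword terms, in which every uppercase ASCII letter sorts before every lowercase one, so "AL..." lands before "Ad...". Here is a minimal Python sketch of the difference, using the three keys from the result above. Python is used only for illustration and is not part of the Elasticsearch setup; the variable names are made up for this example.

```python
# Keys returned by the terms aggregation on "author_ac" (copied from the result above).
keys = ["ALINE SILVA", "Adriana Alicia Ortega", "Ahmed AlIbrahim"]

# Default string comparison is by code point: "L" (76) < "d" (100) < "h" (104),
# so the all-uppercase name comes first.
code_point_order = sorted(keys)

# Comparing lowercased copies gives a case-insensitive order while
# leaving the displayed strings untouched.
case_insensitive_order = sorted(keys, key=str.lower)

print(code_point_order)
print(case_insensitive_order)
```

The second list is the order the dropdown needs: same strings, compared on their lowercased form.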

With the next query the ordering is correct, but the author names are displayed incorrectly: they come back all lowercase.
{
  "query": {
    "match": {
      "author_ac.ac": "ali"
    }
  },
  "size": 0,
  "aggs": {
    "ac_author": {
      "terms": {
        "size": 100,
        "field": "author_ac.sorted",
        "order": { "_term": "asc" }
      }
    }
  }
}
[
  {
    "key": "adriana alicia ortega",
    "doc_count": 1
  },
  {
    "key": "ahmed alibrahim",
    "doc_count": 1
  },
  {
    "key": "aiman ali",
    "doc_count": 1
  }
]

I tried a terms sub-aggregation to combine bucketing by author_ac with ordering by author_ac.sorted (and wrote a few other queries that also failed). The query below throws the exception quoted after it.
{
  "query": {
    "match": {
      "author_ac.ac": "' . request('term') . '"
    }
  },
  "size": 0,
  "aggs": {
    "ac_author": {
      "terms": {
        "size": 100,
        "field": "author_ac",
        "order": { "sortingOrder": "asc" }
      },
      "aggs": {
        "sortingOrder": {
          "terms": {
            "size": 100,
            "field": "author_ac.sorted",
            "order": { "_term": "asc" }
          }
        }
      }
    }
  }
}

Exception: Invalid terms aggregation order path [sortingOrder]. Terms buckets can only be sorted on a sub-aggregator path that is built out of zero or more single-bucket aggregations within the path and a final single-bucket or a metrics aggregation at the path end.
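
The exception says that terms buckets can only be ordered by a single-bucket or metrics sub-aggregation, and a nested terms aggregation is multi-bucket. One direction that might avoid the restriction is to invert the nesting: bucket and order on author_ac.sorted, and use a terms sub-aggregation on author_ac only to carry the original spelling. The Python sketch below builds such a request body and reads the names back. It has not been run against ES 5.4. The names `build_body`, `extract_names`, `ac_author_sorted` and `original` are hypothetical, and `sample_response` is hand-written to mimic the shape of an aggregation response.

```python
import json


def build_body(term, size=100):
    """Build a request body (hypothetical helper): outer buckets are ordered
    on the normalized field, inner buckets carry the original spelling."""
    return {
        "query": {"match": {"author_ac.ac": term}},
        "size": 0,
        "aggs": {
            "ac_author_sorted": {  # hypothetical aggregation name
                "terms": {
                    "size": size,
                    "field": "author_ac.sorted",
                    "order": {"_term": "asc"},
                },
                "aggs": {
                    # size 1 keeps one spelling per normalized name; raise it
                    # if names differing only in case must all be listed.
                    "original": {"terms": {"size": 1, "field": "author_ac"}}
                },
            }
        },
    }


def extract_names(response):
    """Walk the outer buckets in order and collect the original spellings."""
    names = []
    for outer in response["aggregations"]["ac_author_sorted"]["buckets"]:
        for inner in outer["original"]["buckets"]:
            names.append(inner["key"])
    return names


# Hand-written sample (an assumption, not real server output).
sample_response = {
    "aggregations": {
        "ac_author_sorted": {
            "buckets": [
                {"key": "adriana alicia ortega", "doc_count": 1,
                 "original": {"buckets": [{"key": "Adriana Alicia Ortega", "doc_count": 1}]}},
                {"key": "ahmed alibrahim", "doc_count": 1,
                 "original": {"buckets": [{"key": "Ahmed AlIbrahim", "doc_count": 1}]}},
                {"key": "aline silva", "doc_count": 1,
                 "original": {"buckets": [{"key": "ALINE SILVA", "doc_count": 1}]}},
            ]
        }
    }
}

print(json.dumps(build_body("ali"), indent=2))
print(extract_names(sample_response))
```

A trade-off of this shape is that authors whose names differ only in case collapse into one outer bucket, so the inner `size` decides how many spellings survive.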

I'd appreciate any advice or a working query.

