Terms aggregation - Sort on the relevancy of the terms

s_mp · May 12, 2017, 9:59pm

Hi,
We have a use case where we run a terms aggregation on a multi-valued (array) field. The documents are first filtered with a regexp query on this field, and the terms aggregation is then run on the same field.
The terms aggregation returns the terms sorted by doc_count, as expected.
We would like the terms to be sorted by their relevance to the initial regexp filter instead. Is this possible?

Refer to my example below for more clarity.

PUT someindex
{
    "settings": {
        "index": {
            "number_of_replicas": 0,
            "number_of_shards": 1,
            "search.slowlog.threshold.query.debug": "1ms"
        }
    },
    "mappings": {
        "my_type": {
            "dynamic_templates": [
                {
                    "strings": {
                        "mapping": {
                            "type": "keyword"
                        },
                        "match_mapping_type": "string"
                    }
                }
            ]
        }
    }
}

PUT someindex/my_type/1
{
  "title": "document 001",
  "tags":  ["lucene", "search" ]
}

PUT someindex/my_type/2
{
  "title": "document 002",
  "tags":  [ "lucene is a search library", "lucene", "elastic", "search" ]
}

GET someindex/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "regexp": {
            "tags": {
              "value": ".*lucene.*|.*search.*"
            }
          }
        }
      ]
    }
  },
  "aggregations": {
    "requestedDimension": {
      "terms": {
        "field": "tags"
      }
    }
  }
}

Response:

{
  "took": 9,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "requestedDimension": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "lucene",
          "doc_count": 2
        },
        {
          "key": "search",
          "doc_count": 2
        },
        {
          "key": "elastic",
          "doc_count": 1
        },
        {
          "key": "lucene is a search library",
          "doc_count": 1
        }
      ]
    }
  }
}

I am looking for a query that returns the response below, because "lucene is a search library" is more relevant to my filter than the other terms.

{
  "took": 9,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "requestedDimension": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "lucene is a search library",
          "doc_count": 1
        },
        {
          "key": "lucene",
          "doc_count": 2
        },
        {
          "key": "search",
          "doc_count": 2
        },
        {
          "key": "elastic",
          "doc_count": 1
        }
      ]
    }
  }
}

lwintergerst · May 18, 2017, 8:42pm

Hello Srini,
Unfortunately, that is not possible as far as I know. I will double-check with my colleagues and get back to you.

Scoring is only done as part of the query phase and can't be applied to aggregations.

Thank you for providing the exact query and aggregation. This makes it much easier for us to help you.
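
Since scoring cannot be applied to the aggregation itself, one possible workaround (not proposed in this thread, so treat it as a sketch) is to re-order the returned buckets on the client. The snippet below assumes that "relevance" means the number of regex alternatives a term matches, which is only a guess at the intent behind the desired output above. The function name `rerank_buckets` and the split of the regex into two patterns are illustrative, not part of any Elasticsearch API.

```python
import re

def rerank_buckets(buckets, patterns):
    """Re-order terms-aggregation buckets by how many patterns each key matches.

    Assumption: relevance = number of alternatives the whole term matches.
    Ties are broken by doc_count (descending); the sort is stable, so buckets
    that are still tied keep the order Elasticsearch returned them in.
    """
    compiled = [re.compile(p) for p in patterns]

    def relevance(bucket):
        # fullmatch mirrors the regexp query, which matches against the whole term
        return sum(1 for rx in compiled if rx.fullmatch(bucket["key"]))

    return sorted(buckets, key=lambda b: (-relevance(b), -b["doc_count"]))

# Buckets copied from the "requestedDimension" aggregation in the response above
buckets = [
    {"key": "lucene", "doc_count": 2},
    {"key": "search", "doc_count": 2},
    {"key": "elastic", "doc_count": 1},
    {"key": "lucene is a search library", "doc_count": 1},
]

# The two alternatives of ".*lucene.*|.*search.*", scored separately
patterns = [".*lucene.*", ".*search.*"]

reranked = rerank_buckets(buckets, patterns)
for bucket in reranked:
    print(bucket["key"], bucket["doc_count"])
```

With the sample data this produces the order asked for in the question. Note that it only re-orders buckets the aggregation actually returned, so the `size` of the terms aggregation may need to be raised if relevant terms could fall outside the top buckets by doc_count.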

system · June 15, 2017, 8:43pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
