Variable Width Histogram not working with top hits subaggregation

vijay267 · November 20, 2024, 11:44pm

I'm trying to use variable width histogram to group results with similar relevance scores and then apply a secondary sort on them via a top hits aggregation. This is so that folks can sort on relevance and then also sort on a secondary field (today all the scores are too granular, so the secondary sort never comes into play).

However, when I do this, the variable width histogram creates sensible bucket ranges for my scores, but the top hits aggregation returns results unrelated to the bucket they appear in. It's almost as if the documents are randomly sorted into the wrong buckets. The exact same setup works fine if the parent aggregation is a range aggregation or a normal histogram. Any idea what I might be doing wrong here? I'm using Elasticsearch 7.10.

"aggs": {
  "scores": {
    // "histogram": { // This works fine when used in place of variable_width_histogram
    //   "script": "_score",
    //   "interval": 2,
    //   "min_doc_count": 1
    // },
    "variable_width_histogram": {
      "script": "_score",
      "buckets": 3
    },
    "aggs": {
      "top_hits_agg": {
        "top_hits": {
          "sort": [
            {
              "numeric_kg_p_location*custom*70452*number*0": {
                "order": "desc"
              }
            }
          ],
          "_source": {
            "includes": [
              "_id",
              "full_kg_p_location*business_name_en",
              "numeric_kg_p_location*custom*70452*number*0"
            ],
            "excludes": ["vector*"]
          },
          "from": 0,
          "size": 5
        }
      }
    }
  }
}
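
To make the expected behavior concrete, here is a minimal client-side sketch in Python of what I want each bucket to contain. It is only an illustration under stated assumptions: it uses fixed-width score buckets (like the commented-out histogram with an interval of 2) rather than variable-width clustering, and the function name `bucket_and_sort`, the field name `rank`, and the sample hits are all hypothetical, not part of my index or of the Elasticsearch API.

```python
import math
from collections import defaultdict


def bucket_and_sort(hits, interval=2.0, sort_field="rank", size=5):
    """Group hits into fixed-width score buckets, then sort each
    bucket by a secondary field (descending) and keep the top `size`."""
    buckets = defaultdict(list)
    for hit in hits:
        # Bucket key is the lower bound of the score interval,
        # mirroring how a fixed-interval histogram keys its buckets.
        key = math.floor(hit["_score"] / interval) * interval
        buckets[key].append(hit)

    result = []
    # Highest-scoring bucket first, so relevance stays the primary order.
    for key in sorted(buckets, reverse=True):
        top = sorted(
            buckets[key],
            key=lambda h: h["_source"][sort_field],
            reverse=True,
        )[:size]
        result.append(
            {"key": key, "doc_count": len(buckets[key]), "top_hits": top}
        )
    return result


# Hypothetical sample hits; `rank` stands in for the secondary sort field.
hits = [
    {"_id": "a", "_score": 5.1, "_source": {"rank": 1}},
    {"_id": "b", "_score": 4.2, "_source": {"rank": 9}},
    {"_id": "c", "_score": 1.3, "_source": {"rank": 7}},
    {"_id": "d", "_score": 4.9, "_source": {"rank": 3}},
]

for bucket in bucket_and_sort(hits):
    print(bucket["key"], [h["_id"] for h in bucket["top_hits"]])
```

The point is that every hit listed under a bucket has a score inside that bucket's range, and hits are ordered by the secondary field only within a bucket. That is what I get from the range and histogram aggregations, but not from the variable width histogram.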

Thanks!

vijay267 · November 22, 2024, 8:18pm

I also was able to reproduce this exact issue on Elasticsearch 8.16.0.

vijay267 · December 5, 2024, 8:43pm

Hey, sorry to ask again, but does anyone have any ideas on the above? Thanks.
