Aggregation returns null buckets

I have a query that aggregates on a one-day interval over a specified date range (e.g. the last month). The terms are ordered by the 'MyUsage' field average to return the top 5 'FldNameWise' buckets.

The problem is that it returns null aggregations for some of the day intervals although data is present. If I query only a specific 'FldName' value, then that particular day interval does have values.

The index template and other settings are correct, so I am not sure why null values are coming back in the aggregation.

Kindly help.
Sample query:

{
  "aggs": {
    "FldNameWise": {
      "terms": {
        "field": "FldName",
        "order": {
          "FldOrder": "desc"
        },
        "size": 5
      },
      "aggs": {
        "FldOrder": {
          "avg": {
            "field": "MyUsage"
          }
        },
        "DateTimeWise": {
          "date_histogram": {
            "field": "MyDateTime",
            "interval": "1d",
            "time_zone": "abc"
          },
          "aggs": {
            "MyUitl": {
              "avg": {
                "field": "MyUsage"
              }
            },
            "MyUitl2": {
              "avg": {
                "field": "MyUsage2"
              }
            }
          }
        }
      }
    }
  },
  "size": 0,
  "docvalue_fields": [
    {
      "field": "MyDateTime",
      "format": "date_time"
    }
  ],
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "MyDateTime": {
              "gte": 123,
              "lte": 456,
              "format": "epoch_millis"
            }
          }
        }
      ]
    }
  }
}

I am getting null values for the 'MyUitl' aggregation in some day buckets even though data is present.

Hi, is there anyone who can help, please?

If you have multiple nodes/shards, then some of them may not have contributed results for certain "FldName" terms to the final reduction because those terms weren't seen as competitive locally (they had a low usage average).
This wizard walks you through the conditions under which this can happen, how any inaccuracies are reported, and what alternative approaches to try.

In this case it may prove more efficient to break your search into 2 requests - one to identify the names with the most usage, and then a follow-up query to fill in their low-level details.

Trying to do it all in one request may be inefficient because shard_size may need increasing to retrieve more terms for fully accurate usage stats, and if shard_size is much greater than size, a large number of results (and all their detailed date histograms) will be thrown away.
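
To illustrate, here is a rough sketch of the two-request approach. The field names and placeholder range values are copied from your query above; "name1" … "name5" are stand-ins for whatever "FldName" values the first response actually returns. shard_size is raised only in the first request, where it is cheap because no date histograms are attached.

First request - find the top 5 names by average usage, no per-day detail yet:

{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        { "range": { "MyDateTime": { "gte": 123, "lte": 456, "format": "epoch_millis" } } }
      ]
    }
  },
  "aggs": {
    "FldNameWise": {
      "terms": {
        "field": "FldName",
        "order": { "FldOrder": "desc" },
        "size": 5,
        "shard_size": 100
      },
      "aggs": {
        "FldOrder": { "avg": { "field": "MyUsage" } }
      }
    }
  }
}

Second request - filter to just those names and fill in the date histograms:

{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        { "range": { "MyDateTime": { "gte": 123, "lte": 456, "format": "epoch_millis" } } },
        { "terms": { "FldName": ["name1", "name2", "name3", "name4", "name5"] } }
      ]
    }
  },
  "aggs": {
    "FldNameWise": {
      "terms": { "field": "FldName", "size": 5 },
      "aggs": {
        "DateTimeWise": {
          "date_histogram": { "field": "MyDateTime", "interval": "1d" },
          "aggs": {
            "MyUitl": { "avg": { "field": "MyUsage" } },
            "MyUitl2": { "avg": { "field": "MyUsage2" } }
          }
        }
      }
    }
  }
}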

I will explore breaking the search into 2 requests as you suggested, though.

We have only a single node and a single shard.
Moreover, if I select only a particular value of 'FldName', or remove "order": { "FldOrder": "desc" }, then I get data for all day intervals.

That should be OK then. Do you get a non-zero number for the doc_count_error_upper_bound?
What version of elasticsearch are you on?

Yes, I get "doc_count_error_upper_bound": -1. I am using Elasticsearch version 6.7.2.

Out of interest, how many unique "FldName" values do you have? (The cardinality agg can help reveal this).
Also, how many total docs in the index and in the time range you query on?
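
Something like this would report the approximate number of distinct values (reusing your field name; "FldNameCount" is just an arbitrary label for the aggregation):

{
  "size": 0,
  "aggs": {
    "FldNameCount": {
      "cardinality": {
        "field": "FldName"
      }
    }
  }
}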

I have only around 100 unique "FldName" values. I query a date range of at most 30 days,
which spans around 30 indices since I have a daily index. One index contains around 38,400 documents, so across all 30 indices it is roughly 1,152,000.

Ah - that explains my confusion - multiple indices means multiple shards which means the potential for this distributed counting problem (even though your content is not currently physically distributed across multiple nodes).

The reported error bound of -1 means the margin for error cannot be determined:

"When the aggregation is either sorted by a sub aggregation or in order of ascending document count, the error in the document counts cannot be determined and is given a value of -1 to indicate this."

All of my original advice still stands.

That's not a huge number, so you could try setting shard_size to 100 to gather all the stats - your final result is only 5, so we'd throw away 95 date histograms, which is less than ideal. It would be more efficient to do 2 requests, but you might get away with 1 in terms of memory, depending on the length of the date range and the size of the date interval.
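
Concretely, that just means adding shard_size alongside size in your existing terms aggregation (sub-aggregations unchanged):

"FldNameWise": {
  "terms": {
    "field": "FldName",
    "order": { "FldOrder": "desc" },
    "size": 5,
    "shard_size": 100
  },
  ...
}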

Thank you so much. Setting shard_size to 100 works, but I will also explore the other, more efficient approach you described.

