Date_histogram returning duplicates in multi-cluster after upgrade

Doc_Kaos · May 1, 2024, 3:10pm

Hey all, we recently upgraded our remote clusters from 8.10.2 to 8.13.2. Our cross-cluster search cluster is still running 7.17. Since the remote clusters were upgraded, we have been receiving duplicate bucket keys in date_histogram results. If we reduce the search to a single cluster, the duplicates go away.

All of the duplicates have a doc_count of 0:

{
  "key_as_string" : "2024-04-26T00:00:00.000Z",
  "key" : 1714089600000,
  "doc_count" : 78
},
{
  "key_as_string" : "2024-04-26T00:00:00.000Z",
  "key" : 1714089600000,
  "doc_count" : 0
},
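
For anyone hitting the same thing, a possible client-side stopgap is to collapse buckets that share a key before using the results. This is only a minimal sketch, assuming the buckets arrive as a flat list of dicts shaped like the fragment above; `merge_duplicate_buckets` is a hypothetical helper name, not part of any Elasticsearch client, and it does not merge nested sub-aggregations.

```python
def merge_duplicate_buckets(buckets):
    """Collapse buckets sharing the same `key`, summing their doc_count.

    Assumes each bucket is a dict with at least `key` and `doc_count`.
    Other fields (e.g. `key_as_string`) are taken from the first bucket
    seen for a given key. Sub-aggregations are NOT merged.
    """
    merged = {}  # dicts keep insertion order, so bucket order is preserved
    for bucket in buckets:
        key = bucket["key"]
        if key in merged:
            merged[key]["doc_count"] += bucket["doc_count"]
        else:
            merged[key] = dict(bucket)  # shallow copy; input stays untouched
    return list(merged.values())


# Sample data copied from the response fragment above.
sample = [
    {"key_as_string": "2024-04-26T00:00:00.000Z", "key": 1714089600000, "doc_count": 78},
    {"key_as_string": "2024-04-26T00:00:00.000Z", "key": 1714089600000, "doc_count": 0},
]
deduped = merge_duplicate_buckets(sample)
```

Since the duplicates here all have a doc_count of 0, summing leaves the real counts unchanged; with the sample above, `deduped` contains a single bucket for 2024-04-26.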

The aggregation portion of the query is like this:

{
  "aggs": {
    "my_field": {
      "aggs": {
        "raw_histogram_data": {
          "date_histogram": {
            "extended_bounds": {
              "max": "2024-04-28T23:59:59",
              "min": "2024-04-22T00:00:01"
            },
            "field": "@timestamp",
            "calendar_interval": "1d",
            "time_zone": "UTC"
          }
        }
      },
      "terms": {
        "field": "my-field",
        "size": 10
      }
    }
  }
}

It would seem this change caused it: "Adjust Histogram's bucket accounting to be iteratively" by martijnvg (Pull Request #102172, elastic/elasticsearch on GitHub), but I'm not sure how, or whether that's accurate.

William_Chaparro · May 3, 2024, 1:15pm

Hello @Doc_Kaos,
Thank you very much for your investigation.

We started investigating this issue as soon as you submitted it and have identified the root cause from this PR.
We have a verified fix and are now working on getting it into a release. We have added this as a known issue in 8.13.x and will soon publish a Knowledge Base article for our customers (in addition to customer communication notifying them of the issue).

Thank you for bringing this to our attention, and thank you for using Elasticsearch!

Doc_Kaos · May 3, 2024, 1:29pm

Awesome! Thank you!
