Date_histogram returning duplicates in multi-cluster after upgrade

Hey all, we recently upgraded our remote clusters from 8.10.2 to 8.13.2. Our cross-cluster search cluster is still running 7.17. Since the remote clusters were upgraded, we have been receiving duplicate bucket keys in the results. If we restrict the search to a single cluster, the duplicates go away.

All of the duplicates have a doc_count of 0.

{
  "key_as_string" : "2024-04-26T00:00:00.000Z",
  "key" : 1714089600000,
  "doc_count" : 78
},
{
  "key_as_string" : "2024-04-26T00:00:00.000Z",
  "key" : 1714089600000,
  "doc_count" : 0
},
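
Until a fix is available, the duplicates can be collapsed on the client side, since they share a key and the extra ones carry a doc_count of 0. Below is a minimal sketch in Python, assuming the buckets have already been parsed from the response into a list of dicts. The function name merge_duplicate_buckets is hypothetical and not part of any Elasticsearch client, and sub-aggregations inside a duplicate bucket are not merged here.

```python
def merge_duplicate_buckets(buckets):
    """Collapse buckets that share the same "key", summing their doc_count.

    Order of first appearance is preserved (dicts keep insertion order).
    Any other fields are taken from the first bucket seen for each key.
    """
    merged = {}
    for bucket in buckets:
        key = bucket["key"]
        if key in merged:
            # Duplicate key: fold its count into the bucket already kept.
            merged[key]["doc_count"] += bucket["doc_count"]
        else:
            # Copy so the caller's input list is not mutated.
            merged[key] = dict(bucket)
    return list(merged.values())


# The two buckets from the excerpt above.
sample = [
    {"key_as_string": "2024-04-26T00:00:00.000Z", "key": 1714089600000, "doc_count": 78},
    {"key_as_string": "2024-04-26T00:00:00.000Z", "key": 1714089600000, "doc_count": 0},
]
deduplicated = merge_duplicate_buckets(sample)
```

Summing rather than dropping the zero-count entries keeps the result correct even if a duplicate ever carries a non-zero count.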

The aggregation portion of the query is like this:

{
  "aggs": {
    "my_field": {
      "aggs": {
        "raw_histogram_data": {
          "date_histogram": {
            "extended_bounds": {
              "max": "2024-04-28T23:59:59",
              "min": "2024-04-22T00:00:01"
            },
            "field": "@timestamp",
            "calendar_interval": "1d",
            "time_zone": "UTC"
          }
        }
      },
      "terms": {
        "field": "my-field",
        "size": 10
      }
    }
  }
}

It would seem this change caused it: "Adjust Histogram's bucket accounting to be iteratively" by martijnvg (Pull Request #102172, elastic/elasticsearch on GitHub). I'm not sure how, though, or whether that's accurate.

Hello @Doc_Kaos,
Thank you very much for your investigation.

We started investigating this issue as soon as you submitted it and have traced the root cause to this PR.
We have a verified fix and are now working on getting it into a release. We have added this as a known issue in 8.13.x, and a KnowledgeBase article for our customers will be out soon (in addition to customer communication notifying them of the issue).

Thank you for bringing this to our attention, and thank you for using Elasticsearch!

Awesome! Thank you!