search.max_buckets and Kibana queries

Hi,

I have about 30 indices in Elasticsearch holding about 10,000,000 documents in total, all under one Kibana index pattern.
One of these indices (data_A) has about 6,000,000 docs and about 10,000 distinct values in the field "c_display_name".
I built a date histogram that sends the request below:

{
	"index": "data*",
	"ignore_unavailable": true,
	"preference": 1592924167710
} {
	"aggs": {
		"2": {
			"date_histogram": {
				"field": "measurement_time",
				"interval": "3h",
				"time_zone": "Europe/Athens",
				"min_doc_count": 1
			},
			"aggs": {
				"3": {
					"terms": {
						"field": "c_display_name.keyword",
						"size": 10000,
						"order": {
							"_key": "asc"
						}
					},
					"aggs": {
						"1": {
							"avg": {
								"field": "c_value"
							}
						}
					}
				}
			}
		}
	},
	"size": 0,
	"_source": {
		"excludes": []
	},
	"stored_fields": ["*"],
	"script_fields": {},
	"docvalue_fields": [{
		"field": "date",
		"format": "date_time"
	}, {
		"field": "measurement_time",
		"format": "date_time"
	}],
	"query": {
		"bool": {
			"must": [{
				"range": {
					"measurement_time": {
						"format": "strict_date_optional_time",
						"gte": "2020-06-16T15:00:19.163Z",
						"lte": "2020-06-23T15:00:19.163Z"
					}
				}
			}],
			"filter": [{
				"match_all": {}
			}, {
				"match_all": {}
			}],
			"should": [],
			"must_not": []
		}
	},
	"timeout": "30000ms"
}

I also set **search.max_buckets to 350000**:
PUT _cluster/settings
{
  "transient": {
    "search.max_buckets": 350000
  }
}
When I filter to see only data_A, I get:
FAILED TO LOAD RESPONSE DATA.

By changing the terms size for c_display_name and the search.max_buckets setting, I see the following:

| c_display_name size | search.max_buckets | Result |
|---|---|---|
| 10000 | 600000 | Client request timeout |
| 1000000 | 350000 | "too_many_buckets_exception" |
| 1000 | 350000 | Some buckets for data_A, but also many large "sum_other_doc_count" values (259975, 260580, etc.) |
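
As a rough sanity check on these results, here is a sketch of the worst-case bucket count for this request. It assumes that every date histogram bucket and every terms bucket inside it counts toward search.max_buckets, and that every c_display_name value occurs in every 3h interval; both are assumptions on my side, and the function name is only for illustration.

```python
import math
from datetime import datetime, timedelta

def worst_case_buckets(start, end, interval, terms_size, distinct_terms):
    """Upper bound on buckets for a date histogram with a nested terms agg."""
    # A range that is not aligned to interval boundaries can touch one extra bucket.
    histogram_buckets = math.ceil((end - start) / interval) + 1
    # The terms agg cannot return more buckets than there are distinct values.
    terms_per_bucket = min(terms_size, distinct_terms)
    # Each histogram bucket counts once, plus its nested terms buckets.
    return histogram_buckets * (1 + terms_per_bucket)

start = datetime(2020, 6, 16, 15, 0, 19, 163000)
end = datetime(2020, 6, 23, 15, 0, 19, 163000)
three_hours = timedelta(hours=3)

# About 10,000 distinct c_display_name values in data_A.
print(worst_case_buckets(start, end, three_hours, 10_000, 10_000))     # 570057
print(worst_case_buckets(start, end, three_hours, 1_000_000, 10_000))  # 570057
print(worst_case_buckets(start, end, three_hours, 1_000, 10_000))      # 57057
```

Under these assumptions, a terms size of 10000 or more needs roughly 570,000 buckets, which is above 350000 but below 600000, while a size of 1000 stays well below the limit but cuts off most values.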

How can I tune the terms size for c_display_name, search.max_buckets, or the timeout so that the histogram response includes all the c_display_name values? Or what else should I do?
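
One alternative I am aware of is the composite aggregation, which pages through the (interval, c_display_name) combinations instead of returning them all in one response. Below is a minimal sketch of how I imagine that would look if run against the search API directly rather than through the Kibana visualization. The `search` argument is a hypothetical callable that sends a request body to the data* indices and returns the parsed JSON response; the aggregation names `by_time_and_name` and `avg_value` are my own, and `fixed_interval` assumes a version that supports it (older versions would use `interval` as in my request).

```python
def composite_body(after_key=None, page_size=1000):
    """Build the request body for one page of the composite aggregation."""
    composite = {
        "size": page_size,
        "sources": [
            {"time": {"date_histogram": {
                "field": "measurement_time",
                "fixed_interval": "3h",
                "time_zone": "Europe/Athens",
            }}},
            {"name": {"terms": {"field": "c_display_name.keyword"}}},
        ],
    }
    if after_key is not None:
        # Resume after the last bucket of the previous page.
        composite["after"] = after_key
    return {
        "size": 0,
        "query": {"range": {"measurement_time": {
            "format": "strict_date_optional_time",
            "gte": "2020-06-16T15:00:19.163Z",
            "lte": "2020-06-23T15:00:19.163Z",
        }}},
        "aggs": {"by_time_and_name": {
            "composite": composite,
            "aggs": {"avg_value": {"avg": {"field": "c_value"}}},
        }},
    }

def iter_buckets(search, page_size=1000):
    """Yield every composite bucket, one page at a time."""
    after_key = None
    while True:
        response = search(composite_body(after_key, page_size))
        agg = response["aggregations"]["by_time_and_name"]
        yield from agg["buckets"]
        after_key = agg.get("after_key")
        # Stop when the page is empty or no key to continue from is returned.
        if after_key is None or not agg["buckets"]:
            break
```

Each page would then hold at most `page_size` buckets, so the request should stay far below search.max_buckets, at the cost of several round trips.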

Thank you in advance.