We are running an Elasticsearch cluster with 3 nodes.
Sometimes a heavy aggregation query comes in (run manually from Kibana Dev Tools) and one of the nodes becomes inaccessible. The whole cluster then becomes inaccessible, since Elasticsearch takes some time to remove that node, so we restart the node to get the cluster running again.
Now we have configured the circuit breakers in Elasticsearch with:
indices.breaker.total.limit = 10GB
network.breaker.inflight_requests.limit = 10GB
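For reference, these limits can be applied dynamically via the cluster settings API rather than elasticsearch.yml — a sketch, assuming a 6.x cluster and using the same absolute values as above:

```
PUT _cluster/settings
{
  "transient": {
    "indices.breaker.total.limit": "10gb",
    "network.breaker.inflight_requests.limit": "10gb"
  }
}
```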
Now, if I run the same heavy aggregation query, Elasticsearch again becomes unresponsive with a circuit breaker exception, and other indexing/search requests also start to throw exceptions:
Exception: [parent] Data too large, data for [<transport_request>] would be [10856018759/10.1gb], which is larger than the limit of [10737418240/10gb], usages [request=0/0b, fielddata=6552153321/6.1gb, in_flight_requests=723/723b.
Questions:
Is there a way to control memory usage for a single request?
Should we lower the fielddata/query cache limits?
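If experimenting with lower limits, the usual knobs on 6.x are the fielddata breaker and the per-request bucket soft limit (`search.max_buckets`, available from 6.2). A sketch — the values here are illustrative assumptions, not recommendations:

```
PUT _cluster/settings
{
  "transient": {
    "indices.breaker.fielddata.limit": "30%",
    "search.max_buckets": 10000
  }
}
```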
I'd recommend upgrading; 6.4 is far past EOL and there have been significant improvements since then.
That being said, your average heap usage is pretty high: 27GB/30GB ≈ 90%. It is generally recommended that average heap usage stay below 70%.
Without knowing much about your cluster (and also not remembering much about the 6.x releases anymore), you might want to look at adding some additional nodes to spread the load and increase total cluster heap. (But you should really look at upgrading.)
I have pageviews data; the indices are time-based (monthly). We have a visitorId that is unique to each user*browser combination, and I have ~30 million docs per month.
I have configured the circuit breakers as above.
Then I execute the following query:
GET pageviews_data_m*/_search
{
  "size": 0,
  "timeout": "1s",
  "terminate_after": 100000,
  "query": { "match_all": { "boost": 1.0 } },
  "aggregations": {
    "suggestions": {
      "terms": {
        "field": "visitorId",
        "size": 10,
        "shard_size": 10,
        "min_doc_count": 1,
        "shard_min_doc_count": 0,
        "show_term_doc_count_error": false,
        "execution_hint": "map",
        "order": [ { "_count": "desc" }, { "_key": "asc" } ]
      }
    }
  }
}
After running this query, a circuit breaker exception occurs, and all other requests (insert/search) also get the same exception.
Is there a way to implement the circuit breaker at the request level?
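There is no true per-request breaker in 6.4, but one way to avoid tripping the parent breaker on a high-cardinality field is to page through the terms with a composite aggregation (available from 6.1) instead of building all buckets in one request. A sketch, assuming visitorId is a keyword field; the `after_key` from each response feeds the next page:

```
GET pageviews_data_m*/_search
{
  "size": 0,
  "aggregations": {
    "suggestions": {
      "composite": {
        "size": 1000,
        "sources": [
          { "visitor": { "terms": { "field": "visitorId" } } }
        ]
      }
    }
  }
}
```

Note that composite aggregations cannot order buckets by doc count, so if you need the top N by count you would accumulate counts client-side while paging.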
Are you updating older data or are the monthly indices effectively read-only once the new index is created? If this is the case you may be able to lower heap usage by forcemerging old, read-only indices down to a single segment.
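For example, an old monthly index could be force-merged like this (the index name is a placeholder for one of your monthly indices; only do this on indices that no longer receive writes):

```
POST pageviews_data_m2021_01/_forcemerge?max_num_segments=1
```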
I do not think there is a lot you can do in 6.4, but I have not used it in years. A lot has been improved with respect to circuit breakers and heap usage in newer versions though, so I would recommend you upgrade.
Making sure you do not have a lot of very small shards and forcemerging old indices down to a single segment can help reduce the amount of heap used. Making sure your mappings are optimised is also a good step to take.
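On the mapping side, given the 6.1gb of fielddata in your breaker output, one common win is making sure high-cardinality ID fields are mapped as keyword (backed by on-disk doc_values) rather than text with fielddata enabled, since fielddata lives entirely on heap. A minimal sketch for a new monthly index (index name is a placeholder; 6.x mappings still require a type name):

```
PUT pageviews_data_m2021_02
{
  "mappings": {
    "_doc": {
      "properties": {
        "visitorId": { "type": "keyword" }
      }
    }
  }
}
```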