We have a .NET app that uses NEST to store events in Elasticsearch. Randomly, after some time, my three-node Elasticsearch cluster stops working and I see these errors:
[indices:data/read/search[phase/query]]
Caused by: org.elasticsearch.ElasticsearchException: Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.
When this error happens, the cluster's memory and CPU usage increase roughly fourfold.
java.base/java.lang.Thread.run(Thread.java:1623)
Caused by: org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [<reduce_aggs>] would be [10211962019/9.5gb], which is larger than the limit of [10200547328/9.5gb], real usage: [10211961944/9.5gb], new bytes reserved: [75/75b], usages [inflight_requests=10553946/10mb, request=6373/6.2kb, fielddata=4464632454/4.1gb, eql_sequence=0/0b, model_inference=0/0b]
Are these two errors related? Why is the cluster hitting the context limit without freeing contexts, so that everything blocks and times out? How can we avoid these issues? What scroll size is recommended (100, 1000, or 10000) to minimize this?
Hello, and sorry for not answering earlier. This is not an issue with the NEST client - having that many open scroll contexts is indeed going to hurt Elasticsearch.
Are these two errors related? Why is the cluster hitting the context limit without freeing contexts, so that everything blocks and times out?
The scroll search context is only freed after the scroll timeout has elapsed (or when it is cleared explicitly); see Paginate search results | Elasticsearch Guide [8.15] | Elastic. This timeout is often set to 1m (one minute). If too many scroll contexts are kept open at the same time, your cluster can suffer.
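One way to keep contexts from piling up is to release each one as soon as you are done with it, instead of waiting for the timeout. A sketch using the clear scroll API (the scroll ID placeholder below stands in for the ID returned by your last scroll response):

```
DELETE /_search/scroll
{
  "scroll_id": "<scroll-id-from-the-last-response>"
}
```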
While Elasticsearch will be able to reject new scroll contexts (first error), the memory pressure means that other requests can get rejected (second error). In that sense, the errors are related.
How can we avoid these issues? What scroll size is recommended (100, 1000, or 10000) to minimize this?
You could look into why your cluster is opening 500 scroll contexts. That's a lot: do you know why you need that many, and why they all seem to be open at the same time?
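To see how many scroll contexts are currently open on each node, you can check the node stats (the `open_contexts` counter appears under each node's `indices.search` section):

```
GET /_nodes/stats/indices/search
```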
You could try to reduce the search.max_open_scroll_context limit if you'd like less impact on other requests.
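Since `search.max_open_scroll_context` is a dynamic cluster setting, it can be changed without a restart. A sketch lowering it (the value 200 is just an example; pick one that fits your workload):

```
PUT /_cluster/settings
{
  "persistent": {
    "search.max_open_scroll_context": 200
  }
}
```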
Regarding the scroll size, a larger size could mean you get through the results faster, and thus you keep the scroll context alive for less time. But larger batches take longer to process per page, which could require a longer scroll timeout.
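Putting this together on the NEST side, a minimal scroll loop might look like the sketch below. This is illustrative only: the `Event` type, index name, `Process` method, and the chosen size and keep-alive are assumptions, not your actual code.

```csharp
// Sketch of a NEST scroll loop (Event, "events", and Process are hypothetical).
// The keep-alive ("1m") only needs to cover the processing of one batch,
// and clearing the scroll at the end frees the context immediately
// instead of waiting for the timeout to expire.
var client = new ElasticClient();

var response = client.Search<Event>(s => s
    .Index("events")
    .Scroll("1m")   // keep-alive per batch, not for the whole scan
    .Size(1000));   // batch size: tune this, e.g. between 100 and 10000

while (response.Documents.Any())
{
    Process(response.Documents);  // your handling logic
    response = client.Scroll<Event>("1m", response.ScrollId);
}

client.ClearScroll(c => c.ScrollId(response.ScrollId));
```

A larger `Size` means fewer round trips and a shorter-lived context, at the cost of more memory per batch; the explicit `ClearScroll` is what keeps finished scrolls from counting against the 500-context limit.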