Consequences of violating 'Blocking operation' assertion

lukas_vlcek · October 24, 2017, 12:01pm

In short: what are the risks of ignoring the "Blocking operation" assert?

There are some checks in the Elasticsearch code that verify an operation is not executed on a transport thread, for instance in the BaseFuture class. They only take effect when assertions are enabled (which happens when using ESIntegTestCase, for example). Typically, the symptom looks like this: https://github.com/elastic/elasticsearch/issues/17865
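To make the idea of such a check concrete, here is a minimal sketch of a guard that refuses to run on a networking thread by inspecting the current thread's name. It is not the actual Elasticsearch code: the class name, the `transport_worker` marker, and the message text are all assumptions chosen for illustration. It also throws `AssertionError` directly instead of using the `assert` keyword, so its behaviour does not depend on `-ea`/`-da`.

```java
// Hypothetical sketch of a "not on a network thread" guard; not Elasticsearch's real implementation.
class TransportThreadCheckSketch {

    // Assumed naming convention for networking threads (illustrative only).
    static final String NETWORK_THREAD_MARKER = "transport_worker";

    static boolean isNetworkThread(Thread thread) {
        return thread.getName().contains(NETWORK_THREAD_MARKER);
    }

    // Call this right before a blocking wait, e.g. before Future.get().
    static void ensureNotNetworkThread(String reason) {
        Thread current = Thread.currentThread();
        if (isNetworkThread(current)) {
            throw new AssertionError("Expected current thread [" + current.getName()
                    + "] to not be a network thread. Reason: [" + reason + "]");
        }
    }

    // Runs the guard on a freshly created thread with the given name and
    // reports whether the guard tripped.
    static boolean failsOnThreadNamed(String threadName) throws InterruptedException {
        boolean[] failed = {false};
        Thread thread = new Thread(() -> {
            try {
                ensureNotNetworkThread("Blocking operation");
            } catch (AssertionError e) {
                failed[0] = true;
            }
        }, threadName);
        thread.start();
        thread.join(); // join() makes the write to failed[0] visible here
        return failed[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(failsOnThreadNamed("my_node[transport_worker][T#1]"));
        System.out.println(failsOnThreadNamed("my_node[generic][T#1]"));
    }
}
```

A guard written with the `assert` keyword would behave the same way only when the JVM runs with assertions enabled, which is why the symptom tends to show up in test runs rather than in production.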

I have seen some 3rd-party plugins that can run into this when making blocking calls in a REST handler. For example, a plugin uses client() to query cluster state or index documents right in the REST action class (i.e. in a class inheriting from BaseRestHandler). Since this is "just" an assert, nothing forces plugin authors to fix the issue unless they want to implement integration tests (and do not want to disable assertions with -da).

My understanding is that this assert tells you that you are consuming resources from the generic thread pool, which is unbounded (at least in ES 2.x). So if you run blocking operations in this context, there is a risk of creating far too many threads, and nothing will stop you except a shortage of hardware resources, which is what you really do not want to happen.

Is my understanding correct? Are there any other risks? And finally, why is this an assert and not an Exception?

lukas_vlcek · October 31, 2017, 6:49am

Anyone?

jasontedor · November 1, 2017, 1:23am

If you run a blocking operation on a networking thread, that networking thread is tied up until the blocking operation returns. It's bad to block these networking threads since they are needed to handle responses and requests. Even worse: if all the networking threads are tied up waiting on blocking calls to complete, and those blocking calls are waiting on responses from other nodes, then there are no networking threads left to handle the responses, so the server is deadlocked.
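The starvation scenario described here can be reproduced in plain Java without any Elasticsearch code. The sketch below uses a single-threaded executor as a stand-in for the networking threads; the class and method names are hypothetical. The blocking variant waits on a "response" that can only be processed by the same, already occupied thread, so the wait can only end by timing out (without the timeout it would hang forever). The callback variant hands the work off and returns immediately, so the thread stays free to process the response.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: a one-thread pool plays the role of the networking threads.
class BlockingOnNetworkThreadSketch {

    // Returns true if the blocking wait timed out, i.e. the pool starved itself.
    static boolean blockingCallStarves(long timeoutMillis) throws Exception {
        ExecutorService network = Executors.newFixedThreadPool(1);
        try {
            Future<Boolean> request = network.submit(() -> {
                // The "response" can only be handled by the same pool,
                // whose only thread is busy running this very task.
                Future<String> response = network.submit(() -> "pong");
                try {
                    response.get(timeoutMillis, TimeUnit.MILLISECONDS); // blocking call
                    return false;
                } catch (TimeoutException e) {
                    return true; // nobody was left to process the response
                }
            });
            return request.get();
        } finally {
            network.shutdownNow();
        }
    }

    // Non-blocking variant: the network thread schedules the follow-up work and returns.
    static String callbackStyle() throws Exception {
        ExecutorService network = Executors.newFixedThreadPool(1);
        try {
            CompletableFuture<String> result = new CompletableFuture<>();
            network.submit(() -> {
                // No waiting here; the response handler completes the future later.
                network.submit(() -> result.complete("pong"));
            });
            // This wait happens on the caller's thread, not on the network thread.
            return result.get(5, TimeUnit.SECONDS);
        } finally {
            network.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("blocking variant starved: " + blockingCallStarves(200));
        System.out.println("callback variant result: " + callbackStyle());
    }
}
```

With more than one thread in the pool the blocking variant would appear to work until enough concurrent requests occupy every thread, which is what makes this kind of bug easy to miss outside of tests.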

This is an assert and not an exception because we want to catch this during development. Exceptions are too soft (for example, if we threw an exception and it only led to a shard being failed and being relocated elsewhere, the cluster could recover from this and tests might not fail). Assertions are hard since they go uncaught (or kill the node) and uncaught errors automatically fail tests.
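The difference in "hardness" follows from the Java type hierarchy: `AssertionError` extends `Error`, not `Exception`, so a typical `catch (Exception e)` recovery path does not intercept it. A minimal sketch, with hypothetical class and method names, and with the error thrown directly so the result does not depend on the `-ea` flag:

```java
// Hypothetical sketch: a recovery layer that catches Exception, as much server code does.
class AssertVersusExceptionSketch {

    // Simulates a component that recovers from ordinary exceptions.
    static String runGuarded(Runnable task) {
        try {
            task.run();
            return "completed";
        } catch (Exception e) {
            return "recovered from " + e.getClass().getSimpleName();
        }
    }

    // Wraps runGuarded only to observe what escapes it.
    static String outcome(Runnable task) {
        try {
            return runGuarded(task);
        } catch (AssertionError e) {
            return "escaped as AssertionError";
        }
    }

    public static void main(String[] args) {
        System.out.println(outcome(() -> { throw new IllegalStateException("blocking call"); }));
        System.out.println(outcome(() -> { throw new AssertionError("blocking call"); }));
    }
}
```

The first call is absorbed by the recovery path, while the second passes straight through it, which is the behaviour that makes a test run fail visibly.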

