Elasticsearch threadpool and index settings in ECK

zohaib · January 15, 2021, 2:30pm

We are using ECK operator 1.2 and Elasticsearch 7.4.0 for a 3 node cluster with the default settings on Azure Kubernetes Service. We need to update the following Elasticsearch configuration in our cluster:
threadpool.bulk.type: fixed
threadpool.bulk.size: 24
threadpool.bulk.queue_size: 1000
threadpool.search.type: fixed
threadpool.search.size: 24
threadpool.search.queue_size: 5

We have tried adding them under nodeSets.config:

- name: default
  config:
    # most Elasticsearch configuration parameters are possible to set, e.g:
    node.attr.attr_name: attr_value
    node.master: true
    node.data: true
    node.ingest: true
    node.ml: true
    # this allows ES to run on nodes even if their vm.max_map_count has not been increased, at a performance cost
    node.store.allow_mmap: false
    node.threadpool.bulk.type: fixed
    node.threadpool.bulk.size: 24
    node.threadpool.bulk.queue_size: 1000

    node.threadpool.search.type: fixed
    node.threadpool.search.size: 24
    node.threadpool.search.queue_size: 50

However, the Elasticsearch instance gets stuck in ApplyingChanges, and the Elasticsearch pods then start crashing with the following error:

"Suppressed: java.lang.IllegalArgumentException: unknown setting [node.threadpool.search.type] please check that any required plugins are installed, or check the breaking changes documentation for removed settings",

What's the best way to make these changes for an Elasticsearch cluster deployed using ECK on Kubernetes?

Thanks in advance.

angelo · January 15, 2021, 8:08pm

Further down in the stack trace you should find more details, which can often guide you to the correct setting to use, for example:
Suppressed: java.lang.IllegalArgumentException: unknown setting [node.threadpool.search.queue_size] did you mean [thread_pool.search.queue_size]?

For your noted settings and referencing the 7.4 docs, it should only require:

thread_pool.write.size: 24
thread_pool.write.queue_size: 1000
thread_pool.search.size: 24
thread_pool.search.queue_size: 50

Note that many of the sizes are calculated from the number of available processors that the node detects, so this may also require you to adjust the processors setting as noted in the documentation. I don't think we would generally recommend changing these settings, especially increasing the write/bulk thread pool if you are already encountering bulk rejections, so please be aware of that.
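
For reference, here is a minimal sketch of how the corrected keys might sit in the ECK manifest. The metadata.name value and count: 3 are assumptions for illustration (the thread only says it is a 3 node cluster); the version, the nodeSet name and the allow_mmap line come from the original post. The .type keys are left out, as in the list above.

```yaml
# Sketch only, not a tested manifest.
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: my-cluster        # hypothetical name, not from the thread
spec:
  version: 7.4.0
  nodeSets:
  - name: default
    count: 3              # assumed from "3 node cluster"
    config:
      node.store.allow_mmap: false
      # Keys use "thread_pool" (with an underscore) and no "node." prefix;
      # "write" is used in place of the old "bulk" pool name.
      thread_pool.write.size: 24
      thread_pool.write.queue_size: 1000
      thread_pool.search.size: 24
      thread_pool.search.queue_size: 50
```

If the size values are larger than what the detected processor count allows, the node may reject them at startup, which is where the processors setting mentioned above comes in.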

warkolm · January 15, 2021, 8:27pm

We definitely do not recommend increasing threadpool sizes. It just hides the underlying issue.

The topic "Any idea what these errors mean version 2.4.2" is an old but entirely relevant explanation as to why.

zohaib · January 15, 2021, 11:20pm

Got it. We are trying to find the root cause of the below errors:

invalid NEST response built from a unsuccessful () low level call on POST: /_bulk?refresh=wait_for # Invalid Bulk Items

---> System.IO.IOException: Unable to read data from the transport connection: The I/O operation has been aborted because of either a thread exit or an application request..
---> System.Net.Sockets.SocketException (995): The I/O operation has been aborted because of either a thread exit or an application request.

What are your recommendations?

warkolm · January 17, 2021, 10:28pm

What do your Elasticsearch logs show at the time of this error?

janos · January 18, 2021, 9:53am

Hello Guys,

Thanks for your help. I will try to explain briefly what is happening so that you can tell us whether our indexing strategy is overloading the Elasticsearch indexing service, or whether the cause is something else. The only strange thing is that it works when we host Elasticsearch on our Windows machine by running \bin\elasticsearch.bat, but it always fails inside the pod.

So, the current situation is that we have a bunch of documents with metadata. Each metadata field is a document in the index with a relation field to the parent (document). When we process these documents in parallel, the following happens:

Document 1
     - Request 1:  Index Document Head
     - Request 2:  Index Document Metadata Fields (Bulk Request)
     - Request 3:  Index Document Text Extract

Document 2
     - Request 1:  Index Document Head
     - Request 2:  Index Document Metadata Fields (Bulk Request)
     - Request 3:  Index Document Text Extract

So, if we process 100 documents in parallel batches, that can mean 100 / batch size requests (e.g. with a batch size of 4, 25 requests) writing to the same index at the same time. Requests 1, 2 and 3 (which are also separate methods in the code) run sequentially, each awaiting the previous one, so you can treat them as one.

The question is whether this can be the cause of the issue and, if so, whether we should implement a different indexing strategy that works with larger, more composite batches per request.

As a second alternative, we could save documents and metadata to persistent storage in parallel but do the indexing sequentially, in a dedicated thread that continuously takes available items from a queue, because we want to make documents available for search immediately without having to wait for the last document to be saved.

Update
It seems that the team has managed to fix this issue. The background job calling the microservice endpoint to save and index documents frequently times out.

system · February 15, 2021, 9:53am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
