How does Elasticsearch use ThreadPool bulk queues in a cluster?


(Charles Lescot) #1

Hi,
I have a cluster of 2 nodes (version 1.7.2 running on Java 1.8.0_45).
I have changed the thread pool bulk queue size from 50 to 250.
Under heavy bulk indexing load, I have seen some "weird" behavior via the cat API:
the "bulk active" pools on both nodes are in use, but only one node's bulk queue reaches a high value (116, for example),
while the other node's queue stays flat (between 0 and 5).
Is this normal behavior?
Why does Elasticsearch preferentially fill one queue?

Best regards,

Charles.
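
For anyone who wants to make the same per-node comparison, here is a minimal sketch that parses the text output of the cat thread pool API and pulls out the bulk queue depth per host. It assumes the output was requested with a header row (`?v`) and contains columns named `host` and `bulk.queue`, which is what I would expect from a 1.x cluster; check the column names on your version. The sample text is hypothetical: only the value 116 comes from the post above.

```python
def parse_cat_thread_pool(text):
    """Parse whitespace-separated cat output (with a header row) into dicts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    # Each data row becomes a dict keyed by the header's column names.
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def bulk_queue_by_host(rows):
    """Map each host to its current bulk queue depth."""
    return {row["host"]: int(row["bulk.queue"]) for row in rows}


# Hypothetical sample output; host names and most values are made up.
SAMPLE = """\
host  bulk.active bulk.queue bulk.rejected
node1           4        116             0
node2           4          3             0
"""

if __name__ == "__main__":
    for host, depth in bulk_queue_by_host(parse_cat_thread_pool(SAMPLE)).items():
        print(host, depth)
```

Sampling this repeatedly during the load makes it easier to see whether the imbalance is persistent or only a momentary spike.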


(Jörg Prante) #2

Why did you increase the bulk queue from 50 to 250?

You cannot expect normal behavior if you tweak the bulk thread pool. I assume you know what you are doing.

Your bulk requests are composed in a way that puts a lot of work on that particular node. It is as simple as that.


(Charles Lescot) #3

Hi,
I bumped the thread pool bulk queue size to avoid bulk index rejections (although I have read https://www.elastic.co/guide/en/elasticsearch/guide/current/_monitoring_individual_nodes.html#_threadpool_section, which suggests another strategy that I will explore soon).
I have checked via netstat that my code calls both nodes.
I don't understand why the load seems to be distributed across both nodes (judging by the bulk active count from the cat API), while the queue grows on only one node.
The queue on the other node stays very low.

Charles.
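
As a rough illustration of an alternative to enlarging the queue, the sketch below retries only the rejected actions of a bulk request after a pause. This is an assumption about what the linked guide section recommends, not a quote from it. It also assumes that rejected items appear in the bulk response's `items` array with status 429, in the same order as the submitted actions. `send_bulk` is a hypothetical callable that you would implement with your own client; it is not a real library function.

```python
import time


def rejected_actions(actions, bulk_response):
    """Return the submitted actions whose response item has status 429."""
    retry = []
    for action, item in zip(actions, bulk_response.get("items", [])):
        # Each item is a single-key dict such as {"index": {...}}.
        result = next(iter(item.values()))
        if result.get("status") == 429:
            retry.append(action)
    return retry


def bulk_with_retry(actions, send_bulk, pause_seconds=3.0, max_attempts=5,
                    sleep=time.sleep):
    """Send actions, resending only rejected ones; return what is still rejected."""
    pending = list(actions)
    for attempt in range(max_attempts):
        response = send_bulk(pending)  # hypothetical: returns the parsed bulk response
        pending = rejected_actions(pending, response)
        if not pending:
            return []
        if attempt < max_attempts - 1:
            sleep(pause_seconds)  # give the node time to drain its queue
    return pending
```

The pause length and attempt limit here are arbitrary placeholders; they would need tuning against the actual indexing load.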

