Elasticsearch filter fails randomly when using multiple workers



I've just upgraded my Elastic Stack from 1.5 to 2.4 and ran into problems with the elasticsearch filter.
After some tests, I realized that the problem is linked to the number of workers: it only appears when using multiple workers (which is the default behaviour in 2.4).

The error is:
error=>#<Elasticsearch::Transport::Transport::Error: Cannot get new connection from pool.

During my tests, I always sent the same data to my ES cluster. The errors appear randomly and are not linked to any particular event or data.

I searched the plugin's code to isolate the cause, and the problem seems to occur when the round-robin selector picks a connection from the array of hosts (the select method of the RoundRobin class). I therefore tried specifying only one host, but the errors still happen.

Has anyone experienced the same kind of error, or does anyone have a clue about the origin of the problem? I'd like to either find a solution or give up on using multiple workers.

Thanks for reading



For those having the same problem, I found a temporary workaround.
I modified the elasticsearch client so that it uses the "Random" selector to pick between ES hosts instead of the default "RoundRobin" selector.
I'm not familiar with Ruby, but there seems to be a problem with the class ::Elasticsearch::Transport::Transport::Connections::Selector::RoundRobin when several workers use it at the same time.
I think the proper fix would be to correct this class.

To make this change, modify the line that creates the client as follows (the original line is shown commented out):

#@client = ::Elasticsearch::Client.new(hosts: hosts, transport_options: transport_options)
@client = ::Elasticsearch::Client.new(hosts: hosts, transport_options: transport_options, selector_class: ::Elasticsearch::Transport::Transport::Connections::Selector::Random)
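As for correcting the round-robin selection itself, here is a minimal sketch of what a thread-safe version could look like. It is only an illustration: SynchronizedRoundRobin is a hypothetical stand-alone class, not the library's RoundRobin selector, and the host names are made up. It also assumes the failure comes from several workers updating the selector's position at the same time, which I have not confirmed.

```ruby
# Hypothetical stand-alone illustration; this is NOT the library's
# Selector::RoundRobin class and does not plug into the client as-is.
class SynchronizedRoundRobin
  def initialize(connections)
    @connections = connections
    @index = -1
    @mutex = Mutex.new
  end

  # Advance the position and read the connection under one lock, so that
  # concurrent callers can never observe an out-of-range index.
  def select
    @mutex.synchronize do
      if @connections.empty?
        nil
      else
        @index = (@index + 1) % @connections.size
        @connections[@index]
      end
    end
  end
end

# Made-up host names, used only to exercise the selector.
hosts = %w[es1:9200 es2:9200 es3:9200]
selector = SynchronizedRoundRobin.new(hosts)

# Simulate 8 workers asking for a connection 1000 times each.
picks = Queue.new
workers = 8.times.map do
  Thread.new { 1000.times { picks << selector.select } }
end
workers.each(&:join)

results = []
results << picks.pop until picks.empty?

# Count how often each host was selected.
counts = Hash.new(0)
results.each { |host| counts[host] += 1 }
```

Because the increment and the array lookup happen inside the same lock, select never returns nil for a non-empty host list, and the per-host counts differ by at most one. Using this with the real client would require matching the selector interface that the library expects, which is not shown here.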
