Hey,
we're having issues with search queries timing out on our 61-node ES cluster:
elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPSConnectionPool(host='XXXXXX', port=9200): Read timed out. (read timeout=300))
Debugging showed that some nodes have a high number of active (and rejected) search tasks.
$escurl -XGET "https://a6es-e.ng.seznam.cz:9200/_cat/thread_pool/*?v&h=node_name,name,active,rejected,completed" | sort -nk3 | awk '{if ($3 > 5) print $0}'
node_name name active rejected completed
a6es-e8-es1 search 23 87776 1965656928
a6es-e4-es0 search 28 11525 2027376804
a6es-e3-es0 search 32 364 122469486
I did some digging and changed refresh_interval on the indices from 1s to 60s - this helped a lot (since we have quite heavy writes) and the issue basically disappeared for a month.
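For reference, the change was roughly this (the index pattern here is just a placeholder, not our real index names):

$escurl -XPUT "https://a6es-e.ng.seznam.cz:9200/my-index-*/_settings" -H 'Content-Type: application/json' -d'
{
  "index": {
    "refresh_interval": "60s"
  }
}'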
Sadly, last week the issue started creeping up on us again, and I don't think a further increase of refresh_interval will help. As far as I can tell, the write load on the cluster hasn't changed in the last two weeks. I have added a few automated cron searches against the cluster, but they should be negligible compared to the rest of the manual and automated queries.
The cluster admins gave me access to the logs on one of the currently affected nodes, but without knowing what I should look for, it's a needle in a haystack.
Is there a way to figure out what those threads are doing (the queries, or at least the indices they are "working on")?
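For example, is the tasks API (or hot threads) the right direction here, or is there something better? Roughly what I had in mind:

# list running search tasks; with detailed=true the description should include the indices and query source
$escurl -XGET "https://a6es-e.ng.seznam.cz:9200/_tasks?actions=*search*&detailed=true&pretty"
# hot threads on one of the affected nodes (node name taken from the thread_pool output above)
$escurl -XGET "https://a6es-e.ng.seznam.cz:9200/_nodes/a6es-e8-es1/hot_threads"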
Anything that would point me in the right direction to solve this problem would be appreciated.
Thanks.