Failed to execute watch - Timeout waiting for task

I am noticing random watch timeouts on my cluster, which produce the following traceback:

[2021-07-31T05:08:11,951][DEBUG][o.e.x.w.e.ExecutionService] failed to execute watch [<INSERT RANDOM WATCHER HERE>]
org.elasticsearch.ElasticsearchTimeoutException: java.util.concurrent.TimeoutException: Timeout waiting for task.
	at org.elasticsearch.common.util.concurrent.FutureUtils.get( ~[elasticsearch-7.10.1.jar:7.10.1]
	at ~[elasticsearch-7.10.1.jar:7.10.1]
	at ~[elasticsearch-7.10.1.jar:7.10.1]
	at org.elasticsearch.xpack.watcher.execution.ExecutionService.updateWatchStatus( [x-pack-watcher-7.10.1.jar:7.10.1]
	at org.elasticsearch.xpack.watcher.execution.ExecutionService.execute( [x-pack-watcher-7.10.1.jar:7.10.1]
	at org.elasticsearch.xpack.watcher.execution.ExecutionService.lambda$executeAsync$5( [x-pack-watcher-7.10.1.jar:7.10.1]
	at org.elasticsearch.xpack.watcher.execution.ExecutionService$ [x-pack-watcher-7.10.1.jar:7.10.1]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ [elasticsearch-7.10.1.jar:7.10.1]
	at java.util.concurrent.ThreadPoolExecutor.runWorker( [?:?]
	at java.util.concurrent.ThreadPoolExecutor$ [?:?]
	at [?:?]
Caused by: java.util.concurrent.TimeoutException: Timeout waiting for task.
	at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.get( ~[elasticsearch-7.10.1.jar:7.10.1]
	at org.elasticsearch.common.util.concurrent.BaseFuture.get( ~[elasticsearch-7.10.1.jar:7.10.1]
	at org.elasticsearch.common.util.concurrent.FutureUtils.get( ~[elasticsearch-7.10.1.jar:7.10.1]
	... 10 more

I think the key part is that it's failing on:

org.elasticsearch.xpack.watcher.execution.ExecutionService.updateWatchStatus( [x-pack-watcher-7.10.1.jar:7.10.1]

Which in the source code equates to:


I assume it is trying to update the watch's document in the .watches index and the write is taking a long time, but I am not 100% sure. Has anyone seen this error before and, if so, found a way to resolve it? The watches are running on cool nodes; the cluster is pretty big, but load doesn't seem to get too high on those nodes.
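One way to sanity-check that assumption (just standard cluster APIs, not something from the traceback) would be to look at whether writes are queueing on those nodes and whether Watcher itself is backing up, e.g.:

```
GET _cat/thread_pool/write?v&h=node_name,name,active,queue,rejected
GET _watcher/stats?metric=queued_watches
```

If the write thread pool shows a deep queue or rejections on the cool nodes while watches are queued, that would point at slow index writes for the .watches status update rather than Watcher itself.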

Just leaving this here in case it helps someone out in the future: this seems most likely due to the nodes being too busy to execute the watches in a timely manner. Throughout the day there is a fair amount of rollover occurring from the warm -> cool nodes. After adjusting some of the allocation and recovery settings to be more conservative (specifically cluster.routing.allocation.cluster_concurrent_rebalance, cluster.routing.allocation.node_concurrent_recoveries, cluster.routing.allocation.node_initial_primaries_recoveries, and indices.recovery.max_bytes_per_sec), the failures occur far less frequently, if at all.
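For reference, the change was applied via the cluster settings API along these lines (the exact values below are illustrative, not the ones from my cluster; tune them for your own hardware and shard sizes):

```
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.cluster_concurrent_rebalance": 1,
    "cluster.routing.allocation.node_concurrent_recoveries": 1,
    "cluster.routing.allocation.node_initial_primaries_recoveries": 2,
    "indices.recovery.max_bytes_per_sec": "20mb"
  }
}
```

The idea is simply to throttle how many shard relocations/recoveries run at once (and how much bandwidth they consume), so the rollover traffic stops starving the watch executions on the cool nodes.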

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.