No responses to bulk requests

Hi, as the topic title says, I'm having trouble with bulk requests (and also with FollowersChecker and LeaderChecker messages).
A normal bulk trace looks like this:

[2021-01-13T08:30:44,163][TRACE][o.e.h.HttpTracer         ] [h1-es03] [3104064][null][POST][/_bulk?timeout=1m] received request from [Netty4HttpChannel{localAddress=/h1-es03ip:9200, remoteAddress=/clientip:51520}]
[2021-01-13T08:30:44,266][TRACE][o.e.h.HttpTracer         ] [h1-es03] [3104063][null][OK][application/json; charset=UTF-8][23437] sent response to [Netty4HttpChannel{localAddress=/h1-es03ip:9200, remoteAddress=/clientip:35924}] success [true]

But when the cluster is under high load, sometimes the response to a bulk request is never created or sent:

[2021-01-13T08:30:45,244][TRACE][o.e.h.HttpTracer         ] [h1-es03] [3104069][null][POST][/_bulk?timeout=1m] received request from [Netty4HttpChannel{localAddress=/h1-es03ip:9200, remoteAddress=/clientip:35924}]
and there is no response.
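
To find every request that never got a response, rather than spotting them by eye, the trace lines can be paired by request id. The following is only a rough sketch: it assumes the first bracketed number after the node name is the HttpTracer request id, and that a request and its response share that id. The function name and the log file path passed on the command line are my own placeholders.

```python
import re
import sys

# Assumption: the first bracketed number after the node name is the request id,
# shared by the "received request" line and its "sent response" line.
RECEIVED = re.compile(
    r"\[o\.e\.h\.HttpTracer\s*\] \[[^\]]+\] \[(\d+)\].*received request from"
)
SENT = re.compile(
    r"\[o\.e\.h\.HttpTracer\s*\] \[[^\]]+\] \[(\d+)\].*sent response to"
)


def unanswered_requests(lines):
    """Return {request_id: received_line} for requests with no response line."""
    pending = {}
    for line in lines:
        match = RECEIVED.search(line)
        if match:
            pending[match.group(1)] = line.rstrip("\n")
            continue
        match = SENT.search(line)
        if match:
            # A response was logged for this id, so it is no longer pending.
            pending.pop(match.group(1), None)
    return pending


if __name__ == "__main__":
    # Hypothetical usage: python unanswered.py /path/to/elasticsearch.log
    with open(sys.argv[1], encoding="utf-8") as log_file:
        for request_id, line in unanswered_requests(log_file).items():
            print(request_id, line)
```

Requests received just before the end of the log excerpt will also show up here, because their response may simply fall outside the captured window.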

Dump opened in Wireshark

Problems with the check messages. Log from the master node, es01:


[2021-01-11T11:36:44,623][TRACE][o.e.c.c.FollowersChecker ] [h1-es01] handleWakeUp: checking {h1-es03}{Qshtg7-TQIyxeiccpkmlIA}{irRkTH7XQt63qEIGc-SWjA}{<es03_ip>}{<es03_ip>:9300}{dimr} with FollowerCheckRequest{term=129, sender=
{h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}}
[2021-01-11T11:36:44,625][TRACE][o.e.c.c.FollowersChecker ] [h1-es01] FollowerChecker{discoveryNode={h1-es03}{Qshtg7-TQIyxeiccpkmlIA}{irRkTH7XQt63qEIGc-SWjA}{<es03_ip>}{<es03_ip>:9300}{dimr}, failureCountSinceLastSuccess=0, [cl
uster.fault_detection.follower_check.retry_count]=3} check successful
[2021-01-11T11:36:45,274][TRACE][o.e.c.c.LeaderChecker    ] [h1-es01] handling LeaderCheckRequest{sender={h1-es03}{Qshtg7-TQIyxeiccpkmlIA}{irRkTH7XQt63qEIGc-SWjA}{<es03_ip>}{<es03_ip>:9300}{dimr}}
[2021-01-11T11:36:45,541][TRACE][o.e.c.c.LeaderChecker    ] [h1-es01] handling LeaderCheckRequest{sender={h1-es02}{qgmMV2UbT-ScN9uRr6YM8g}{1Oc8tIBFR428oBjV_uLjHw}{<es02_ip>}{<es02_ip>:9300}{dimr}}
[2021-01-11T11:36:45,616][TRACE][o.e.c.c.FollowersChecker ] [h1-es01] handleWakeUp: checking {h1-es02}{qgmMV2UbT-ScN9uRr6YM8g}{1Oc8tIBFR428oBjV_uLjHw}{<es02_ip>}{<es02_ip>:9300}{dimr} with FollowerCheckRequest{term=129, sender=
{h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}}
[2021-01-11T11:36:45,617][TRACE][o.e.c.c.FollowersChecker ] [h1-es01] FollowerChecker{discoveryNode={h1-es02}{qgmMV2UbT-ScN9uRr6YM8g}{1Oc8tIBFR428oBjV_uLjHw}{<es02_ip>}{<es02_ip>:9300}{dimr}, failureCountSinceLastSuccess=0, [cl
uster.fault_detection.follower_check.retry_count]=3} check successful
[2021-01-11T11:36:45,625][TRACE][o.e.c.c.FollowersChecker ] [h1-es01] handleWakeUp: checking {h1-es03}{Qshtg7-TQIyxeiccpkmlIA}{irRkTH7XQt63qEIGc-SWjA}{<es03_ip>}{<es03_ip>:9300}{dimr} with FollowerCheckRequest{term=129, sender=
{h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}}
[2021-01-11T11:36:45,626][TRACE][o.e.c.c.FollowersChecker ] [h1-es01] FollowerChecker{discoveryNode={h1-es03}{Qshtg7-TQIyxeiccpkmlIA}{irRkTH7XQt63qEIGc-SWjA}{<es03_ip>}{<es03_ip>:9300}{dimr}, failureCountSinceLastSuccess=0, [cl
uster.fault_detection.follower_check.retry_count]=3} check successful
[2021-01-11T11:36:46,543][TRACE][o.e.c.c.LeaderChecker    ] [h1-es01] handling LeaderCheckRequest{sender={h1-es02}{qgmMV2UbT-ScN9uRr6YM8g}{1Oc8tIBFR428oBjV_uLjHw}{<es02_ip>}{<es02_ip>:9300}{dimr}}
[2021-01-11T11:36:46,617][TRACE][o.e.c.c.FollowersChecker ] [h1-es01] handleWakeUp: checking {h1-es02}{qgmMV2UbT-ScN9uRr6YM8g}{1Oc8tIBFR428oBjV_uLjHw}{<es02_ip>}{<es02_ip>:9300}{dimr} with FollowerCheckRequest{term=129, sender=
{h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}}
[2021-01-11T11:36:46,619][TRACE][o.e.c.c.FollowersChecker ] [h1-es01] FollowerChecker{discoveryNode={h1-es02}{qgmMV2UbT-ScN9uRr6YM8g}{1Oc8tIBFR428oBjV_uLjHw}{<es02_ip>}{<es02_ip>:9300}{dimr}, failureCountSinceLastSuccess=0, [cl
uster.fault_detection.follower_check.retry_count]=3} check successful
[2021-01-11T11:36:46,627][TRACE][o.e.c.c.FollowersChecker ] [h1-es01] handleWakeUp: checking {h1-es03}{Qshtg7-TQIyxeiccpkmlIA}{irRkTH7XQt63qEIGc-SWjA}{<es03_ip>}{<es03_ip>:9300}{dimr} with FollowerCheckRequest{term=129, sender=
{h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}}
[2021-01-11T11:36:46,630][TRACE][o.e.c.c.FollowersChecker ] [h1-es01] FollowerChecker{discoveryNode={h1-es03}{Qshtg7-TQIyxeiccpkmlIA}{irRkTH7XQt63qEIGc-SWjA}{<es03_ip>}{<es03_ip>:9300}{dimr}, failureCountSinceLastSuccess=0, [cl
uster.fault_detection.follower_check.retry_count]=3} check successful
[2021-01-11T11:36:47,546][TRACE][o.e.c.c.LeaderChecker    ] [h1-es01] handling LeaderCheckRequest{sender={h1-es02}{qgmMV2UbT-ScN9uRr6YM8g}{1Oc8tIBFR428oBjV_uLjHw}{<es02_ip>}{<es02_ip>:9300}{dimr}}
[2021-01-11T11:36:47,620][TRACE][o.e.c.c.FollowersChecker ] [h1-es01] handleWakeUp: checking {h1-es02}{qgmMV2UbT-ScN9uRr6YM8g}{1Oc8tIBFR428oBjV_uLjHw}{<es02_ip>}{<es02_ip>:9300}{dimr} with FollowerCheckRequest{term=129, sender=
{h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}}
[2021-01-11T11:36:47,622][TRACE][o.e.c.c.FollowersChecker ] [h1-es01] FollowerChecker{discoveryNode={h1-es02}{qgmMV2UbT-ScN9uRr6YM8g}{1Oc8tIBFR428oBjV_uLjHw}{<es02_ip>}{<es02_ip>:9300}{dimr}, failureCountSinceLastSuccess=0, [cl
uster.fault_detection.follower_check.retry_count]=3} check successful
[2021-01-11T11:36:47,630][TRACE][o.e.c.c.FollowersChecker ] [h1-es01] handleWakeUp: checking {h1-es03}{Qshtg7-TQIyxeiccpkmlIA}{irRkTH7XQt63qEIGc-SWjA}{<es03_ip>}{<es03_ip>:9300}{dimr} with FollowerCheckRequest{term=129, sender=
{h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}}
[2021-01-11T11:36:47,633][TRACE][o.e.c.c.FollowersChecker ] [h1-es01] FollowerChecker{discoveryNode={h1-es03}{Qshtg7-TQIyxeiccpkmlIA}{irRkTH7XQt63qEIGc-SWjA}{<es03_ip>}{<es03_ip>:9300}{dimr}, failureCountSinceLastSuccess=0, [cl
uster.fault_detection.follower_check.retry_count]=3} check successful
[2021-01-11T11:36:48,220][TRACE][o.e.c.NodeConnectionsService] [h1-es01] connectDisconnectedTargets: {{h1-es03}{Qshtg7-TQIyxeiccpkmlIA}{irRkTH7XQt63qEIGc-SWjA}{<es03_ip>}{<es03_ip>:9300}{dimr}=ConnectionTarget{discoveryNode={h1
-es03}{Qshtg7-TQIyxeiccpkmlIA}{irRkTH7XQt63qEIGc-SWjA}{<es03_ip>}{<es03_ip>:9300}{dimr}, activityType=IDLE}, {h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}=ConnectionTarget{
discoveryNode={h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}, activityType=IDLE}, {h1-es02}{qgmMV2UbT-ScN9uRr6YM8g}{1Oc8tIBFR428oBjV_uLjHw}{<es02_ip>}{<es02_ip>:9300}{dimr}=
ConnectionTarget{discoveryNode={h1-es02}{qgmMV2UbT-ScN9uRr6YM8g}{1Oc8tIBFR428oBjV_uLjHw}{<es02_ip>}{<es02_ip>:9300}{dimr}, activityType=IDLE}}
[2021-01-11T11:36:48,548][TRACE][o.e.c.c.LeaderChecker    ] [h1-es01] handling LeaderCheckRequest{sender={h1-es02}{qgmMV2UbT-ScN9uRr6YM8g}{1Oc8tIBFR428oBjV_uLjHw}{<es02_ip>}{<es02_ip>:9300}{dimr}}

Log from the follower node, es03:

[2021-01-11T11:36:51,640][TRACE][o.e.c.c.FollowersChecker ] [h1-es03] responding to FollowerCheckRequest{term=129, sender={h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}} on fast path
[2021-01-11T11:36:52,642][TRACE][o.e.c.c.FollowersChecker ] [h1-es03] responding to FollowerCheckRequest{term=129, sender={h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}} on fast path
[2021-01-11T11:36:53,644][TRACE][o.e.c.c.FollowersChecker ] [h1-es03] responding to FollowerCheckRequest{term=129, sender={h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}} on fast path
[2021-01-11T11:36:54,646][TRACE][o.e.c.c.FollowersChecker ] [h1-es03] responding to FollowerCheckRequest{term=129, sender={h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}} on fast path
[2021-01-11T11:36:55,274][DEBUG][o.e.c.c.LeaderChecker    ] [h1-es03] 1 consecutive failures (limit [cluster.fault_detection.leader_check.retry_count] is 3) with leader [{h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{192.168.5
7.101}{<es01_ip>:9300}{dimr}]
org.elasticsearch.transport.ReceiveTimeoutTransportException: [h1-es01][<es01_ip>:9300][internal:coordination/fault_detection/leader_check] request_id [117011842] timed out after [10006ms]
        at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:1074) [elasticsearch-7.9.1.jar:7.9.1]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:651) [elasticsearch-7.9.1.jar:7.9.1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
[2021-01-11T11:36:55,276][TRACE][o.e.c.c.LeaderChecker    ] [h1-es03] scheduling next check of {h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr} for [cluster.fault_detection.leader_check
.interval] = 1s
[2021-01-11T11:36:55,648][TRACE][o.e.c.c.FollowersChecker ] [h1-es03] responding to FollowerCheckRequest{term=129, sender={h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}} on fast path
[2021-01-11T11:36:56,276][TRACE][o.e.c.c.LeaderChecker    ] [h1-es03] checking {h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr} with [cluster.fault_detection.leader_check.timeout] = 10s
[2021-01-11T11:36:56,650][TRACE][o.e.c.c.FollowersChecker ] [h1-es03] responding to FollowerCheckRequest{term=129, sender={h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}} on fast path
[2021-01-11T11:36:57,267][TRACE][o.e.c.NodeConnectionsService] [h1-es03] connectDisconnectedTargets: {{h1-es03}{Qshtg7-TQIyxeiccpkmlIA}{irRkTH7XQt63qEIGc-SWjA}{<es03_ip>}{<es03_ip>:9300}{dimr}=ConnectionTarget{discoveryNode={h1
-es03}{Qshtg7-TQIyxeiccpkmlIA}{irRkTH7XQt63qEIGc-SWjA}{<es03_ip>}{<es03_ip>:9300}{dimr}, activityType=IDLE}, {h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}=ConnectionTarget{
discoveryNode={h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}, activityType=IDLE}, {h1-es02}{qgmMV2UbT-ScN9uRr6YM8g}{1Oc8tIBFR428oBjV_uLjHw}{<es02_ip>}{<es02_ip>:9300}{dimr}=
ConnectionTarget{discoveryNode={h1-es02}{qgmMV2UbT-ScN9uRr6YM8g}{1Oc8tIBFR428oBjV_uLjHw}{<es02_ip>}{<es02_ip>:9300}{dimr}, activityType=IDLE}}
[2021-01-11T11:36:57,652][TRACE][o.e.c.c.FollowersChecker ] [h1-es03] responding to FollowerCheckRequest{term=129, sender={h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}} on fast path
[2021-01-11T11:36:58,654][TRACE][o.e.c.c.FollowersChecker ] [h1-es03] responding to FollowerCheckRequest{term=129, sender={h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}} on fast path
[2021-01-11T11:36:59,656][TRACE][o.e.c.c.FollowersChecker ] [h1-es03] responding to FollowerCheckRequest{term=129, sender={h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}} on fast path
[2021-01-11T11:37:00,657][TRACE][o.e.c.c.FollowersChecker ] [h1-es03] responding to FollowerCheckRequest{term=129, sender={h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}} on fast path
[2021-01-11T11:37:01,659][TRACE][o.e.c.c.FollowersChecker ] [h1-es03] responding to FollowerCheckRequest{term=129, sender={h1-es01}{MT3BSgtaQBWux8BJDBSsHg}{5OJxyuZrR7iXpeL4OqyDiQ}{<es01_ip>}{<es01_ip>:9300}{dimr}} on fast path

Why might this be happening?
