This has occurred randomly on 8 of my data nodes.
I still have no idea how to reproduce it or why it happens.
On one occasion, before the data node stopped responding to management requests entirely, I used the cat thread_pool API and, after waiting a long time for it to return, found that this node's management thread pool had more than 190,000 queued tasks, while all other thread pools looked normal.
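For reference, this is roughly the request I used to watch the queue (the host and HTTP port below are placeholders for one of the affected data nodes, not the actual values from my cluster):

curl -s "http://<datanode-host>:<http-port>/_cat/thread_pool/management?v&h=node_name,name,active,queue,rejected,completed"

On the stuck node this eventually came back showing a queue of well over 190,000 for the management pool, while the same request for the other pools (write, search, etc.) showed normal values.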
LOGS (log level: info)
Log from the master node:
[2022-09-15T01:58:00,001][INFO ][o.e.x.m.MlDailyMaintenanceService] [master-node] triggering scheduled [ML] maintenance tasks
[2022-09-15T01:58:00,005][INFO ][o.e.x.m.a.TransportDeleteExpiredDataAction] [master-node] Deleting expired data
[2022-09-15T01:58:00,018][INFO ][o.e.x.m.j.r.UnusedStatsRemover] [master-node] Successfully deleted [0] unused stats documents
[2022-09-15T01:58:00,019][INFO ][o.e.x.m.a.TransportDeleteExpiredDataAction] [master-node] Completed deletion of expired ML data
[2022-09-15T01:58:00,019][INFO ][o.e.x.m.MlDailyMaintenanceService] [master-node] Successfully completed [ML] maintenance task: triggerDeleteExpiredDataTask
[2022-09-15T03:52:38,351][ERROR][o.e.x.m.c.i.IndexRecoveryCollector] [master-node] collector [index_recovery] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T03:52:39,966][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [11.6s/11608ms] ago, timed out [1.6s/1601ms] ago, action [indices:monitor/recovery[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [861890090]
[2022-09-15T03:52:48,362][ERROR][o.e.x.m.c.i.IndexStatsCollector] [master-node] collector [index-stats] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T03:52:51,709][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [13.4s/13409ms] ago, timed out [3.4s/3403ms] ago, action [indices:monitor/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [861891006]
[2022-09-15T03:52:58,552][ERROR][o.e.x.m.c.c.ClusterStatsCollector] [master-node] collector [cluster_stats] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T03:53:03,215][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [14.6s/14611ms] ago, timed out [4.6s/4603ms] ago, action [cluster:monitor/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [861892078]
[2022-09-15T03:53:24,863][WARN ][o.e.c.InternalClusterInfoService] [master-node] failed to retrieve stats for node [TjAYjkLwSz6O64RgnuOTtQ]: [<datanode-ip>-hotData1][<datanode-ip>:9301][cluster:monitor/nodes/stats[n]] request_id [861894021] timed out after [15012ms]
[2022-09-15T03:53:24,873][WARN ][o.e.c.InternalClusterInfoService] [master-node] failed to retrieve shard stats from node [TjAYjkLwSz6O64RgnuOTtQ]: [<datanode-ip>-hotData1][<datanode-ip>:9301][indices:monitor/stats[n]] request_id [861894067] timed out after [15012ms]
[2022-09-15T03:53:24,980][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [15.2s/15212ms] ago, timed out [200ms/200ms] ago, action [cluster:monitor/nodes/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [861894021]
[2022-09-15T03:53:25,008][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [15.2s/15212ms] ago, timed out [200ms/200ms] ago, action [indices:monitor/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [861894067]
[2022-09-15T03:53:38,349][ERROR][o.e.x.m.c.i.IndexRecoveryCollector] [master-node] collector [index_recovery] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T03:53:43,511][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [15.2s/15212ms] ago, timed out [5.2s/5204ms] ago, action [indices:monitor/recovery[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [861895820]
[2022-09-15T03:53:48,360][ERROR][o.e.x.m.c.i.IndexStatsCollector] [master-node] collector [index-stats] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T03:53:55,240][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [16.8s/16813ms] ago, timed out [6.8s/6805ms] ago, action [indices:monitor/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [861896736]
[2022-09-15T03:53:58,413][ERROR][o.e.x.m.c.c.ClusterStatsCollector] [master-node] collector [cluster_stats] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T03:54:04,998][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [16.6s/16613ms] ago, timed out [6.6s/6606ms] ago, action [cluster:monitor/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [861897799]
[2022-09-15T03:54:09,892][WARN ][o.e.c.InternalClusterInfoService] [master-node] failed to retrieve stats for node [TjAYjkLwSz6O64RgnuOTtQ]: [<datanode-ip>-hotData1][<datanode-ip>:9301][cluster:monitor/nodes/stats[n]] request_id [861898356] timed out after [15012ms]
[2022-09-15T03:54:09,911][WARN ][o.e.c.InternalClusterInfoService] [master-node] failed to retrieve shard stats from node [TjAYjkLwSz6O64RgnuOTtQ]: [<datanode-ip>-hotData1][<datanode-ip>:9301][indices:monitor/stats[n]] request_id [861898423] timed out after [15012ms]
[2022-09-15T03:54:11,126][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [16.2s/16214ms] ago, timed out [1.2s/1202ms] ago, action [cluster:monitor/nodes/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [861898356]
[2022-09-15T03:54:11,171][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [16.4s/16414ms] ago, timed out [1.4s/1402ms] ago, action [indices:monitor/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [861898423]
[2022-09-15T03:54:38,348][ERROR][o.e.x.m.c.i.IndexRecoveryCollector] [master-node] collector [index_recovery] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T03:54:45,412][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [17.2s/17213ms] ago, timed out [7.2s/7206ms] ago, action [indices:monitor/recovery[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [861901548]
[2022-09-15T03:54:48,356][ERROR][o.e.x.m.c.i.IndexStatsCollector] [master-node] collector [index-stats] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
……
[2022-09-15T05:03:13,644][WARN ][o.e.c.InternalClusterInfoService] [master-node] failed to retrieve stats for node [TjAYjkLwSz6O64RgnuOTtQ]: [<datanode-ip>-hotData1][<datanode-ip>:9301][cluster:monitor/nodes/stats[n]] request_id [862291126] timed out after [15011ms]
[2022-09-15T05:03:13,654][WARN ][o.e.c.InternalClusterInfoService] [master-node] failed to retrieve shard stats from node [TjAYjkLwSz6O64RgnuOTtQ]: [<datanode-ip>-hotData1][<datanode-ip>:9301][indices:monitor/stats[n]] request_id [862291151] timed out after [15011ms]
[2022-09-15T05:03:27,880][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [9.8m/589656ms] ago, timed out [9.6m/579647ms] ago, action [indices:monitor/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [862237911]
[2022-09-15T05:03:38,383][ERROR][o.e.x.m.c.i.IndexRecoveryCollector] [master-node] collector [index_recovery] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T05:03:40,321][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [9.8m/591857ms] ago, timed out [9.6m/581849ms] ago, action [cluster:monitor/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [862238978]
[2022-09-15T05:03:48,394][ERROR][o.e.x.m.c.i.IndexStatsCollector] [master-node] collector [index-stats] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T05:03:52,902][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [9.9m/594857ms] ago, timed out [9.6m/579844ms] ago, action [cluster:monitor/nodes/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [862239931]
[2022-09-15T05:03:52,939][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [9.9m/594857ms] ago, timed out [9.6m/579844ms] ago, action [indices:monitor/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [862239956]
[2022-09-15T05:03:58,442][ERROR][o.e.x.m.c.c.ClusterStatsCollector] [master-node] collector [cluster_stats] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T05:03:58,688][WARN ][o.e.c.InternalClusterInfoService] [master-node] failed to retrieve stats for node [TjAYjkLwSz6O64RgnuOTtQ]: [<datanode-ip>-hotData1][<datanode-ip>:9301][cluster:monitor/nodes/stats[n]] request_id [862295206] timed out after [15010ms]
[2022-09-15T05:03:58,700][WARN ][o.e.c.InternalClusterInfoService] [master-node] failed to retrieve shard stats from node [TjAYjkLwSz6O64RgnuOTtQ]: [<datanode-ip>-hotData1][<datanode-ip>:9301][indices:monitor/stats[n]] request_id [862295249] timed out after [15010ms]
[2022-09-15T05:04:33,674][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [10m/605261ms] ago, timed out [9.9m/595253ms] ago, action [indices:monitor/recovery[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [862242643]
[2022-09-15T05:04:38,380][ERROR][o.e.x.m.c.i.IndexRecoveryCollector] [master-node] collector [index_recovery] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T05:04:43,733][WARN ][o.e.c.InternalClusterInfoService] [master-node] failed to retrieve stats for node [TjAYjkLwSz6O64RgnuOTtQ]: [<datanode-ip>-hotData1][<datanode-ip>:9301][cluster:monitor/nodes/stats[n]] request_id [862299514] timed out after [15012ms]
[2022-09-15T05:04:43,742][WARN ][o.e.c.InternalClusterInfoService] [master-node] failed to retrieve shard stats from node [TjAYjkLwSz6O64RgnuOTtQ]: [<datanode-ip>-hotData1][<datanode-ip>:9301][indices:monitor/stats[n]] request_id [862299551] timed out after [15012ms]
[2022-09-15T05:04:48,390][ERROR][o.e.x.m.c.i.IndexStatsCollector] [master-node] collector [index-stats] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T05:04:48,656][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [10.1m/610264ms] ago, timed out [10m/600256ms] ago, action [indices:monitor/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [862243559]
[2022-09-15T05:04:53,516][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [10.1m/610463ms] ago, timed out [9.9m/595452ms] ago, action [cluster:monitor/nodes/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [862244016]
[2022-09-15T05:04:53,529][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [10.1m/610463ms] ago, timed out [9.9m/595452ms] ago, action [indices:monitor/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [862244054]
[2022-09-15T05:04:58,445][ERROR][o.e.x.m.c.c.ClusterStatsCollector] [master-node] collector [cluster_stats] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T05:05:01,965][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [10.2m/613466ms] ago, timed out [10m/603458ms] ago, action [cluster:monitor/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [862244737]
[2022-09-15T05:05:28,771][WARN ][o.e.c.InternalClusterInfoService] [master-node] failed to retrieve stats for node [TjAYjkLwSz6O64RgnuOTtQ]: [<datanode-ip>-hotData1][<datanode-ip>:9301][cluster:monitor/nodes/stats[n]] request_id [862303821] timed out after [15011ms]
[2022-09-15T05:05:28,779][WARN ][o.e.c.InternalClusterInfoService] [master-node] failed to retrieve shard stats from node [TjAYjkLwSz6O64RgnuOTtQ]: [<datanode-ip>-hotData1][<datanode-ip>:9301][indices:monitor/stats[n]] request_id [862303851] timed out after [15011ms]
[2022-09-15T05:05:38,382][ERROR][o.e.x.m.c.i.IndexRecoveryCollector] [master-node] collector [index_recovery] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T05:05:48,394][ERROR][o.e.x.m.c.i.IndexStatsCollector] [master-node] collector [index-stats] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T05:05:54,910][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [10.4m/626672ms] ago, timed out [10.1m/611661ms] ago, action [cluster:monitor/nodes/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [862248240]
[2022-09-15T05:05:54,949][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [10.4m/626672ms] ago, timed out [10.1m/611661ms] ago, action [indices:monitor/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [862248301]
[2022-09-15T05:05:55,218][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [10.4m/626872ms] ago, timed out [10.2m/616866ms] ago, action [indices:monitor/recovery[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [862248401]
[2022-09-15T05:05:58,623][ERROR][o.e.x.m.c.c.ClusterStatsCollector] [master-node] collector [cluster_stats] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
……
[2022-09-15T05:31:06,373][WARN ][o.e.t.TransportService ] [master-node] Transport response handler not found of id [862350586]
[2022-09-15T05:31:07,488][WARN ][o.e.t.TransportService ] [master-node] Transport response handler not found of id [862350705]
[2022-09-15T05:31:07,525][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [17.6m/1058309ms] ago, timed out [17.3m/1043298ms] ago, action [indices:monitor/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [862350743]
[2022-09-15T05:31:22,312][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [17.7m/1063912ms] ago, timed out [17.5m/1053905ms] ago, action [indices:monitor/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [862351614]
[2022-09-15T05:31:36,417][WARN ][o.e.t.TransportService ] [master-node] Received response for a request that has timed out, sent [17.7m/1067715ms] ago, timed out [17.6m/1057907ms] ago, action [cluster:monitor/stats[n]], node [{<datanode-ip>-hotData1}{TjAYjkLwSz6O64RgnuOTtQ}{jMDOUcAzQxadkHYcq4re8w}{<datanode-ip>}{<datanode-ip>:9301}{cdfhilrstw}{ml.machine_memory=404122529792, ml.max_open_jobs=512, box_type=hot, xpack.installed=true, ml.max_jvm_size=32212254720, transform.node=true}], id [862352685]
[2022-09-15T05:31:38,392][ERROR][o.e.x.m.c.i.IndexRecoveryCollector] [master-node] collector [index_recovery] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T05:31:45,205][WARN ][o.e.c.InternalClusterInfoService] [master-node] failed to retrieve stats for node [TjAYjkLwSz6O64RgnuOTtQ]: [<datanode-ip>-hotData1][<datanode-ip>:9301][cluster:monitor/nodes/stats[n]] request_id [862453089] timed out after [15010ms]
[2022-09-15T05:31:45,211][WARN ][o.e.c.InternalClusterInfoService] [master-node] failed to retrieve shard stats from node [TjAYjkLwSz6O64RgnuOTtQ]: [<datanode-ip>-hotData1][<datanode-ip>:9301][indices:monitor/stats[n]] request_id [862453109] timed out after [15010ms]
[2022-09-15T05:31:48,402][ERROR][o.e.x.m.c.i.IndexStatsCollector] [master-node] collector [index-stats] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T05:31:58,459][ERROR][o.e.x.m.c.c.ClusterStatsCollector] [master-node] collector [cluster_stats] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T05:32:12,941][WARN ][o.e.t.TransportService ] [master-node] Transport response handler not found of id [862354981]
[2022-09-15T05:32:12,972][WARN ][o.e.t.TransportService ] [master-node] Transport response handler not found of id [862355043]
[2022-09-15T05:32:30,234][WARN ][o.e.c.InternalClusterInfoService] [master-node] failed to retrieve stats for node [TjAYjkLwSz6O64RgnuOTtQ]: [<datanode-ip>-hotData1][<datanode-ip>:9301][cluster:monitor/nodes/stats[n]] request_id [862457363] timed out after [15011ms]
[2022-09-15T05:32:30,251][WARN ][o.e.c.InternalClusterInfoService] [master-node] failed to retrieve shard stats from node [TjAYjkLwSz6O64RgnuOTtQ]: [<datanode-ip>-hotData1][<datanode-ip>:9301][indices:monitor/stats[n]] request_id [862457426] timed out after [15011ms]
[2022-09-15T05:32:33,168][WARN ][o.e.t.TransportService ] [master-node] Transport response handler not found of id [862356342]
[2022-09-15T05:32:38,397][ERROR][o.e.x.m.c.i.IndexRecoveryCollector] [master-node] collector [index_recovery] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T05:32:48,408][ERROR][o.e.x.m.c.i.IndexStatsCollector] [master-node] collector [index-stats] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T05:32:49,718][WARN ][o.e.t.TransportService ] [master-node] Transport response handler not found of id [862357258]
[2022-09-15T05:32:58,460][ERROR][o.e.x.m.c.c.ClusterStatsCollector] [master-node] collector [cluster_stats] timed out when collecting data: node [TjAYjkLwSz6O64RgnuOTtQ] did not respond within [10s]
[2022-09-15T05:33:03,197][WARN ][o.e.t.TransportService ] [master-node] Transport response handler not found of id [862358325]
[2022-09-15T05:33:15,284][WARN ][o.e.c.InternalClusterInfoService] [master-node] failed to retrieve stats for node [TjAYjkLwSz6O64RgnuOTtQ]: [<datanode-ip>-hotData1][<datanode-ip>:9301][cluster:monitor/nodes/stats[n]] request_id [862461759] timed out after [15010ms]
[2022-09-15T05:33:15,292][WARN ][o.e.c.InternalClusterInfoService] [master-node] failed to retrieve shard stats from node [TjAYjkLwSz6O64RgnuOTtQ]: [<datanode-ip>-hotData1][<datanode-ip>:9301][indices:monitor/stats[n]] request_id [862461785] timed out after [15010ms]
[2022-09-15T05:33:19,183][WARN ][o.e.t.TransportService ] [master-node] Transport response handler not found of id [862359370]
[2022-09-15T05:33:19,225][WARN ][o.e.t.TransportService ] [master-node] Transport response handler not found of id [862359403]
……