Elasticsearch ingest nodes restarting with "java.lang.OutOfMemoryError: Java heap space"

Elastic Version: 6.5.4

Our ES cluster has 16 ingest nodes, each configured with -Xms8g -Xmx8g. Around 120 Fluentd instances push data to the cluster.

The ingest nodes keep getting restarted with java.lang.OutOfMemoryError: Java heap space:

{"type":"log","host":"apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2021-02-18T02:06:05.757Z","logger":"o.e.m.j.JvmGcMonitorService","timezone":"UTC","marker":"[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8] ","log":"[gc][1244670] overhead, spent [1.9m] collecting in the last [1.9m]"}
{"type":"log","host":"apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2021-02-18T02:06:37.985Z","logger":"o.e.m.j.JvmGcMonitorService","timezone":"UTC","marker":"[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8] ","log":"[gc][old][1244671][21569] duration [13s], collections [1]/[13s], total [13s]/[1h], memory [7.9gb]->[7.9gb]/[7.9gb], all_pools {[young] [133.1mb]->[133.1mb]/[133.1mb]}{[survivor] [16.5mb]->[16.5mb]/[16.6mb]}{[old] [7.8gb]->[7.8gb]/[7.8gb]}"}
{"type":"log","host":"apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2021-02-18T02:07:10.053Z","logger":"o.e.m.j.JvmGcMonitorService","timezone":"UTC","marker":"[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8] ","log":"[gc][1244671] overhead, spent [13s] collecting in the last [13s]"}
{"type":"log","host":"apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2021-02-18T02:08:14.418Z","logger":"o.e.d.z.ZenDiscovery","timezone":"UTC","marker":"[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8] ","log":"not enough master nodes discovered during pinging (found [[]], but needed [1]), pinging again"}
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid12.hprof ...
{"type":"log","host":"apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2021-02-18T02:08:40.213Z","logger":"o.e.d.z.UnicastZenPing","timezone":"UTC","marker":"[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8] ","log":"failed to send ping to [{apaas-belk-elasticsearch-master-59bf85b856-6pmb8}{CGnn671kS9i2LGUqn7S38g}{Kl7y_1B7SS68y6JfWDsAaQ}{172.16.146.72}{172.16.146.72:9300}]"}
org.elasticsearch.transport.ReceiveTimeoutTransportException: [apaas-belk-elasticsearch-master-59bf85b856-6pmb8][172.16.146.72:9300][internal:discovery/zen/unicast] request_id [1247437] timed out after [122227ms]
	at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:1038) [elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:624) [elasticsearch-6.5.4.jar:6.5.4]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_191]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_191]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
{"type":"log","host":"apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2021-02-18T02:11:09.329Z","logger":"o.e.t.TransportService","timezone":"UTC","marker":"[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8] ","log":"Transport response handler not found of id [799190]"}
{"type":"log","host":"apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2021-02-18T02:11:22.288Z","logger":"o.e.m.j.JvmGcMonitorService","timezone":"UTC","marker":"[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8] ","log":"[gc][old][1244672][21576] duration [1.9m], collections [7]/[1.6m], total [1.9m]/[1h], memory [7.9gb]->[7.9gb]/[7.9gb], all_pools {[young] [133.1mb]->[133.1mb]/[133.1mb]}{[survivor] [16.5mb]->[16.6mb]/[16.6mb]}{[old] [7.8gb]->[7.8gb]/[7.8gb]}"}
{"type":"log","host":"apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2021-02-18T02:11:22.288Z","logger":"o.e.m.j.JvmGcMonitorService","timezone":"UTC","marker":"[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8] ","log":"[gc][1244672] overhead, spent [1.9m] collecting in the last [1.6m]"}
Heap dump file created [10860628498 bytes in 102.668 secs]
{"type":"log","host":"apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2021-02-18T02:18:08.456Z","logger":"o.e.d.z.UnicastZenPing","timezone":"UTC","marker":"[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8] ","log":"failed to send ping to [{apaas-belk-elasticsearch-master-59bf85b856-6pmb8}{CGnn671kS9i2LGUqn7S38g}{Kl7y_1B7SS68y6JfWDsAaQ}{172.16.146.72}{172.16.146.72:9300}]"}
org.elasticsearch.transport.ReceiveTimeoutTransportException: [apaas-belk-elasticsearch-master-59bf85b856-6pmb8][172.16.146.72:9300][internal:discovery/zen/unicast] request_id [1247439] timed out after [348447ms]
	at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:1038) [elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:624) [elasticsearch-6.5.4.jar:6.5.4]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_191]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_191]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
{"type":"log","host":"apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2021-02-18T02:20:56.705Z","logger":"o.e.d.z.ZenDiscovery","timezone":"UTC","marker":"[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8] ","log":"not enough master nodes discovered during pinging (found [[]], but needed [1]), pinging again"}
{"type":"log","host":"apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2021-02-18T02:17:16.939Z","logger":"o.e.d.z.UnicastZenPing","timezone":"UTC","marker":"[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8] ","log":"failed to send ping to [{apaas-belk-elasticsearch-master-59bf85b856-6pmb8}{CGnn671kS9i2LGUqn7S38g}{Kl7y_1B7SS68y6JfWDsAaQ}{172.16.146.72}{172.16.146.72:9300}]"}
org.elasticsearch.transport.ReceiveTimeoutTransportException: [apaas-belk-elasticsearch-master-59bf85b856-6pmb8][172.16.146.72:9300][internal:discovery/zen/unicast] request_id [1247438] timed out after [484549ms]
	at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:1038) [elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:624) [elasticsearch-6.5.4.jar:6.5.4]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_191]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_191]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
{"type":"log","host":"apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2021-02-18T02:21:16.729Z","logger":"o.e.m.j.JvmGcMonitorService","timezone":"UTC","marker":"[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8] ","log":"[gc][old][1244673][21613] duration [8.9m], collections [37]/[9.1m], total [8.9m]/[1.2h], memory [7.9gb]->[7.9gb]/[7.9gb], all_pools {[young] [133.1mb]->[133.1mb]/[133.1mb]}{[survivor] [16.6mb]->[16.6mb]/[16.6mb]}{[old] [7.8gb]->[7.8gb]/[7.8gb]}"}
{"type":"log","host":"apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2021-02-18T02:21:16.730Z","logger":"o.e.m.j.JvmGcMonitorService","timezone":"UTC","marker":"[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8] ","log":"[gc][1244673] overhead, spent [8.9m] collecting in the last [9.1m]"}
{"type":"log","host":"apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2021-02-18T02:22:47.280Z","logger":"o.e.m.j.JvmGcMonitorService","timezone":"UTC","marker":"[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8] ","log":"[gc][old][1244674][21632] duration [4.6m], collections [19]/[5.7m], total [4.6m]/[1.3h], memory [7.9gb]->[7.9gb]/[7.9gb], all_pools {[young] [133.1mb]->[133.1mb]/[133.1mb]}{[survivor] [16.6mb]->[16.6mb]/[16.6mb]}{[old] [7.8gb]->[7.8gb]/[7.8gb]}"}
{"type":"log","host":"apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2021-02-18T02:23:58.034Z","logger":"o.e.m.j.JvmGcMonitorService","timezone":"UTC","marker":"[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8] ","log":"[gc][1244674] overhead, spent [4.6m] collecting in the last [5.7m]"}

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "elasticsearch[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8][[unicast_connect]][T#39211]"

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "elasticsearch[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8][management][T#7]"
{"type":"log","host":"apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2021-02-18T03:15:26.980Z","logger":"o.e.m.j.JvmGcMonitorService","timezone":"UTC","marker":"[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8] ","log":"[gc][old][1244676][21854] duration [52.3m], collections [222]/[52m], total [52.3m]/[2.1h], memory [7.9gb]->[7.9gb]/[7.9gb], all_pools {[young] [133.1mb]->[133.1mb]/[133.1mb]}{[survivor] [16.6mb]->[16.6mb]/[16.6mb]}{[old] [7.8gb]->[7.8gb]/[7.8gb]}"}
{"type":"log","host":"apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2021-02-18T03:16:43.966Z","logger":"o.e.t.TransportService","timezone":"UTC","marker":"[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8] ","log":"Transport response handler not found of id [799215]"}
{"type":"log","host":"apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2021-02-18T03:18:16.975Z","logger":"o.e.m.j.JvmGcMonitorService","timezone":"UTC","marker":"[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8] ","log":"[gc][1244676] overhead, spent [52.3m] collecting in the last [52m]"}
{"type":"log","host":"apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2021-02-18T02:56:10.911Z","logger":"o.e.d.z.UnicastZenPing","timezone":"UTC","marker":"[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8] ","log":"failed to resolve host [apaas-belk-elasticsearch-discovery]"}
java.lang.OutOfMemoryError: Java heap space

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "elasticsearch[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8][[unicast_connect]][T#39209]"
{"type":"log","host":"apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8","level":"WARN","systemid":"e7c84e34d38a49e1ae639a5dab455af5","system":"BELK","time": "2021-02-18T03:19:57.055Z","logger":"o.e.t.TransportService","timezone":"UTC","marker":"[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8] ","log":"Transport response handler not found of id [799255]"}

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "elasticsearch[apaas-belk-elasticsearch-client-75b6f4d58d-ncwr8][[unicast_connect]][T#39212]"

Why is "java.lang.OutOfMemoryError: Java heap space" seen instead of a circuit breaker exception? The breaker should trip before the heap is exhausted, and the node should not restart.

What is the output of:

GET /
GET /_cat/nodes?v
GET /_cat/health?v
GET /_cat/indices?v

If some outputs are too big, please share them on gist.github.com and link them here.

That's way too old. Many things have been improved since then. At the very least upgrade to 6.8, but better, switch to 7.11!

Yes, I understand the upgrade recommendation. But I would also like to understand the circuit breaker behavior available in the 6.5.4 release: why was it not triggered?
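(For context, the configured breaker limits and per-node trip counts can be inspected with the standard 6.x APIs, for example:)

```
GET /_nodes/stats/breaker
GET /_cluster/settings?include_defaults=true&filter_path=defaults.indices.breaker*
```

The first call shows each breaker's limit, current estimate, and how many times it has tripped on each node.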

My cluster is in a yellow state. I hope that alone cannot cause an ingest node to run out of heap memory.

Please find the requested stats below.

_cat/nodes?v

ip             heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
172.16.63.176            59          56  15    1.45    1.80     1.95 i         -      apaas-belk-elasticsearch-client-75b6f4d58d-hjmh9
172.16.60.150             2          52  11    4.81    4.36     4.35 mi        *      apaas-belk-elasticsearch-master-6b66589c76-rlfjd
172.16.215.43             9          58  13    1.69    1.86     2.20 di        -      apaas-belk-elasticsearch-data-3
172.16.203.249           73          52  14    1.85    2.29     2.56 di        -      apaas-belk-elasticsearch-data-0
172.16.13.114            17          55  14    1.70    1.78     1.98 i         -      apaas-belk-elasticsearch-client-75b6f4d58d-2k8j4
172.16.194.10            10          53  15    2.13    2.19     2.34 i         -      apaas-belk-elasticsearch-client-75b6f4d58d-zpf7r
172.16.119.37             9          51  22    2.67    3.23     3.60 i         -      apaas-belk-elasticsearch-client-75b6f4d58d-lxkv5
172.16.103.185           13          53  16    1.67    2.19     2.37 i         -      apaas-belk-elasticsearch-client-75b6f4d58d-9bjbb
172.16.85.210             3          35  13    1.26    1.72     1.90 mi        -      apaas-belk-elasticsearch-master-6b66589c76-prdc7
172.16.111.245           16          59  16    2.19    2.50     2.66 i         -      apaas-belk-elasticsearch-client-75b6f4d58d-vfskc
172.16.24.20             68          90  13    1.90    1.94     1.87 di        -      apaas-belk-elasticsearch-data-2
172.16.38.90              2          67  17    2.02    2.35     2.42 mi        -      apaas-belk-elasticsearch-master-6b66589c76-v6wdh
172.16.206.10            19          50  15    2.43    2.51     2.62 i         -      apaas-belk-elasticsearch-client-75b6f4d58d-9drcf
172.16.216.47            69          72  23    2.13    2.66     2.97 i         -      apaas-belk-elasticsearch-client-75b6f4d58d-8sr6v
172.16.38.68             17          67  17    2.02    2.35     2.42 i         -      apaas-belk-elasticsearch-client-75b6f4d58d-88k2h
172.16.24.28             34          90  13    1.90    1.94     1.87 i         -      apaas-belk-elasticsearch-client-75b6f4d58d-kxlh7
172.16.244.127           43          66  16    2.92    4.29     4.97 i         -      apaas-belk-elasticsearch-client-75b6f4d58d-s7fz8
172.16.38.82             47          67  14    2.02    2.35     2.42 di        -      apaas-belk-elasticsearch-data-4
172.16.185.227           71          51  19    2.47    2.68     2.95 i         -      apaas-belk-elasticsearch-client-75b6f4d58d-jtpd8
172.16.224.113           50          69  11    1.99    1.83     1.81 di        -      apaas-belk-elasticsearch-data-1

_cluster/health

{
   "cluster_name":"default-apaas",
   "status":"yellow",
   "timed_out":false,
   "number_of_nodes":20,
   "number_of_data_nodes":5,
   "active_primary_shards":291,
   "active_shards":291,
   "relocating_shards":0,
   "initializing_shards":0,
   "unassigned_shards":291,
   "delayed_unassigned_shards":0,
   "number_of_pending_tasks":0,
   "number_of_in_flight_fetch":0,
   "task_max_waiting_in_queue_millis":0,
   "active_shards_percent_as_number":50.0
}

_cat/indices

yellow open rook-ceph-legacy-2021.03.01        EmONWKuNQFy5gjmw1rIF7g 5 1    124396 0   51.2mb   51.2mb
yellow open default-log-2021.02.23             CFg63ItsQSeD73f9wu1HkQ 5 1      3111 0    2.2mb    2.2mb
yellow open ncms-legacy-2021.02.23             MXbFJq41RXabDVQpAiS-pQ 5 1         5 0   48.1kb   48.1kb
yellow open audit-keycloak-2021.02.27          KLKZ1q0tSN2HFbAqpAtTWw 5 1        93 0  598.4kb  598.4kb
yellow open default--2021.02.24                3AfQruBtSJ61s7-U_GndIg 5 1        47 0    264kb    264kb
yellow open default-alarm-2021.02.28           z8FG-n_tSOGIBw5ZNV2QuA 5 1     32649 0   10.7mb   10.7mb
yellow open rook-ceph-legacy-2021.02.24        C35q2GQLTSSsRTHV2vQZhg 5 1     99679 0   40.1mb   40.1mb
yellow open audit-keycloak-2021.02.25          bkKYsTE_Tga_AuRH3VJZcw 5 1      6001 0    2.2mb    2.2mb
yellow open default-counter-2021.02.27         Iz5ef_e4Q1ytGsaKc2OJsg 5 1   2977193 0    456mb    456mb
yellow open default-log-2021.02.26             JrNxlA8RRpWQYkRrsVwjNw 5 1  22916818 0      3gb      3gb
yellow open audit-keycloak-2021.03.01          nGfeo4oXSZKkBgzo2Ai70Q 5 1      3191 0    1.6mb    1.6mb
yellow open ncms-legacy-2021.02.24             YncAZ0VETM-o8kmJc7cV9Q 5 1         2 0     20kb     20kb
yellow open default-legacy-2021.03.02          7I47MwsURGSVIQdyGFyujg 5 1    258564 0   56.8mb   56.8mb
yellow open default-legacy-2021.03.01          DtM2dci4Q-SgFXwYx7zj9g 5 1   1673339 0  409.4mb  409.4mb
yellow open default-legacy-2021.02.26          wBTMVlg_R5OnRX7dzWPU1w 5 1    321289 0   61.4mb   61.4mb
yellow open audit-keycloak-2021.03.02          9Tzv_AYCR5SIB2fSZGpI-w 5 1      1793 0    1.7mb    1.7mb
yellow open default-alarm-2021.02.27           rcgmMtaqRpC_xrSdsPk7sA 5 1      6004 0    3.4mb    3.4mb
yellow open default-log-2021.03.02             suP3X5yAT9-X_ReNf7E70Q 5 1  37309828 0    5.4gb    5.4gb
yellow open default-alarm-2021.03.01           9MRAR_8AT6qsRJZW8vGZIA 5 1      2207 0    1.5mb    1.5mb
yellow open audit-keycloak-2021.02.24          MFy24lEmTr-d8NodlkTqTg 5 1     14690 0    4.4mb    4.4mb
yellow open default-legacy-2021.02.25          dFrqtx7bR0mKls-Bwk2-mA 5 1  16180335 0    3.2gb    3.2gb
yellow open default-alarm-2021.02.26           pCl9VrJ5QwiXn6UIwCdV0g 5 1      1666 0    1.4mb    1.4mb
yellow open default-alarm-2021.02.24           yyKXzkg-SYyTlXaOVvp2TQ 5 1     70948 0   24.2mb   24.2mb
yellow open default-counter-2021.03.02         nK7SVRkkSqabXG9qtOsAjw 5 1    592816 0  115.8mb  115.8mb
yellow open default-legacy-2021.02.24          KCPykydHTZCYHG5hwN_NVA 5 1  20161204 0    3.5gb    3.5gb
yellow open default-counter-2021.02.28         Qfg3sxL3QgySnqeQMNanDw 5 1   4013326 0  767.2mb  767.2mb
yellow open default-counter-2021.03.01         Fa9OOCjGSmy8Btf82gNUHg 5 1   1184835 0  238.6mb  238.6mb
yellow open default-log-2021.02.25             mgSEXemTT1q6VvmV9Prwfw 5 1 441739341 0   60.4gb   60.4gb
yellow open ${namespace}-${type}-%y.%m.%d      1VGNaqnLRRKAQWpd79Jy6g 5 1         2 0   20.7kb   20.7kb
yellow open ncms-legacy-2021.02.26             FjwbcLgxSN60DnnTnxF11A 5 1        12 0  114.3kb  114.3kb
yellow open ncms-legacy-2021.03.02             poSHQe1DTne7IgQw5YQO7A 5 1         2 0     20kb     20kb
yellow open default-log-2021.02.24             cM4PSyXyQdS5LpDoyCQ7rA 5 1 834338864 0  112.9gb  112.9gb
yellow open rook-ceph-legacy-2021.02.26        vtY_nswzQPS4jum4BX-Eow 5 1       807 0  644.9kb  644.9kb
yellow open rook-ceph-legacy-2021.02.25        WsTME1FyTMiKdRR_8I8p-Q 5 1     35310 0   15.2mb   15.2mb
yellow open rook-ceph-system-legacy-2021.02.24 WG7v4GY3To64Lirz0cRMHA 5 1        96 0  297.2kb  297.2kb
yellow open ncms-legacy-2021.03.01             stRIrFxxRHuPVAZsUbBFJg 5 1         4 0   38.7kb   38.7kb
yellow open default-legacy-2021.02.23          esT3AEXbRjuOIMZNvO-bqw 5 1      4997 0    3.2mb    3.2mb
yellow open ncms-legacy-2021.02.28             RZuo21HSQL2BIjqFKhmc2Q 5 1         1 0   10.6kb   10.6kb
yellow open default--2021.02.25                NQDOY3P8T26PQZKnC9DJnQ 5 1         4 0   20.5kb   20.5kb
yellow open rook-ceph-legacy-2021.02.23        B4mJFHkjRnaydRn-BfwFPg 5 1     52776 0   25.9mb   25.9mb
yellow open default-log-2021.02.27             bs3BMbGZRwaIXq7ThCPhWw 5 1  95330460 0   12.5gb   12.5gb
yellow open default-alarm-2021.03.02           M5dRFotoRL6Qcy2ZrnOORw 5 1      3798 0    2.4mb    2.4mb
yellow open audit-keycloak-2021.02.23          LzI-n-uYScm-9aHVev61PQ 5 1      1889 0      1mb      1mb
yellow open default-legacy-2021.02.28          4vKVDgBrT0uMCoKSm85ndw 5 1   4766181 0 1007.6mb 1007.6mb
yellow open default-counter-2021.02.24         SuVkiZsTT5GOh8ShXdY6aw 5 1  29503806 0    5.3gb    5.3gb
yellow open ncms-legacy-2021.02.27             MXqudaeSTWWjjBQv_-ORqw 5 1         4 0   38.7kb   38.7kb
yellow open default--2021.03.02                AUqX0BPESXqL9Ex0I4C3TA 5 1        13 0   53.8kb   53.8kb
yellow open default-counter-2021.02.25         8XlkA8MoSyCoV7tgz2speQ 5 1   9653953 0    1.8gb    1.8gb
yellow open rook-ceph-legacy-2021.03.02        MOpUd7JTTc6scf8llDdNMA 5 1        93 0  369.3kb  369.3kb
yellow open default-alarm-2021.02.25           IfkL9_7SRUKd2sVTCerUIQ 5 1     31264 0   12.3mb   12.3mb
yellow open audit-keycloak-2021.02.26          v2FxYFmiRziRzSfhnXYrTQ 5 1         9 0  156.9kb  156.9kb
yellow open default-log-2021.02.28             iMP42oeXTr-2E4BjTSXzcA 5 1  55503416 0    8.1gb    8.1gb
yellow open default-log-2021.03.01             jCd8KahoR4SSLcbCbNtlUA 5 1  48938825 0    7.2gb    7.2gb
yellow open default-legacy-2021.02.27          6jR7cBIoSJC953BMEQ9mQQ 5 1   1294856 0  364.2mb  364.2mb
yellow open default-counter-2021.02.26         3QsLfoKeSPi6YeXOiFpYqQ 5 1    165246 0   26.8mb   26.8mb
yellow open rook-ceph-legacy-2021.02.28        F_d_3KwzQ-WKxkJ4r7jQgw 5 1      2345 0    1.3mb    1.3mb
yellow open .kibana_1                          DQNga-jTSUWvmYwjEerezQ 1 1         1 0    3.7kb    3.7kb
yellow open audit-keycloak-2021.02.28          Njjr80BvQGGvpS1uCtBR6Q 5 1      3311 0    1.6mb    1.6mb
yellow open rook-ceph-legacy-2021.02.27        0FHUKlfKQyqVWoNNhiKpCw 5 1    101071 0   47.1mb   47.1mb

It sounds like you have a lot of shards, and for most of the indices that's useless as they are so small.

The only indices I'd probably keep with multiple shards are:

  • default-log-2021.02.25
  • default-log-2021.02.24

For the others, I'd reduce everything to one shard at most. I might even use weekly or monthly indices as they are so small.
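One way to do that for new daily indices is an index template matching those name patterns (the template name and order value below are just examples; existing indices could additionally be shrunk with the _shrink API):

```
PUT _template/single-shard-logs
{
  "index_patterns": ["default-*", "ncms-legacy-*", "rook-ceph-*", "audit-keycloak-*"],
  "order": 10,
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 1
  }
}
```

Indices created after this get one primary shard; the order just needs to be higher than whatever template currently sets 5 shards.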

Not sure if it's still the case, but you have 291 unassigned shards, which explains the yellow status.

Not everything is "protected", I think, in this old version. That's another reason upgrading is important: you can benefit from the latest techniques.
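For what it's worth, in 6.x the parent breaker only accounts for memory that the child breakers explicitly reserve; the relevant defaults look like this (shown here for illustration, as they would appear in elasticsearch.yml):

```
# 6.x circuit breaker defaults (illustrative)
indices.breaker.total.limit: 70%       # parent breaker, sums tracked reservations only
indices.breaker.request.limit: 60%
indices.breaker.fielddata.limit: 60%
```

Allocations that no breaker tracks can still fill the heap, which is why an OutOfMemoryError can appear without a CircuitBreakingException. The real-memory parent breaker (indices.breaker.total.use_real_memory), which checks actual heap usage, only arrived in 7.0.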

As the ingest nodes are configured with -Xms8g -Xmx8g, please help me understand how many connections and what input data rate they can handle.
I know this question is very vague, but please share any benchmarks you have for ingest nodes.

I'm afraid that the only answer can be "it depends". :grin:

You need to test that yourself. You can use Rally to create your own benchmarks.
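As a sketch, a Rally run against an existing cluster looks something like this (the target host is a placeholder; with recent Rally versions the subcommand is `race`):

```
pip install esrally
esrally race --track=http_logs \
  --target-hosts=your-es-host:9200 \
  --pipeline=benchmark-only
```

The benchmark-only pipeline benchmarks a cluster you already run instead of provisioning one, and http_logs is one of the standard bundled tracks that resembles a logging workload.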

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.