Hi Elasticsearch team,
Environment
- Elasticsearch OSS 6.7.2
- 3 Master Nodes
- 3 Ingest Nodes
- 9 Hot-data nodes
- 6 Warm-data nodes
My cluster is running, but its state is always yellow.
There are unassigned shards that never finish initializing.
I also noticed that the recoveries are almost always in the translog stage.
{
"cluster_name" : "elasticsearch-preprod",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 21,
"number_of_data_nodes" : 15,
"active_primary_shards" : 786,
"active_shards" : 1559,
"relocating_shards" : 0,
"initializing_shards" : 6,
"unassigned_shards" : 14,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 98.7333755541482
}
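For reference, the `active_shards_percent_as_number` above is just the active shards divided by the total of active, initializing, and unassigned shards. A quick sanity check in Python, using the numbers from the health output:

```python
# Recompute active_shards_percent_as_number from the health output above.
active = 1559
initializing = 6
unassigned = 14

percent = 100 * active / (active + initializing + unassigned)
print(percent)  # ~98.73, matching the reported value
```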
Below is the `_cat/recovery` output, sampled every 10 seconds.
The columns are: index, time, stage, source node, target node, files_percent, and translog_ops_percent.
a200035-syslog-2019.25 30.5s translog 10.49.112.16-hdata-node-0 10.49.113.233-hdata-node-0 100.0% 30.0%
a200338-syslog-2019.25 42.8s index 10.49.113.238-hdata-node-0 10.49.115.140-hdata-node-0 99.3% 0.0%
a202038-iis_w3c-tracs-2019.25 55.4s translog 10.49.113.238-hdata-node-0 10.49.117.39-hdata-node-0 100.0% 42.4%
a202939-syslog-2019.25 1.3m translog 10.49.113.233-hdata-node-0 10.49.112.16-hdata-node-0 100.0% 73.8%
a203893-cbe_event-2019.25 1.3m index 10.49.112.16-hdata-node-0 10.49.115.140-hdata-node-0 95.2% 0.0%
a204409-json-txappsrv-2019.25 2.6m translog 10.49.115.140-hdata-node-0 10.49.112.16-hdata-node-0 100.0% 52.9%
a204744-syslog-2019.25 18.8s index 10.49.115.140-hdata-node-0 10.49.113.238-hdata-node-0 93.9% 0.0%
----------------------------------------------------------------------------------------------------------------------
a200035-syslog-2019.25 44.6s translog 10.49.112.16-hdata-node-0 10.49.113.233-hdata-node-0 100.0% 44.5%
a200338-syslog-2019.25 56.9s translog 10.49.113.238-hdata-node-0 10.49.115.140-hdata-node-0 100.0% 6.3%
a202038-iis_w3c-tracs-2019.25 1.1m translog 10.49.113.238-hdata-node-0 10.49.117.39-hdata-node-0 100.0% 55.4%
a202939-syslog-2019.25 1.5m translog 10.49.113.233-hdata-node-0 10.49.112.16-hdata-node-0 100.0% 86.7%
a203893-cbe_event-2019.25 1.5m index 10.49.112.16-hdata-node-0 10.49.115.140-hdata-node-0 96.7% 0.0%
a204409-json-txappsrv-2019.25 2.8m translog 10.49.115.140-hdata-node-0 10.49.112.16-hdata-node-0 100.0% 59.2%
a204744-syslog-2019.25 32.9s translog 10.49.115.140-hdata-node-0 10.49.113.238-hdata-node-0 100.0% 2.6%
----------------------------------------------------------------------------------------------------------------------
a200035-syslog-2019.25 58.8s translog 10.49.112.16-hdata-node-0 10.49.113.233-hdata-node-0 100.0% 54.9%
a200338-syslog-2019.25 1.1m translog 10.49.113.238-hdata-node-0 10.49.115.140-hdata-node-0 100.0% 13.0%
a202038-iis_w3c-tracs-2019.25 1.3m translog 10.49.113.238-hdata-node-0 10.49.117.39-hdata-node-0 100.0% 67.2%
a202939-syslog-2019.25 1.8m translog 10.49.113.233-hdata-node-0 10.49.112.16-hdata-node-0 100.0% 92.8%
a203893-cbe_event-2019.25 1.8m index 10.49.112.16-hdata-node-0 10.49.115.140-hdata-node-0 97.6% 0.0%
a204409-json-txappsrv-2019.25 3m translog 10.49.115.140-hdata-node-0 10.49.112.16-hdata-node-0 100.0% 62.1%
a204744-syslog-2019.25 47.1s translog 10.49.115.140-hdata-node-0 10.49.113.238-hdata-node-0 100.0% 11.7%
----------------------------------------------------------------------------------------------------------------------
a200035-syslog-2019.25 1.2m translog 10.49.112.16-hdata-node-0 10.49.113.233-hdata-node-0 100.0% 67.3%
a200338-syslog-2019.25 1.4m translog 10.49.113.238-hdata-node-0 10.49.115.140-hdata-node-0 100.0% 21.9%
a202038-iis_w3c-tracs-2019.25 1.6m translog 10.49.113.238-hdata-node-0 10.49.117.39-hdata-node-0 100.0% 81.5%
a202939-syslog-2019.25 2m finalize 10.49.113.233-hdata-node-0 10.49.112.16-hdata-node-0 100.0% 100.0%
a203893-cbe_event-2019.25 2m index 10.49.112.16-hdata-node-0 10.49.115.140-hdata-node-0 98.6% 0.0%
a204409-json-txappsrv-2019.25 3.3m translog 10.49.115.140-hdata-node-0 10.49.112.16-hdata-node-0 100.0% 66.3%
a204744-syslog-2019.25 1m translog 10.49.115.140-hdata-node-0 10.49.113.238-hdata-node-0 100.0% 23.1%
----------------------------------------------------------------------------------------------------------------------
a200035-syslog-2019.25 1.4m translog 10.49.112.16-hdata-node-0 10.49.113.233-hdata-node-0 100.0% 82.0%
a200338-syslog-2019.25 1.6m translog 10.49.113.238-hdata-node-0 10.49.115.140-hdata-node-0 100.0% 31.6%
a202038-iis_w3c-tracs-2019.25 1.8m translog 10.49.113.238-hdata-node-0 10.49.117.39-hdata-node-0 100.0% 99.7%
a203893-cbe_event-2019.25 2.3m index 10.49.112.16-hdata-node-0 10.49.115.140-hdata-node-0 99.5% 0.0%
a204409-json-txappsrv-2019.25 3.5m translog 10.49.115.140-hdata-node-0 10.49.112.16-hdata-node-0 100.0% 72.1%
a204744-syslog-2019.25 1.2m translog 10.49.115.140-hdata-node-0 10.49.113.238-hdata-node-0 100.0% 34.7%
----------------------------------------------------------------------------------------------------------------------
a200035-syslog-2019.25 1.7m translog 10.49.112.16-hdata-node-0 10.49.113.233-hdata-node-0 100.0% 96.7%
a200338-syslog-2019.25 1.9m translog 10.49.113.238-hdata-node-0 10.49.115.140-hdata-node-0 100.0% 38.9%
a203893-cbe_event-2019.25 2.5m translog 10.49.112.16-hdata-node-0 10.49.115.140-hdata-node-0 100.0% 2.8%
a204409-json-txappsrv-2019.25 3.8m translog 10.49.115.140-hdata-node-0 10.49.112.16-hdata-node-0 100.0% 78.8%
a204744-syslog-2019.25 1.5m translog 10.49.115.140-hdata-node-0 10.49.113.238-hdata-node-0 100.0% 44.7%
----------------------------------------------------------------------------------------------------------------------
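For clarity, here is a small Python sketch (just an illustration of the column layout, not something running on the cluster) that parses rows in the format shown above:

```python
# Columns per row: index, time, stage, source node, target node,
# files_percent, translog_ops_percent.
def parse_recovery(lines):
    """Turn whitespace-separated _cat/recovery rows into dicts."""
    rows = []
    for line in lines:
        index, elapsed, stage, source, target, files_pct, translog_pct = line.split()
        rows.append({
            "index": index,
            "stage": stage,
            "files_percent": float(files_pct.rstrip("%")),
            "translog_ops_percent": float(translog_pct.rstrip("%")),
        })
    return rows

# Two rows copied from the first snapshot above:
rows = parse_recovery([
    "a200035-syslog-2019.25 30.5s translog 10.49.112.16-hdata-node-0 10.49.113.233-hdata-node-0 100.0% 30.0%",
    "a203893-cbe_event-2019.25 1.3m index 10.49.112.16-hdata-node-0 10.49.115.140-hdata-node-0 95.2% 0.0%",
])
for r in rows:
    print(r["index"], r["stage"], r["translog_ops_percent"])
```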
The number of shards on each hot data node:
10.49.112.16: 173
10.49.113.233: 174
10.49.113.238: 175
10.49.115.1: 174
10.49.115.140: 174
10.49.114.29: 174
10.49.117.39: 173
10.49.117.131: 175
10.49.117.62: 174
My cluster has been in this state for 7 hours. What should I do?
Regards,
Worapoj