Hi, I had a power outage and the cluster was down for nearly 10 hours.
Now I'm seeing the following output:
http://serverip:9200/_cluster/health

| cluster_name | "cluster-name" |
|---|---|
| status | "yellow" |
| timed_out | false |
| number_of_nodes | 13 |
| number_of_data_nodes | 12 |
| active_primary_shards | 434 |
| active_shards | 716 |
| relocating_shards | 9 |
| initializing_shards | 0 |
| unassigned_shards | 152 |
| delayed_unassigned_shards | 0 |
| number_of_pending_tasks | 70 |
| number_of_in_flight_fetch | 0 |
| task_max_waiting_in_queue_millis | 167 |
| active_shards_percent_as_number | 82.48847926267281 |
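To find out why the 152 shards are unassigned, I was planning to query the allocation explain API (sketch below; `serverip` stands for any reachable node, as in the health call above). With no body, it reports on an arbitrary unassigned shard:

```shell
# Ask Elasticsearch why an unassigned shard cannot currently be allocated
curl -s "http://serverip:9200/_cluster/allocation/explain?pretty"
```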
Checking the logs, i have found these issues:
Caused by: org.elasticsearch.transport.RemoteTransportException: [hostname6][10.240.36.150:9300][internal:index/shard/recovery/start_recovery]
Caused by: org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [<transport_request>] would be [4077609546/3.7gb], which is larger than the limit of [4013975142/3.7gb], real usage: [4077577728/3.7gb], new bytes reserved: [31818/31kb], usages [request=0/0b, fielddata=1626172697/1.5gb, in_flight_requests=32648/31.8kb, accounting=776288601/740.3mb]
at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.checkParentLimit(HierarchyCircuitBreakerService.java:342) ~[elasticsearch-7.3.2.jar:7.3.2]
at org.elasticsearch.common.breaker.ChildMemoryCircuitBreaker.addEstimateBytesAndMaybeBreak(ChildMemoryCircuitBreaker.java:128) ~[elasticsearch-7.3.2.jar:7.3.2]
...
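Looking at the breaker usages, fielddata alone accounts for 1.5 GB of the ~3.7 GB heap limit, so I was thinking of clearing the fielddata cache to give the recoveries some headroom (a sketch, assuming default cache settings; the cache will repopulate as queries run):

```shell
# Clear the fielddata cache on all indices to free heap during recovery
curl -s -X POST "http://serverip:9200/_cache/clear?fielddata=true&pretty"
```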
[2019-11-05T15:15:57,947][WARN ][o.e.i.c.IndicesClusterStateService] [hostnameB01] [clustername_metrics-raw_2019.09.13][0] marking and sending shard failed due to [failed recovery]
org.elasticsearch.indices.recovery.RecoveryFailedException: [clustername_metrics-raw_2019.09.13][0]: Recovery failed from {hostnameb6}{-TNL0xnsTpuWDzooT9bJtw}{D5PdSuGhQfy3ELk2a7_DZg}{10.240.36.150}{10.240.36.150:9300}{di}{ml.machine_memory=8356679680, ml.max_open_jobs=20, datacenter=b, xpack.installed=true} into {hostnameb1}{qOzCQhsPSD-MjRCyt8oSew}{gbiKUpwKS3iGxPtcRJFh8A}{10.240.36.145}{10.240.36.145:9300}{dim}{ml.machine_memory=8356687872, xpack.installed=true, ml.max_open_jobs=20, datacenter=b}
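If those recoveries have failed enough times that the cluster gave up retrying them, I assume I would then need to ask it to retry the failed allocations once the memory pressure eases (is this the right call?):

```shell
# Retry shard allocations that previously hit the max retry limit
curl -s -X POST "http://serverip:9200/_cluster/reroute?retry_failed=true&pretty"
```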
Could you help me solve this?
Thanks, Rodrigo.