Cluster health not reported accurately in ECE

mesiasc · November 7, 2019, 1:44pm

I have clusters whose health is degraded by indices in a read-only state. When I click through to the individual deployment UI for those clusters, the health reported there is often not the same.
On the ECE top-level deployments page and within the individual deployment UIs, taken separately, the health is consistent over time, so I don't think the status is fluctuating. The ECE top-level page seems to show a snapshot from some time in the past that bears little relation to the current status in Kibana.
Clicking through to Kibana, one cluster that is shown as Unhealthy at the top level and Healthy at the individual level has lifecycle errors against several indices.

For context, the read-only state was probably caused by running out of storage, which has since been addressed by deleting excess uncompressed log files. The following request was then run against the cluster in question:
PUT _all/_settings
{
  "index": {
    "blocks": {
      "read_only_allow_delete": "false"
    }
  }
}
to clear the read-only status. I believe there are no longer any read-only indices on this cluster.
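
One way to confirm this rather than assume it is to fetch the index settings (for example with GET _all/_settings) and look for any index where the block is still set. The following is a minimal sketch, assuming the response has the usual nested shape of index name, then "settings", "index", "blocks". The function name and the sample index names are hypothetical, and the sample data is made up for illustration, not taken from the cluster above.

```python
import json


def blocked_indices(settings_response):
    """Return the sorted names of indices whose read_only_allow_delete block is still on.

    settings_response is assumed to be the parsed JSON body of an index
    settings request, keyed by index name.
    """
    blocked = []
    for name, body in settings_response.items():
        blocks = body.get("settings", {}).get("index", {}).get("blocks", {})
        value = blocks.get("read_only_allow_delete")
        # Settings values usually come back as strings ("true" / "false").
        if str(value).lower() == "true":
            blocked.append(name)
    return sorted(blocked)


# Hypothetical sample response, for illustration only.
sample = json.loads("""
{
  "logs-2019.11.06": {
    "settings": {"index": {"blocks": {"read_only_allow_delete": "true"}}}
  },
  "logs-2019.11.07": {
    "settings": {"index": {"blocks": {"read_only_allow_delete": "false"}}}
  },
  "metrics-2019.11.07": {
    "settings": {"index": {"number_of_replicas": "1"}}
  }
}
""")

print(blocked_indices(sample))
```

An empty list would mean that no index in the response still carries the block.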

What is going on with cluster health reporting?

Alex_Piggott · November 7, 2019, 10:00pm

What criteria lead you to believe that there are no longer any read-only indices on the cluster? Did you check via the API?

(I think we get the info we use from _cluster/state)
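
As a rough illustration of checking via the API, index blocks can be read out of the cluster state (for example GET _cluster/state/blocks). This is only a sketch, assuming the response carries a "blocks" object with an "indices" map keyed by index name and then by block id. The function name and the sample data are hypothetical, and this is not necessarily how ECE itself derives the health it displays.

```python
def index_blocks(cluster_state):
    """Map each blocked index name to the sorted descriptions of its blocks.

    cluster_state is assumed to be the parsed JSON body of a cluster
    state request that includes the "blocks" section.
    """
    indices = cluster_state.get("blocks", {}).get("indices", {})
    return {
        name: sorted(block.get("description", "") for block in blocks.values())
        for name, blocks in indices.items()
    }


# Hypothetical sample response, for illustration only.
sample_state = {
    "cluster_name": "example",
    "blocks": {
        "indices": {
            "logs-2019.11.06": {
                "12": {
                    "description": "index read-only / allow delete (api)",
                    "retryable": False,
                    "levels": ["metadata_write", "write"],
                }
            }
        }
    },
}

for name, descriptions in index_blocks(sample_state).items():
    print(name, descriptions)
```

If the cluster state has no "blocks" section at all, the function returns an empty mapping, which would suggest no index-level blocks are in place.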

mesiasc · November 8, 2019, 9:52am

Alex,

Thanks for the reply. After freeing up space I used the ECE UI to run the PUT request above, and eventually the top-level UI showed cluster health as good, in line with the detail pages below it. I got the feeling there was some caching effect on the top-level page, as it was not updating in a timely way.

I also noticed that clusters that had been terminated then deleted were still showing at the top level, but on clicking through there would be an error message.

Perhaps the ECE UI relies on the clusters that are built by default, and those are also compromised when the disk thresholds are exceeded.

system · November 22, 2019, 9:52am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
