ECK Autoscaler Object Race Condition

One of our customers uses helm charts to deploy elasticsearch on ECK which has been working correctly. They recently added autoscaler to the helm chart and they get inconsistent results. Sometimes the deployment works, other times it appears the autoscaler object fails and prevents everything from starting up. In these instances the entire deployment appears to "hang" with only the log shown below available.

{"log.level":"info","@timestamp":"2023-07-18T19:11:23.518Z","log.logger":"elasticsearch-controller","message":"NodeSet managed by the autoscaling controller but not found in status","service.version":"2.8.0+3940cf4d","service.type":"eck","ecs.version":"1.4.0","iteration":"179","namespace":"dataplane-ek","es_name":"dataplane-ek","nodeset":"master"}

Is this a known issue or are we making a mistake of some kind? I appreciate your help.

In theory, this log is harmless (hence the info level) and should disappear by itself quite quickly. This is to avoid situations where some pods have been manually deleted or replaced by an external event from the operator point of view.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.