I'm deploying Elasticsearch 7.17.4 into Kubernetes (Rancher) and have noticed that Elasticsearch doesn't form a cluster; it partitions into 3 separate master nodes.
How do I increase the level of debugging on the masters? All the docs seem to be wrong.
Secondly, how can I verify that my bare-metal cluster isn't the issue, given that if I deploy the same Helm chart in EKS, everything works like a charm?
Is there anything special I'd have to do to a bare-metal cluster?
So there's an issue: the logs only show one node joining a cluster, essentially itself. All nodes behave like this. It seems that they cannot resolve the respective node names and assume single-node discovery.
Yet when I use gethostbyname to verify whether resolution works, it appears to be fine. Explicitly setting the discovery mode makes no difference.
Is there a difference between the way the RPM and the non-RPM binary work? My image uses the RPM instead of the compressed tar archive. I'm deploying 7.17.4.
Unfortunately, due to company rules I can't send you the data you require. What I can tell you is that the deployment is a StatefulSet in k8s and that the Kubernetes cluster is bare metal running Rancher.
So my question is: given this isn't on a cloud provider, how does the discovery seeding work, as there is no API as far as I'm aware to leverage? I'm using the Elasticsearch Helm chart for 7.17.x.
Below is a section of the rendered chart related to discovery.
It looks like you set discovery.seed_hosts: elasticsearch-master-headless, which means Elasticsearch will do a lookup for this name and use all the addresses in the response for discovery.
The behaviour in an AWS EKS deployment is somewhat different, in that the cluster is formed properly. Presumably in the case of AWS it takes advantage of an API; in the case of a bare-metal Rancher cluster, I'm guessing this is going to behave in a slightly different way?
Elasticsearch doesn't do anything different: it does a lookup for the name(s) you configure and uses all the addresses in the response. The specific library function it calls for this lookup is getaddrinfo(), which can be configured to behave differently in different environments (usually it uses /etc/hosts and DNS, but many other options are available). You'll need to ask a local expert for the details of how name lookup works in your environment. Sorry, that's not something I can help with here, as it's not really anything to do with Elasticsearch.
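To see what such a lookup returns from inside a pod, a minimal Python sketch along these lines may help. It goes through the same getaddrinfo() resolver path and lists every distinct address. The helper name resolve_all is hypothetical, and 9300 is only assumed here as the default transport port.

```python
from __future__ import annotations

import socket


def resolve_all(host: str, port: int = 9300) -> list[str]:
    """Return every distinct address that getaddrinfo() reports for host."""
    # Restricting to TCP avoids duplicate entries for other socket types.
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})
```

Calling resolve_all("elasticsearch-master-headless") from a shell in each of the three pods should list one address per master pod if the headless service resolves as intended. Seeing only a single address, or an error, would point at name resolution in the environment rather than at Elasticsearch.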
I guess the follow-up question is: given my cluster is bare metal, should I be configuring my chart differently in terms of discovery?
Secondly, what else can I do from an Elasticsearch perspective to further debug this? The logs only show that a single-node cluster has been formed; they make no mention of any other nodes. So do I have to conclude that I need to treat a bare-metal deployment in a different way?
Well, given it's a k8s cluster, that would be CoreDNS. Is there any other debug output available to me which I can turn on to get a better idea of what's going on? The logs are OK-ish but are definitely light on the discovery process.
Yes, that clearly works in the context of a cluster built in a non-Kubernetes environment. I'm specifically talking about a Kubernetes environment with ephemeral nodes. I presume from the short replies that Elasticsearch on Kubernetes, let alone bare-metal k8s, isn't something that a lot of people know a great deal about?
And that's the problem. It's a very simple question: given I have ephemeral containers in k8s, how can I debug a discovery problem with the instructions from your docs, which are clearly aimed at a non-k8s deployment?
Second question: how can I increase the debug levels in the logs, such that I can see what happens during discovery?
Unfortunately the cloud option is not a viable option for me.
I don't think you have a discovery problem, because ...
... there is no need to adjust any logging levels to diagnose discovery problems in 7.17. If you were having a discovery problem, the logs would already be full of debugging information about it.
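For completeness, should more detail ever be wanted, logger levels can be raised at runtime through the cluster settings API. The sketch below is only an illustration: the helper names are hypothetical, it assumes the node is reachable over plain HTTP without authentication, and the two logger names are the discovery and cluster coordination packages as I understand them in 7.x.

```python
from __future__ import annotations

import json
import urllib.request


def logger_settings(level: str | None = "DEBUG") -> dict:
    """Build a cluster settings body that sets discovery-related loggers."""
    # Passing level=None serialises to JSON null, which resets the loggers.
    return {
        "persistent": {
            "logger.org.elasticsearch.discovery": level,
            "logger.org.elasticsearch.cluster.coordination": level,
        }
    }


def apply_logger_settings(base_url: str, level: str | None = "DEBUG") -> int:
    """PUT the settings to one node; base_url is an assumption, e.g. http://localhost:9200."""
    request = urllib.request.Request(
        f"{base_url}/_cluster/settings",
        data=json.dumps(logger_settings(level)).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

If the nodes have each formed their own cluster, the request would need to be sent to every node separately, since each one keeps its own cluster settings.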
Instead, I think you're having a cluster bootstrapping problem, and the docs I linked above tell you how to both diagnose and fix it. These docs apply to all environments; there's nothing about them which aims at any particular setup.
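One quick check along these lines is to compare the cluster_uuid that each node reports from GET /. Nodes that bootstrapped separately report different UUIDs. The Python sketch below is a rough illustration: the helper names and the node names in the usage note are hypothetical, and it assumes plain HTTP without authentication.

```python
from __future__ import annotations

import json
import urllib.request
from collections import defaultdict


def fetch_root(base_url: str) -> dict:
    """Fetch the GET / response of one node (base_url is an assumption)."""
    with urllib.request.urlopen(f"{base_url}/") as response:
        return json.load(response)


def group_by_cluster_uuid(root_responses: list[dict]) -> dict[str, list[str]]:
    """Group node names by the cluster_uuid each node reports."""
    groups: dict[str, list[str]] = defaultdict(list)
    for info in root_responses:
        groups[info["cluster_uuid"]].append(info["name"])
    return dict(groups)
```

Feeding it the GET / responses of, say, elasticsearch-master-0, elasticsearch-master-1 and elasticsearch-master-2 should yield a single group for a healthy cluster. Three groups of one node each would be consistent with three separately bootstrapped clusters.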
ECK is something you can run on your own local K8s environment, effectively a private cloud.