Hi, I'm using Elastic Cloud on Kubernetes (version 1.1.2 according to the operator). I want to have a coordinating-only node in my cluster. For the config, I have set node.master, node.data, node.ingest, node.ml, node.transform and node.remote_cluster_client to false.
The node in my cluster is now Online and doesn't have any shard which makes sense. I'm opening some Dashboards in Kibana and there's nothing appearing in Logs for this node. I'm wondering if it's doing its job. Maybe I need to connect Kibana directly to this node in some way or something else is missing?
I want to be sure it's doing its job, that is, handling all the coordinating work for the cluster. How can I verify that?
Same problem for my clients: we would like our APM Server to point to the ingest nodes in ECK, but "elasticsearchRef" only points to the Elasticsearch cluster service. I hope Elastic can provide a sample configuration.
When you use elasticsearchRef to connect Stack applications like Kibana or APM Server to an Elasticsearch cluster, the requests go through the HTTP service created by ECK (<cluster_name>-es-http). If you want the coordinating nodes to handle all requests, you should update the HTTP service selector to include only the coordinating nodes. For example:
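A minimal sketch of such a selector override, assuming a cluster named "quickstart" with a nodeSet named "coordinating" (both names are hypothetical, as is the version number). ECK names each StatefulSet <cluster_name>-es-<nodeset_name> and labels the Pods accordingly, so the selector below should match only the coordinating Pods. Whether the operator honours a custom selector on the HTTP service may depend on your ECK version, so treat this as a starting point rather than a verified configuration:

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart            # hypothetical cluster name
spec:
  version: 7.9.2              # assumed version, use your own
  http:
    service:
      spec:
        # Restrict the <cluster_name>-es-http service to the coordinating nodeSet only
        selector:
          elasticsearch.k8s.elastic.co/cluster-name: "quickstart"
          elasticsearch.k8s.elastic.co/statefulset-name: "quickstart-es-coordinating"
  nodeSets:
    - name: coordinating      # hypothetical nodeSet name
      count: 2                # more than one, so the service keeps an endpoint if a Pod goes down
      config:
        node.master: false
        node.data: false
        node.ingest: false
        node.ml: false
    # ... your existing master/data nodeSets go here, unchanged
```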
I am not aware of a simple way to verify that the coordinating node is actually coordinating things. You can run kubectl describe svc <cluster_name>-es-http and confirm that the endpoints list only contains your coordinating nodes. That should give you confidence that any requests you send to the cluster via that service are coordinated by those nodes.
If you have a service mesh installed, you can probably look at the proxy logs to see the requests coming through to the coordinating nodes as well.
We also don't want to lose High Availability... I was planning to create a single coordinating-only node, but I should have at least a second one if I put in place what you're describing.
By the way, I'm wondering if connecting Kibana directly to coordinating-only nodes is really the best practice?
How does Elasticsearch handle requests coming in through a service that contains several nodes? Is a random node chosen to do the coordinating work? Is it possible to tell Elasticsearch to prefer a specific node for coordinating, so that the other nodes (with master/data/etc. roles) take over only when that node is offline?
As you said, having a single coordinating node and letting it handle all traffic is probably not good for availability and performance. There's no particular advantage in letting Kibana communicate through the coordinating nodes either. I was just illustrating how it is possible to use a set of dedicated request handling nodes -- if that's what you desire.
How you should configure traffic handling in your cluster very much depends on your particular use case and how busy your cluster is. The default behaviour of ECK is to expose all nodes in the cluster through the service and that works well for most use cases. From that starting point, you can explore other things like excluding master nodes from handling traffic, directing ingest traffic to ingest nodes etc. That would require creating new services as described in the "Traffic Splitting" page of the Elastic Cloud on Kubernetes [1.3] documentation.
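As a rough illustration of that kind of additional service, here is a sketch of a Kubernetes Service that selects only ingest-capable Pods, assuming a cluster named "quickstart" in the same namespace (the cluster and service names are hypothetical). It relies on the node role labels that ECK puts on Elasticsearch Pods. Note that a plain Service like this is not wired into elasticsearchRef, so a client such as APM Server would need to be pointed at it explicitly:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: quickstart-es-ingest-nodes   # hypothetical service name
spec:
  ports:
    - name: https
      port: 9200
      targetPort: 9200
  # Match only Pods of the "quickstart" cluster that carry the ingest role
  selector:
    elasticsearch.k8s.elastic.co/cluster-name: "quickstart"
    elasticsearch.k8s.elastic.co/node-ingest: "true"
```

Running kubectl describe svc against this service, as suggested earlier for <cluster_name>-es-http, should show only the ingest Pods in the endpoints list.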