Elasticsearch on YARN: Node affinity

Saibal_Patra · June 3, 2015, 6:57pm

Current documentation states:

Currently, Elasticsearch on YARN does not provide any option for tying Elasticsearch nodes to specific YARN nodes however this will be addressed in the future. In practice this means that unless YARN is specifically configured, there are no guarantees on its topology between restarts, that is on what machines Elasticsearch nodes will run each time.

I have a Hadoop cluster with 30 data nodes. If I start Elasticsearch on all 30 nodes, then after a restart I get the data back, so I can point Kibana at any one of the nodes and it works seamlessly.
But if I choose to start it on only 5 nodes, there is no guarantee which 5 nodes it will restart on, so Kibana has to be reconfigured every time.

How do I configure YARN to solve this problem? Thanks!

costin · June 4, 2015, 5:26pm

You can't, at least not with YARN in its current form. This is one of the reasons why the YARN support is still in Beta; however, YARN is being updated to support long-running services / servers, and this feature is on the roadmap.
I have no idea, though, when or if it will happen.
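
Until then, one possible workaround for the Kibana side is to probe the cluster after each restart and see which machines currently answer on the Elasticsearch HTTP port. The snippet below is only a minimal sketch of that idea, not part of Elasticsearch on YARN: the function names and the host names are hypothetical, and it assumes Elasticsearch listens on its default HTTP port 9200 on whichever nodes YARN picked.

```python
import socket
from typing import Callable, Iterable, List


def is_port_open(host: str, port: int = 9200, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolved host names.
        return False


def find_es_hosts(
    candidates: Iterable[str],
    port: int = 9200,
    probe: Callable[[str, int], bool] = is_port_open,
) -> List[str]:
    """Return the candidate hosts that currently accept connections on port.

    `probe` is injectable so the logic can be checked without a real cluster.
    """
    return [host for host in candidates if probe(host, port)]


if __name__ == "__main__":
    # Hypothetical host names; replace with the cluster's data nodes.
    hadoop_nodes = [f"datanode{i:02d}.example.com" for i in range(1, 31)]
    print(find_es_hosts(hadoop_nodes))
```

An open port only shows that something is listening there, so the result still needs to be checked before pointing Kibana at one of the returned hosts.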

Nww_Pot_Fung_Nng · June 27, 2016, 1:54am

If one of the 30 nodes is restarted or a new node is added, do we need to do anything to start Elasticsearch on that particular node?

Could you please help? Thanks.
