Elasticsearch-Output to Elasticsearch-Cluster

Hello there,

I have an Elasticsearch cluster with three nodes. When I import data into the cluster with the Elasticsearch output plugin, which node should I choose as the destination?

Is there a common expression for the host (maybe the cluster name)? Or should I list all three nodes in the hosts field? Or only one?

We have always just listed all of our data nodes in the hosts field.
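
As a minimal sketch of what that might look like in the Logstash pipeline config: the hostnames es-node1, es-node2 and es-node3, the port 9200 and the index name are placeholders I made up for illustration, so adjust them to your setup. The hosts option of the elasticsearch output accepts an array, and the plugin spreads requests across the listed hosts.

```
output {
  elasticsearch {
    # Hypothetical hostnames; list every node that should receive requests.
    hosts => ["http://es-node1:9200", "http://es-node2:9200", "http://es-node3:9200"]
    # Hypothetical index name, one index per day.
    index => "logs-%{+YYYY.MM.dd}"
  }
}
```

With all three nodes listed, Logstash is not tied to a single node, so it should be able to keep sending to the remaining nodes if one of them becomes unavailable.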

My cluster is not very big. I have only three nodes and no dedicated data or master nodes. Suppose node 1 is acting as a data node and I send data to it with Logstash, and then the master goes down so that node 1 becomes the master. Will there be an error because I am now sending data to the master node, or is this handled automatically?

There should be automatic handling. For a smaller cluster you should be able to send data to whatever node is available. But as the cluster gets larger, it is recommended that you separate the node roles more.

Take a look at this webpage, in particular the section about coordinating nodes.
https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html#coordinating-node

Requests like search requests or bulk-indexing requests may involve data held
on different data nodes. A search request, for example, is executed in two
phases which are coordinated by the node which receives the client request — the coordinating node.
In the scatter phase, the coordinating node forwards the request to the data
nodes which hold the data. Each data node executes the request locally and
returns its results to the coordinating node. In the gather phase, the
coordinating node reduces each data node’s results into a single global
resultset.
Every node is implicitly a coordinating node. This means that a node that has
all three node.master, node.data and node.ingest set to false will
only act as a coordinating node, which cannot be disabled. As a result, such
a node needs to have enough memory and CPU in order to deal with the gather
phase.

There is a warning about master nodes though. While it should work, it isn't a good idea to have the master node spend resources helping out with searching and indexing data. This applies not only to coordinating requests but also to storing data itself. For larger clusters it is recommended to keep master and data nodes completely separate.

While master nodes can also behave as coordinating nodes
and route search and indexing requests from clients to data nodes, it is
better not to use dedicated master nodes for this purpose. It is important
for the stability of the cluster that master-eligible nodes do as little work
as possible.

Thanks for your advice =)
