Target node question


Just a question. The amount of data my old (6.7 / 6.8) platform needs to process is exploding.
So I'm busy tuning that until my new (7.x) platform is ready.

I have a bunch of indexes, of which 2 or 3 are large and a number are smaller, although together the smaller ones are larger than any single one of the large indexes.

I have 3 Logstash servers. Each of those 3 has a bunch of index pipelines running towards an Elasticsearch node.

Previously everything went to 1 dedicated coordinating node, but I have now stepped away from that. I have 6 data nodes. I have now pointed the pipelines that send the most data (largest indexes) to a dedicated node. The rest still goes to the remaining node.

We are now discussing internally whether this is the most convenient solution, or whether we should simply keep 1 data node per Logstash server instead of pointing the pipelines to different nodes in the Logstash conf.

Let's say we have logstash1, logstash2 and logstash3.
Each Logstash server runs pipeline1, pipeline2, pipeline3, pipeline4 and pipeline5.

The indexes of pipeline2 and pipeline4 are large; the others are small to medium.
We have the data nodes node1, node2 and node3.

What I have now is: pipeline1 goes to node3, pipeline2 (big) goes to node2, pipeline3 goes to node3, pipeline4 (big) goes to node1 and pipeline5 goes to node3.
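
As a rough sketch of that layout (not the real config: the index names, the http scheme and port 9200 are assumed purely for illustration), the output sections of two of the pipelines would look something like this:

```conf
# pipeline2.conf (big index): gets a node to itself
output {
  elasticsearch {
    hosts => ["http://node2:9200"]        # scheme and port are assumptions
    index => "pipeline2-%{+YYYY.MM.dd}"   # hypothetical index name
  }
}

# pipeline1.conf (small index): shares node3 with pipeline3 and pipeline5
output {
  elasticsearch {
    hosts => ["http://node3:9200"]
    index => "pipeline1-%{+YYYY.MM.dd}"   # hypothetical index name
  }
}
```

The two blocks live in separate pipeline config files; they are shown together here only to make the difference in `hosts` visible.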

This is a bit simplified. In my real setup I have 6 data nodes, so I gave the big pipelines on the different Logstash servers a dedicated node as much as I could.

Is there any benefit in doing it the way we did now?

Or should I just point all pipelines on logstash1 to node1, all pipelines on logstash2 to node2, and the same for logstash3?
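
For comparison, a sketch of that alternative under the same assumptions (hypothetical index name, assumed scheme and port): every pipeline on a given Logstash server would carry the same `hosts` value.

```conf
# On logstash1: pipeline1 through pipeline5 all use node1
output {
  elasticsearch {
    hosts => ["http://node1:9200"]        # scheme and port are assumptions
    index => "pipeline1-%{+YYYY.MM.dd}"   # hypothetical index name, differs per pipeline
  }
}
# On logstash2 the same block would point at node2, and on logstash3 at node3.
```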

Per the docs:

Each Elasticsearch output is a new client connected to the cluster

I think your splitting makes sense for what you have.

