Just a question. The amount of data my old (6.7 / 6.8) platform needs to process is exploding.
So I'm busy tuning that until my new (7.x) platform is ready.
I have a bunch of indexes, of which 2 or 3 are large and a number are smaller, although the smaller ones together are larger than any single one of the large ones.
I have 3 Logstash servers. Each of those 3 runs a bunch of indexing pipelines that send to an Elasticsearch node.
Previously everything went to 1 dedicated coordinating node, but I have now stepped away from that. I have 6 data nodes. I have now pointed the pipelines that send the most data (the largest indexes) each to a dedicated node. The rest still goes to the remaining node.
Now we are discussing internally whether this is the most convenient solution, or whether we should simply use 1 data node per Logstash server instead of pointing the pipelines to different nodes in the Logstash config.
Let's say we have logstash1, logstash2 and logstash3.
Each Logstash server has pipeline1, pipeline2, pipeline3, pipeline4 and pipeline5.
The indexes of pipeline2 and pipeline4 are large; the others are small to medium.
We have the data nodes node1, node2 and node3.
What I have now is: pipeline1 goes to node3, pipeline2 (big) goes to node2, pipeline3 goes to node3, pipeline4 (big) goes to node1 and pipeline5 goes to node3.
This is a bit simplified; in my real setup I have 6 data nodes, so I gave the big pipelines on the different Logstash servers a dedicated node as much as I could.
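To illustrate, the output section of pipeline2 in the simplified example would look roughly like this. This is only a sketch: the hostname, port and index name are placeholders, not my real values.

```
output {
  elasticsearch {
    # pipeline2 (big) sends only to node2; host and index name are placeholders
    hosts => ["http://node2:9200"]
    index => "pipeline2-%{+YYYY.MM.dd}"
  }
}
```

The small pipelines (1, 3 and 5) have the same kind of block, but with node3 as the host.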
Is there any benefit in doing it the way we do it now?
Or should I just point all pipelines on logstash1 to node1, all pipelines on logstash2 to node2, and all pipelines on logstash3 to node3?
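In that alternative, every pipeline on a given Logstash server would get the same output host. A rough sketch for a pipeline on logstash1, again with a placeholder hostname and a hypothetical index name:

```
output {
  elasticsearch {
    # on logstash1 every pipeline, big or small, sends to node1 (placeholder host)
    hosts => ["http://node1:9200"]
    index => "pipeline4-%{+YYYY.MM.dd}"
  }
}
```

The pipelines on logstash2 would then use node2 and those on logstash3 would use node3.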