New installation questions


#1

Hello, all.

Let me apologize first, as this is a cross-post. I haven't gotten complete answers to my questions in the previous post, and this is extremely important to me. Hope you don't mind too much.

Up to now, I've been running a single-node, version 2.x, ELK stack. While we have only about 100 servers shipping logs to it, it is under a bit of strain. So, now I figure it's time to set up a multi-node, version 5.x, ELK stack. My intention is to do that from scratch, rather than upgrade my current instance. The following questions (and there will probably be more) are important to my effort, as I can't seem to find the answers anywhere else (though they may be out there). Here goes:

What is the optimal, or perhaps minimum, number of nodes to include in my cluster (from what I've read, it seems to be three)?
What roles should the new nodes be set up for (Master, Data, etc.)?
Which ELK applications live on which nodes?
How would config files for Logstash, Elasticsearch, etc. differ between nodes?
To which node(s) do the various servers ship their logs?

Hope I'm not being a pain but, again, I really need the help to move forward.

Diggy


(Aaron Mildenstein) #2

I'm unsure of whether you got a response on your cross-post, but please don't do that. I will answer briefly here.

For a production cluster, we recommend 3 dedicated master nodes. These can be VMs, but they should not be used as anything but master nodes (which means, do not send index or search requests to these boxes).
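With three master-eligible nodes, you'll also want to set the quorum so a network split can't elect two masters. A sketch of what the dedicated masters' elasticsearch.yml might look like in 5.x (the cluster name and hostnames are placeholders):

```yaml
# elasticsearch.yml on each of the three dedicated master nodes
cluster.name: my-cluster          # placeholder
node.master: true                 # eligible to be elected master
node.data: false                  # holds no index data
node.ingest: false                # runs no ingest pipelines
discovery.zen.ping.unicast.hosts: ["master-1", "master-2", "master-3"]
# quorum of master-eligible nodes: (3 / 2) + 1 = 2
discovery.zen.minimum_master_nodes: 2
```

Forgetting `minimum_master_nodes` is a common cause of split-brain in pre-7.x clusters.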

After that, I suggest 2 or 3 data nodes to start. You can just keep adding nodes as needed from there.

Best-case scenario you wouldn't have any other applications. Elasticsearch nodes only run Elasticsearch, Logstash nodes only run Logstash, Kibana nodes only run Kibana. Next best-case: only adding "metricbeat" running on these otherwise single-purpose machines to collect performance metrics.

For Elasticsearch, the data nodes and master nodes differ mainly in their node-role settings in elasticsearch.yml; everything else (cluster name, discovery hosts) should match across the cluster.
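For example, in 5.x the role is controlled by a pair of boolean settings. A sketch of how a data node's config might differ from a master's (names are placeholders):

```yaml
# elasticsearch.yml on a data node
cluster.name: my-cluster          # same on every node
node.master: false                # never elected master
node.data: true                   # stores shards, serves index/search
discovery.zen.ping.unicast.hosts: ["master-1", "master-2", "master-3"]
```

A dedicated master would invert the two `node.*` booleans.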
For Logstash, that depends on what you're trying to accomplish, but a single Logstash box can do a lot. Just spawn more Logstash processes to fill up the CPU.
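Before spawning extra processes, it's usually simpler to add pipeline workers within one Logstash instance. A hypothetical 5.x invocation (the config path and worker count are placeholders to tune for your hardware):

```shell
# Run Logstash with more pipeline workers to use additional CPU cores
bin/logstash -f /etc/logstash/conf.d/pipeline.conf --pipeline.workers 8
```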
For Kibana, a single machine is also probably sufficient, or even a VM.

I'm not sure what you mean by the last question. If you're referring to remote systems forwarding their logs to Logstash, which then parses them in preparation for ingestion by Elasticsearch, then I would suggest the Logstash box is the target.
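In that setup, each of the ~100 servers would typically run a shipper such as Filebeat pointed at the Logstash host. A sketch of the relevant filebeat.yml for the 5.x series (hostname, port, and paths are placeholders):

```yaml
# filebeat.yml on each log-shipping server
filebeat.prospectors:
  - input_type: log
    paths:
      - /var/log/*.log            # placeholder: whatever you ship today

output.logstash:
  hosts: ["logstash-host:5044"]   # placeholder: your Logstash box
```

Logstash would then listen with its `beats` input on port 5044 and forward the parsed events to the Elasticsearch data nodes.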


(system) #3

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.