Hello,
This is broader than Elasticsearch, but I don't see a category for questions about the whole stack.
I need to set up a central logging cluster in a relatively small environment. We have about 500 servers that primarily serve 3-tier web applications and file shares. Our servers are split fairly evenly between Linux (mostly RHEL) and Windows (mostly Server 2016).
My requirements are fairly simple:
- Ship security logs to QRadar, controlled by InfoSec, in syslog format. (I have no influence over QRadar, so even if it can accept other message types, it's not going to happen.)
- Ship security and system logs to our department's logging cluster, and store them for a minimum of 90 days (180 or more would be preferred, though). Including application logs is an option, but not a requirement.
I have a general plan in mind, but I'm not sure it's the most efficient way of going about this, and I'd like to get some advice.
Windows servers will forward events to a central Windows Event Collector, which will then send them to Logstash to convert the messages to syslog for QRadar and also output to Elasticsearch. I assume Winlogbeat is the best option for getting events from the collector into Logstash.
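Something like this is what I have in mind for the Logstash side (a minimal, untested sketch; the hostnames and ports are placeholders, and the syslog output comes from the logstash-output-syslog plugin, which I believe has to be installed separately):

```
# Sketch only -- placeholder hosts/ports, not a tested config.
input {
  beats {
    port => 5044                        # Winlogbeat ships to this port
  }
}

output {
  # Syslog-formatted copy for QRadar (logstash-output-syslog plugin)
  syslog {
    host     => "qradar.example.com"    # placeholder
    port     => 514
    protocol => "udp"
    facility => "security/authorization"
    severity => "informational"
  }

  # Same events indexed into our own cluster
  elasticsearch {
    hosts => ["http://es01.example.com:9200"]   # placeholder
    index => "winlogbeat-%{+YYYY.MM.dd}"
  }
}
```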
For our Linux environment, I wanted to use Auditbeat to ship security information to Logstash, so I could then forward to both QRadar and Elasticsearch. I couldn't get Auditbeat to log locally and send to Logstash, though, and I couldn't get auditd and Auditbeat to run together. I read a comment from one of the devs on GitHub saying that running them together isn't the intended use anyway, so that's fine; and since RHEL 7 uses kernel 3.10.* (the audit multicast support that would let multiple consumers subscribe wasn't added until 3.16, if I understand correctly), I guess that's expected.
My current plan is to send auth, authpriv, and whatever else I need to QRadar through rsyslog, and maybe use Filebeat for sending to Elasticsearch. The rsyslog side would look something like the snippet below.
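A minimal sketch of the forwarding rule I have in mind (placeholder QRadar address; the selector list would be whatever InfoSec actually wants):

```
# /etc/rsyslog.d/qradar.conf -- sketch only, placeholder destination.
# @@ forwards over TCP; a single @ would use UDP instead.
auth,authpriv.*    @@qradar.example.com:514
```

That brings me to my questions: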
- What are the benefits of using Filebeat instead of sending rsyslog to ingest nodes to do the parsing (besides reduced computational overhead on the logging cluster)? See the ingest-pipeline sketch after this list for what I mean by the second option.
- I'm extremely confused about index and shard allocation. If I use Filebeat to send to Elasticsearch, do I need to configure the index templates in both Filebeat and Elasticsearch, or can I manage that entirely from Elasticsearch? (Sorry if this is answered in the docs; I've read through a sizeable chunk of the documentation for several components, but I've found information on how they work together to be a bit sparse.) The filebeat.yml sketch after this list shows what I'm considering.
- Given the described environment (primarily 3-tier web services and file shares), what kind of sizing requirements might I be looking at? I know that's a really difficult question to answer, but from what I've seen, the requirements might be a lot higher than we anticipated.
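To make the first question concrete, this is roughly what I picture the ingest-node option looking like: rsyslog forwards straight to Elasticsearch, and an ingest pipeline with a grok processor does the parsing. (A hypothetical sketch; the pipeline name and pattern are my own placeholders:)

```
PUT _ingest/pipeline/rsyslog-auth
{
  "description": "Sketch: parse RFC3164-style lines forwarded by rsyslog",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{SYSLOGTIMESTAMP:timestamp} %{SYSLOGHOST:hostname} %{DATA:program}(?:\\[%{POSINT:pid}\\])?: %{GREEDYDATA:msg}"
        ]
      }
    }
  ]
}
```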
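And for the second question, this is the filebeat.yml arrangement I'm considering, where Filebeat doesn't load a template at all and templates (and retention) are managed entirely from the Elasticsearch side. The hostname is a placeholder, and I'm assuming I've understood these setup options correctly:

```
# Sketch of the relevant filebeat.yml settings -- untested.
setup.template.enabled: false   # don't let Filebeat install its own template
setup.ilm.enabled: false        # manage index lifecycle/retention from Elasticsearch

output.elasticsearch:
  hosts: ["http://es01.example.com:9200"]   # placeholder
```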
I'm actually rebuilding the cluster because it wasn't functioning correctly after the guy I replaced built it. Since I've never worked with this before, and we didn't need the data, I figured it'd be better to start with a fresh build. The point is, we had a bunch of servers pointed at the three servers I'm using, and when I turned Logstash on and had it write to a file, the file grew to around 40 GB in about an hour. If that rate holds, that's roughly 1 TB a day, which would put 90 days of retention somewhere around 85-90 TB before replication.
Does Elasticsearch somehow utilize space more efficiently than Logstash's flat-file output, or is it reasonable to expect a similar data volume in both?