I am attempting to send documents to specific indices so I can better handle the lifecycle of the various types of documents. I just cannot get it to work when the condition involves multiple fields.
According to the processor conditions documentation this appears to be correct, but I'm just not having much luck. Filebeat passes the config checks and starts, but the documents do not go to the index. If I adjust the condition to be like the Fortinet Firewall one and only use something like event.module: "iis" then it works, but I also need to separate the failures from the successes.
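Roughly the shape of what I'm trying in filebeat.yml (the index names and hosts here are placeholders, not my exact config):

```yaml
output.elasticsearch:
  hosts: ["https://localhost:9200"]
  indices:
    # Route IIS failures and successes to separate indices.
    # These conditions can only match if the fields actually exist
    # in the event at output time on the Filebeat side.
    - index: "iis-failure"
      when:
        and:
          - equals:
              event.module: "iis"
          - equals:
              event.outcome: "failure"
    - index: "iis-success"
      when:
        and:
          - equals:
              event.module: "iis"
          - equals:
              event.outcome: "success"
```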
Just thinking.... perhaps the event.outcome field is added by the ingest pipeline on the Elasticsearch side, so it may not be available on the Filebeat side for the conditional.
I sometimes use the Filebeat file output to dump a few lines and confirm what is actually available for conditionals on the Filebeat side.
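Something like this in filebeat.yml (the path and filename are just examples, and Filebeat only supports one output at a time, so comment out output.elasticsearch while testing):

```yaml
# Temporary debug output: writes the raw events Filebeat would ship,
# so you can see which fields exist before any ingest pipeline runs.
output.file:
  path: "/tmp/filebeat"
  filename: "filebeat-debug"
```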
Wow! Looks like you are correct. I haven't gotten to the point of learning ingest pipelines (we just purchased Elastic and are going through training with them), but that appears to be the case. The ingest pipeline is creating that field based on the HTTP status code:
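Paraphrasing what I see there (this is not the exact processor definition from the module pipeline), it's a set processor along these lines:

```
// Illustrative paraphrase only, not the exact processor from the IIS module pipeline
{
  "set": {
    "if": "ctx?.http?.response?.status_code != null && ctx.http.response.status_code < 400",
    "field": "event.outcome",
    "value": "success"
  }
}
```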
We often see people trying to segment their data into lots of different indices, which is fine if that's what you plan to do.
You can do that by doing the parsing in Logstash and then routing the events to different indices.
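As a rough illustration (field names, hosts, and index names are placeholders), a conditional Logstash output looks something like this:

```
output {
  # Send IIS failures to their own index; everything else goes to a default index.
  if [event][module] == "iis" and [event][outcome] == "failure" {
    elasticsearch {
      hosts => ["https://localhost:9200"]
      index => "iis-failure-%{+yyyy.MM.dd}"
    }
  } else {
    elasticsearch {
      hosts => ["https://localhost:9200"]
      index => "filebeat-default-%{+yyyy.MM.dd}"
    }
  }
}
```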
The ramifications and side effects of that are things like: you have to create and maintain all the routing logic, you will create more indices and therefore more shards, and you will have more to manage, etc.
Often we see people come full circle and realize it may be a better solution to do that filtering on the search / client side. After all, term queries and filters are very fast, and segmentation can even be done at the role level and with spaces and dashboards.
I often advise folks early on to just use the Filebeat default indices until you have used the system for a while and understand where you really need to segment and where you don't.
That makes sense. The issue is we are using it more as a SIEM solution than as a metrics/debugging tool, and we have data we have to gather and keep for a long time vs. data we don't have to keep as long. For example, we want to keep logon events for 365 days but IIS access/error logs for only 30 days. We need the access/error logs for tracking down events but don't necessarily need to keep them long term.
I have been told by Elastic that we may want to implement some Logstash servers on our end and I haven't even messed with Logstash yet. I'm probably going to go that route.
Ahhh Yes ... ILM (Index Lifecycle Management) is a good reason to segregate indices / data.
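For example, a minimal policy for the 30-day data could look something like this (the policy name and rollover thresholds are just placeholders to illustrate the shape):

```
# Illustrative only: rollover in the hot phase, delete 30 days after rollover
PUT _ilm/policy/iis-logs-30d
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "7d",
            "max_primary_shard_size": "50gb"
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
```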
BTW @ARDiver86 if you are looking at long-term retention you should perhaps look at cold / frozen tiers backed by searchable snapshots. That is an enterprise-licensed feature, but depending on the deployment model you may end up with a lower TCO (HW + SW), as the frozen tier is so cost-efficient for long-term storage. And that storage is still searchable without having to re-hydrate the data.
That was my plan, but we started out with only two nodes for now, running both hot data and ML, while we test the waters and see if this is a viable product we can market to customers. I wanted cold nodes offsite, but since you have to pay per node we decided to start with only two.
Just to set expectations: you should never run a cluster with some nodes in one data center and other nodes in another data center; that will not work / will be unstable. There are other architectural patterns I would consider. Certainly you can ship snapshots offsite as a backup plan.