Thanks to all for creating the Elastic Stack (ELK) - it gives small businesses like mine visibility into security events and performance data that would be too costly or inflexible to achieve with proprietary tools.
I have been using the ELK stack since 2013 and am now changing the way we consume and process events. So that our events better survive changes to index templates, ES version upgrades, and grok pattern problems, I want to send all events to Amazon S3 as an intermediate stage before they end up in Elasticsearch.
For example: Filebeat => Logstash => S3 => Logstash (filters, etc.) => Elasticsearch
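To make that concrete, here is a rough sketch of the first stage I have in mind, which just ships raw events to S3 unchanged. It is not my actual config (that is further down) - the bucket name, region, and prefix are placeholders, and it assumes the standard logstash-output-s3 plugin with a line-oriented JSON codec:

```
# First Logstash instance: receive from Filebeat and archive raw events to S3
input {
  beats {
    port => 5044                   # standard Beats port
  }
}
output {
  s3 {
    bucket    => "my-log-archive"  # placeholder bucket name
    region    => "us-east-1"       # placeholder region
    prefix    => "raw/filebeat/"   # placeholder key prefix
    codec     => json_lines        # one JSON document per line, preserving all fields
    size_file => 10485760          # rotate the local temp file at ~10 MB...
    time_file => 5                 # ...or every 5 minutes, whichever comes first
  }
}
```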
Some thoughts / questions:
- Is this a valid architecture? My aim is to work with my events more flexibly (i.e. change mappings more easily, work with smaller or larger data sets) by importing only portions of the data from S3 and indexing them, roughly as in the sketch after this list.
- Where should filtering happen? On the first Logstash instance or the second? Is the intermediate S3 stage going to have negative effects on my grok patterns or mappings?
- Some events currently make it through to Elasticsearch when I send them directly, but are silently dropped when I send them via S3 first. These events pass through a grok filter.
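For reference, this is roughly the second stage I picture: read a slice of the bucket back, do all the filtering there, and index into ES. Again this is only a sketch, not my real config - the prefix layout, sincedb path, and ES endpoint are placeholders, and it assumes the standard logstash-input-s3 plugin:

```
# Second Logstash instance: read a portion of the archive back, filter, and index
input {
  s3 {
    bucket       => "my-log-archive"            # same placeholder bucket
    region       => "us-east-1"
    prefix       => "raw/filebeat/"             # narrow this to re-index only a slice
    codec        => "json"                      # each line is parsed back into an event
    sincedb_path => "/var/lib/logstash/sincedb-s3"
  }
}
filter {
  # grok / date / mutate filters would live here, i.e. only on this second stage
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]                 # placeholder ES endpoint
    index => "logs-%{+YYYY.MM.dd}"
  }
}
```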
Any help or advice is much appreciated. My actual configs are below: