I'm fairly new to the Elastic Stack and am trying to set up a lab environment as a proof of concept. I've watched almost every Getting Started video on the Elastic website and read the documentation thoroughly before installing my setup, so I know the components and config used.
I've managed to get a working setup with Filebeat (with the system and apache2 modules enabled), Logstash, Elasticsearch, and Kibana. Each of the components is working, and I can see the harvested logs in Kibana's Discover view.
But there is one thing I still can't understand, even after testing and reading documentation for hours: how Logstash pipelines and Elasticsearch indices relate to each other. I have a working log flow, but I don't understand along which dimension to scale when the complexity or scope of my logging grows.
Situation: I've enabled the Filebeat modules system and apache2. I've got one beats.conf with an input of type beats and an output to Elasticsearch. When I do no grok filtering, my Apache log messages are not split into fields. But when I configure grok filtering, the system logs get tagged with _grokparsefailure. I understand the all-or-nothing situation here, but I can't get my head around choosing the right scaling direction from this point on, and I can't find documentation on this matter.
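To make the situation concrete, my beats.conf currently looks roughly like this (the port, hosts, and index name are just the common defaults, and the grok pattern is the stock COMBINEDAPACHELOG one, so treat the specifics as placeholders):

```
input {
  beats {
    port => 5044
  }
}

filter {
  # This grok runs on EVERY event, so the system-module logs,
  # which don't match the Apache format, get tagged with
  # _grokparsefailure.
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "filebeat-%{+YYYY.MM.dd}"
  }
}
```

So the Apache logs parse fine, but everything else coming in over the same beats input hits the same filter block.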
Questions I have:
- With multiple enabled filebeat modules, do I need multiple Logstash pipelines (on different ports)?
- With multiple enabled Filebeat modules, do I need multiple Elasticsearch indices to store the different modules' logs separately?
- Can anybody explain the logic here? For example: does each Filebeat module require its own Logstash pipeline? Does each pipeline require its own Elasticsearch index?
Or can anybody point me to documentation on these dimensions of designing pipelines and indices?
Thanks in advance!