I have two applications running on two different servers, each shipping logs and exposing server API information to Logstash.
The output from each application has a different data rate. Let's say:
- App1 sends data @ 1x rate
- App2 sends data @ 15x rate
Each app sends 4 different types of logs (some JSON, some log4j, with different fields, etc.).
The shipping method also differs per dataset: 2 arrive via Filebeat, 1 via the jdbc input plugin, and 1 via an HTTP input plugin.
Mixing the data from the two apps is not a current priority.
Currently, for App1, I have 9 Logstash configuration files:
4 datasets x 2 files (input, filter) + 1 common output file = 9 config files
Namely:
- Filebeat --> Grok --> Elasticsearch
- Filebeat --> Grok --> Elasticsearch
- JDBC Input (Postgres) --> Mutate/Replace --> Elasticsearch
- HTTP Plugin (public API) --> N/A --> Elasticsearch
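To make the current layout concrete, here's a heavily trimmed sketch of the App1 files (the ports, paths, field names, and grok pattern are placeholders, not my real config):

```
# conf.d/10-app1-filebeat-input.conf
input {
  beats {
    port => 5044
  }
}

# conf.d/20-app1-filebeat-filter.conf
filter {
  # [log_type] is a placeholder field set in filebeat.yml
  if [log_type] == "app1-access" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  }
}

# conf.d/30-app1-jdbc-input.conf
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://localhost:5432/app1"
    jdbc_user => "logstash"
    jdbc_driver_library => "/opt/jdbc/postgresql.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    statement => "SELECT * FROM api_stats"
    schedule => "* * * * *"
  }
}

# conf.d/90-common-output.conf (shared by all 4 datasets)
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "app1-%{+YYYY.MM.dd}"
  }
}
```

Because all of these files are concatenated into the single default pipeline, every filter has to be wrapped in a conditional like the one above so that, for example, the JDBC rows don't pass through the grok filter.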
I'm wondering what the best practice is for onboarding App2.
Some of the questions I have are:
- How can I develop my Logstash configuration for App2 without impacting the operation of App1, i.e., without having to restart Logstash while I develop my grok filters?
- Should I be using multiple pipelines? If so, should I merge each pipeline's config files (input, filter, output) into a single file, or use pipeline-to-pipeline communication (currently in beta)? (See the pipelines.yml sketch after this list.)
- If I'm running a single node (which I am), what's best from a performance perspective? (The introductory note on multiple pipelines seems to argue that they are the way to go in this case.)
- Should I use 1 pipeline per dataset, or 1 pipeline per application/server (data source)?
- How should I think about the tradeoffs between pipeline complexity, performance, and the ability to keep onboarding new datasets from new apps?
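For concreteness, if the answer is "one pipeline per dataset", I imagine pipelines.yml would end up looking something like this (pipeline IDs, paths, and worker counts are made up for illustration):

```yaml
# pipelines.yml
- pipeline.id: app1-filebeat-access
  path.config: "/etc/logstash/pipelines/app1/filebeat-access/*.conf"
  pipeline.workers: 1
- pipeline.id: app1-jdbc
  path.config: "/etc/logstash/pipelines/app1/jdbc/*.conf"
  pipeline.workers: 1
- pipeline.id: app2-filebeat-access
  path.config: "/etc/logstash/pipelines/app2/filebeat-access/*.conf"
  pipeline.workers: 2  # App2 sends ~15x the data
```

My understanding is that automatic config reload (--config.reload.automatic) is applied per pipeline in this setup, which is part of why multiple pipelines look attractive for the first question above, but I'd like confirmation that this is a sensible way to structure it.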