We have multiple instances of the same service. Consider, for example:
sudo service my_service_1 start
sudo service my_service_2 start
sudo service my_service_3 start
with log files respectively at:
/var/log/my_service/instance_1/*.log
/var/log/my_service/instance_2/*.log
/var/log/my_service/instance_3/*.log
(To complicate things, we also have logs at /var/log/my_service/*.log.)
In each location, the structure of the logs is identical. So, at the very least, I'd like to write a single module (ingestion pipeline, fields, etc.) that ingests all 3 instances but makes them separately searchable in Elasticsearch. For example, if my ingestion pipeline parses GC pauses, I'd like to search and analyze GC pauses for a single instance only (e.g., /var/log/my_service/instance_2/*.log, without GC pauses for /var/log/my_service/instance_1/*.log or /var/log/my_service/instance_3/*.log included in the results).
Keep in mind that each instance has multiple types of log files, so there will be multiple filesets for each instance. I can't use filesets as a proxy for instances. The gist of this is that I'd like to have another hierarchical level between module and fileset.
First, if there's a preferred, canonical way to do this, please let me know.
If there isn't, I have a few thoughts about ways to approach this.
- Derive an "instance" variable from the path of the log file being ingested and label parsed values {module}.{instance}.{fileset}.field instead of {module}.{fileset}.field.
- Set an "instance" variable in the configuration and have multiple modules (e.g., my_service_instance1, my_service_instance2, etc.) use a single set of ingestion pipelines. To do this, I would configure the pipeline conventionally in my_service_instance1 and refer to it in the manifest for my_service_instance2, e.g., ingest_pipeline: ../../my_service_instance1/{{fileset}}/ingest/pipeline.json. Something like that.
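To make the first idea concrete, here is a minimal Python sketch of the path-to-instance mapping I have in mind. It is not Filebeat code, only an illustration of the logic. The function names, the "default" label for logs sitting directly under /var/log/my_service/, and the "gc" fileset and "pause_ms" field in the usage lines are all hypothetical.

```python
import re
from typing import Optional

# Assumed layout, taken from the paths in this post:
#   /var/log/my_service/instance_<N>/<file>.log   (per-instance logs)
#   /var/log/my_service/<file>.log                (top-level logs)
_INSTANCE_RE = re.compile(
    r"^/var/log/my_service/(?:instance_(?P<num>\d+)/)?[^/]+\.log$"
)


def instance_from_path(path: str) -> Optional[str]:
    """Return the instance label for a log path, or None if the path
    does not belong to my_service at all."""
    match = _INSTANCE_RE.match(path)
    if match is None:
        return None
    num = match.group("num")
    # "default" is a hypothetical label for logs outside any instance dir.
    return f"instance_{num}" if num is not None else "default"


def field_name(module: str, instance: str, fileset: str, field: str) -> str:
    """Build the {module}.{instance}.{fileset}.field name proposed above."""
    return ".".join([module, instance, fileset, field])


if __name__ == "__main__":
    path = "/var/log/my_service/instance_2/gc.log"
    instance = instance_from_path(path)
    if instance is not None:
        # "gc" and "pause_ms" are made-up fileset and field names.
        print(field_name("my_service", instance, "gc", "pause_ms"))
```

With something like this, a GC pause parsed from instance 2 would land under its own field prefix, so a search could target one instance without touching the other two. Whether this mapping belongs in the ingest pipeline or in the Filebeat configuration is exactly what I'm unsure about.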
Any thoughts are greatly appreciated.
P.S. I'm currently on a deep dive into Filebeat module development and testing, and would be happy to help with your efforts to document this in any way you might find useful. Here's one thought: it was a huge help to discover that I can ask the tests to generate the expected log files from the test log files.