Hi,
I have created a GitHub issue for this question (Support for Custom [Linux] text file logs · Issue #7186 · elastic/integrations · GitHub), but I am also posting it here for greater visibility.
We have custom applications in our environments that store logs under /data/* directories. We currently use standalone Filebeat to collect and ship those logs to Elasticsearch.
We have been trying to migrate from standalone Filebeat to Elastic Agent and collect those logs through the agent. However, our attempts using the existing integrations have so far been unsuccessful.
Most of those logs are in JSON format, but some of them do not follow any standard format. We have created custom data streams and ingest pipelines for those logs. At the moment, to feed logs into those data streams from Filebeat, we add the data_stream.* fields through processors in the Filebeat config file. Trying to do the same with the existing Elastic Agent integrations is considerably more challenging.
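For reference, this is roughly the kind of thing we do in the standalone Filebeat config today. The paths, dataset name and output index line below are illustrative placeholders, not our exact setup:

```yaml
filebeat.inputs:
  - type: filestream
    id: myapp-logs                     # illustrative input id
    paths:
      - /data/myapp/*.log              # our custom application logs live under /data/*
    parsers:
      - ndjson:
          target: ""                   # most logs are JSON, parsed into the event root
    processors:
      - add_fields:
          target: ""                   # write at the event root, not under "fields"
          fields:
            data_stream.type: logs
            data_stream.dataset: myapp.custom    # matches our custom data stream / pipeline
            data_stream.namespace: default

output.elasticsearch:
  hosts: ["https://elasticsearch:9200"]
  # simplified: index derived from the data_stream.* fields; template/ILM setup omitted
  index: "%{[data_stream.type]}-%{[data_stream.dataset]}-%{[data_stream.namespace]}"
```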
Even though a custom Windows event log integration exists for Windows, where you can specify your own dataset name when setting up the integration (see picture below), a similar integration doesn't seem to exist for Linux or Windows text file logs (at least none that we could find so far).
We have tried using the system/syslog integration: we add the path to the custom logs and use the reroute processor in the logs-system.syslog@custom pipeline to reroute the logs to the appropriate data stream. This doesn't work, however, because the managed logs-system.syslog- pipeline has a grok processor with specific patterns; when those patterns fail against our logs, an error is produced and none of the processors below it run, which means the log never reaches our reroute processor.
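For context, our custom pipeline looks roughly like the sketch below (the dataset, namespace and painless condition are illustrative, and the condition assumes log.file.path arrives as a nested object):

```
PUT _ingest/pipeline/logs-system.syslog@custom
{
  "processors": [
    {
      "reroute": {
        "tag": "reroute-custom-app-logs",
        "if": "ctx?.log?.file?.path != null && ctx.log.file.path.startsWith('/data/')",
        "dataset": "myapp.custom",
        "namespace": "default"
      }
    }
  ]
}
```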
Any suggestions on how to handle this case with Elastic Agent?