I use a custom nginx log format with some additional fields, so the stock pipeline in the Filebeat 7.x nginx module fails with a "grok parse failure".
My platform is CentOS 7.x.
Now I'm trying to figure out the best way to reuse the module while only changing the pipeline's parser line.
Option 1:
Overwrite default.json with my custom one.
Pros: quick and easy.
Cons: has to be redone after each upgrade, so a simple yum update no longer suffices; besides, overwriting files under /usr is bad practice.
Option 2:
Copy the whole nginx module to mycom_nginx, change default.json there, then enable mycom_nginx and disable the nginx module.
Pros: quick and easy.
Cons: have to keep the copy in sync whenever the module changes upstream, and have to understand all the module's details, such as the machine learning part.
I don't like either of these, but I can't figure out how to keep using the nginx module while just pointing it at a different pipeline file from /etc/filebeat/modules.d/nginx.yml.
The problem is that, in the end, you are no longer using the stock nginx module, so any solution will involve maintaining code. Your first option is probably the least error-prone. The key point is that the pipeline is a JSON file, so you can update its array of Grok patterns with a simple script that you run after every upgrade.
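A minimal sketch of that "re-patch after each upgrade" script. The pipeline path and the pattern names below are assumptions, not taken from a real install; substitute the actual path of your Filebeat version and your own Grok pattern.

```python
import json

# Assumed location of the module's ingest pipeline; verify on your system.
PIPELINE = "/usr/share/filebeat/module/nginx/access/ingest/default.json"

def prepend_grok_pattern(pipeline, pattern):
    """Insert `pattern` first in the pipeline's grok processor so it is
    tried before the stock patterns (idempotent across re-runs)."""
    for proc in pipeline["processors"]:
        if "grok" in proc:
            patterns = proc["grok"]["patterns"]
            if pattern not in patterns:
                patterns.insert(0, pattern)
            return pipeline
    raise ValueError("no grok processor found in pipeline")

# Example against a stripped-down pipeline; in practice you would
# json.load(open(PIPELINE)), patch, and json.dump it back.
example = {"processors": [{"grok": {"patterns": ["STOCK_PATTERN"]}}]}
patched = prepend_grok_pattern(example, "MY_CUSTOM_PATTERN")
```

Because the custom pattern is inserted first and only when missing, running the script after every yum update keeps the file patched without duplicating entries.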
You can also try using some processors in the input part (before the module) if you think you can "extract" the non-standard data from the incoming line before it reaches the ingest pipeline: https://www.elastic.co/guide/en/beats/filebeat/current/defining-processors.html Note that these are Filebeat processors; do not confuse them with Ingest Node processors.
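For example, a Filebeat `script` processor can rewrite `message` so the stock grok pattern still matches. This is only a sketch: it assumes your extra data is appended at the end of each line as ` my_extra="..."` (a made-up marker), and you would adapt the string handling to your real format.

```yaml
processors:
  - script:
      lang: javascript
      source: >
        function process(event) {
          var msg = event.Get("message");
          // Assumed custom suffix: ... my_extra="value"
          var idx = msg.indexOf(' my_extra="');
          if (idx !== -1) {
            // keep the custom value in its own field...
            event.Put("nginx_custom.extra", msg.substring(idx + 11, msg.length - 1));
            // ...and hand the module a standard-looking line
            event.Put("message", msg.substring(0, idx));
          }
        }
```

The trade-off is the same as before: you still maintain custom logic, but it lives in your own config instead of in files that yum owns.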