This is more or less a duplicate of the problem described here: Filebeat include is not working when logs are in both json and non-json format. That topic was closed without resolution.
We are trying to parse JSON content in live Jenkins build logs; we only want the JSON lines, not all the extra Jenkins log text.
prospector config:

- type: log
  paths:
    - "/var/jenkins_home/jobs/*/jobs/*/jobs/test/builds/*/log"
  json.keys_under_root: true
  fields_under_root: true
  json.message_key: job_facts.BuildNumber
  include_lines:
    - '^{'
Test Jenkins output we're trying to parse:
Started by user Weiss, Chris
[Pipeline] node
Running on c5275fad5e1d-46ed17a9 in /var/swarm-client/workspace/Release_Engineering_Core/sandbox/test
[Pipeline] {
[Pipeline] stage
[Pipeline] { (Preparation)
[Pipeline] echo
{"job_facts.BuildNumber": "24"}
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
Finished: SUCCESS
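The only line we want to keep from the output above is the one containing `{"job_facts.BuildNumber": "24"}`. That line is valid JSON on its own; a quick sanity check with Python's standard json module (just to confirm the payload itself decodes cleanly):

```python
import json

# The single JSON line from the Jenkins build log above.
line = '{"job_facts.BuildNumber": "24"}'

# Decodes without error into a one-field document.
doc = json.loads(line)
print(doc)  # {'job_facts.BuildNumber': '24'}
```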
With the include_lines statement in place, we get no output in Elasticsearch at all.
If we remove it, we get the (correctly deconstructed) JSON, but also all of the unwanted non-JSON log entries.
Either way, the filebeat log shows an "Error decoding JSON" entry for each non-JSON line in the log.
I've also tried many variations on the "include_lines" statement:
include_lines: ['^\{']
include_lines: '^{'
etc...
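For what it's worth, the pattern itself looks fine. A quick check with Python's re module (Filebeat uses Go's regexp package, but for a simple anchored brace like this the two should behave the same) shows `^\{` keeping exactly the line we want:

```python
import re

# Same anchored-brace pattern as in our include_lines setting.
pattern = re.compile(r'^\{')

# A few representative lines from the Jenkins log above.
lines = [
    'Started by user Weiss, Chris',
    '[Pipeline] echo',
    '{"job_facts.BuildNumber": "24"}',
    'Finished: SUCCESS',
]

matches = [line for line in lines if pattern.search(line)]
print(matches)  # ['{"job_facts.BuildNumber": "24"}']
```

So it seems the filtering step is not being applied to the raw line the way we expect, rather than the regex failing to match.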
I should add that we control the JSON structure of the content in the Jenkins logs; if the JSON is not formatted correctly for Filebeat, we can fix that.