exclude_lines regex for excluding all non-JSON logs is not working

Dear Elastic team,

My requirement is to exclude non-JSON lines from the file. The data coming into the log file is mainly JSON, but third-party libraries sometimes emit non-JSON single-line and multiline logs. The JSON logs are single-line only.

When exact strings are used in the regex, like exclude_lines: ['Resolving eureka endpoints','Fetching config from server','Located environment'], the lines starting with these phrases are excluded. If a regex ( ['^[^({).*].*'] ) is used instead, then all lines, including the JSON ones, are skipped.

It is difficult to list exact phrases to skip, as many of these logs come from third-party dependency libraries.

I have seen questions in the forum about how to exclude lines based on a regex, but I couldn't figure it out.

Beat version: 6.6.0; OS: CentOS 7
Thanks in advance.

Do your JSON log lines always start with { or [?

Thanks, Shaunak.
Our JSON logs always start with {

So what about something like this?

exclude_lines: ['^{']

Sorry Shaunak,

with exclude_lines: ['^{'], all lines, both JSON and non-JSON, are picked up and sent to Elasticsearch

Hmmm, that's quite odd. Would you mind posting a few sample lines from your logs, showing a mix of JSON and non-JSON log lines? Please remember to redact any sensitive information.

Dear Shaunak,

My log file contains the following data. Only the first few lines are shown here. The pattern is the same for all lines. All lines, including the JSON ones, are single-line.

I have tried an online regex tester ( https://regex101.com/ ) with another pattern, ^[^{]*, and it works as we need.
But when it is placed as-is, like exclude_lines: ^[^{]*, all the lines are skipped.
Sorry for posting links to another domain here.


Log file :
Resolving eureka endpoints via configuration
Fetching config from server at: http://ip:port
Located environment: name=demo-application, module-sql, profiles=[dev], label=dev, version=2c44c9204c072bc49611ee82d376d140f54da909, state=null
Resolving eureka endpoints via configuration
{"transactionId":"379e5326-7a31-4f68-96d4-ccf1f7e6d26c","systemName":"Unknown Computer","moduleName":"MODULENAME","apiName":"/apiName","userId":"123123123123222","timeStamp":"2019-02-08 09:18:54.129","status":"START"}
{"transactionId":"379e5326-7a31-4f68-96d4-ccf1f7e6d26c","systemName":"Unknown Computer","moduleName":"MODULENAME","apiName":"/apiName","userId":"123123123123222","timeStamp":"2019-02-08 09:18:54.129","status":"END"}

Thanks. I just tested with the samples you provided and was able to get only JSON logs to be indexed. That is, I was able to get non-JSON logs to be excluded.

I got confused by the double negative and had originally made the wrong suggestion. The correct setting you want is this:

include_lines: ['^{']

There should be no exclude_lines setting at all.
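The intended effect of include_lines: ['^{'] on the sample lines above can be sketched like this (include_lines keeps only lines the regex matches; this is a plain-regex simulation of the filter, not Filebeat's actual code path, and the JSON samples are shortened):

```python
import re

sample = [
    'Resolving eureka endpoints via configuration',
    'Fetching config from server at: http://ip:port',
    '{"transactionId":"379e5326","status":"START"}',  # shortened JSON sample
    '{"transactionId":"379e5326","status":"END"}',
]

# Keep only lines that start with a literal { brace.
kept = [line for line in sample if re.search(r'^{', line)]
print(len(kept))  # 2 -- only the JSON lines survive
```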

Shaunak

Dear Shaunak,

Tried your suggestion include_lines: ['^{'], and also removed exclude_lines. All lines are still skipped, and the index is not getting created in Elasticsearch. The following is the Filebeat log. It seems that whenever it encounters a non-JSON value, it prints this error:

    2019-02-14T18:02:46.133+0530    INFO    log/harvester.go:254    Harvester started for file: /logs/test/b1.json
    2019-02-14T18:02:46.133+0530    ERROR   json/json.go:51 Error decoding JSON: invalid character 'R' looking for beginning of value
    2019-02-14T18:02:46.133+0530    ERROR   json/json.go:51 Error decoding JSON: invalid character 'F' looking for beginning of value
    2019-02-14T18:02:46.136+0530    ERROR   json/json.go:51 Error decoding JSON: invalid character 'L' looking for beginning of value
    2019-02-14T18:02:46.138+0530    ERROR   json/json.go:51 Error decoding JSON: invalid character 'R' looking for beginning of value
    2019-02-14T18:02:46.143+0530    ERROR   json/json.go:51 Error decoding JSON: invalid character 'R' looking for beginning of value
    2019-02-14T18:02:46.145+0530    ERROR   json/json.go:51 Error decoding JSON: invalid character 'F' looking for beginning of value

Thanks for your time.
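Those ERROR lines line up with the non-JSON entries in the sample: the "invalid character" ('R', 'F', 'L') is the first letter of "Resolving...", "Fetching...", and "Located...". The json.* settings make Filebeat try to decode every line, and the same failure can be reproduced with any JSON parser, e.g. in Python:

```python
import json

try:
    json.loads('Resolving eureka endpoints via configuration')
except json.JSONDecodeError as err:
    # Analogous to Filebeat's Go-side error:
    # "invalid character 'R' looking for beginning of value"
    print(err)  # Expecting value: line 1 column 1 (char 0)
```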

Can you post your entire filebeat.yml configuration? Please make sure to mask any sensitive information. Thanks!

Dear Shaunak,

I have pasted my filebeat.yml below. Please suggest any corrections needed.

    filebeat.prospectors:
    - type: log
      paths:
        - /logs/test/*.json
      include_lines: ['^{']
      json.message_key: transactionId
      json.keys_under_root: true
      json.overwrite_keys: false
      json.add_error_key: true
    filebeat.registry_file: /var/lib/filebeat/registry

    output.logstash:
      hosts: ["elasticsearchip:port"]
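One thing worth checking in this configuration (an assumption based on the Filebeat json options documentation, not something confirmed in this thread): when json.message_key is set, Filebeat decodes each line as JSON first and applies include_lines/exclude_lines to the content of that key, not to the raw line. The value of transactionId is a UUID that never starts with {, so include_lines: ['^{'] would then drop every event, matching what was observed; it would also explain why the earlier exclude_lines: ['^{'] excluded nothing. A minimal sketch of a configuration that drops the json.* options and filters on the raw lines instead (JSON parsing could then happen downstream, e.g. in Logstash):

```yaml
filebeat.prospectors:
- type: log
  paths:
    - /logs/test/*.json
  # With no json.* settings, include_lines applies to the raw line,
  # keeping only lines that start with {.
  include_lines: ['^{']
filebeat.registry_file: /var/lib/filebeat/registry

output.logstash:
  hosts: ["elasticsearchip:port"]
```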

Dear Shaunak,

Could you please look into my Filebeat configuration posted in the previous message?

Thanks

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.