Filebeat: processing JSON log files that do not need processing

Hello Everybody,

I hope somebody can answer my question.

I have an app that writes logs in JSON format, and they are ALREADY prepared as messages for Elasticsearch.
Here is one example of what I have in the logs:

{"@timestamp":"2023-09-29T07:02:48.361Z","log.level":"info","log.label":null,"log.namespace":"SudreyestrQueue","message":113753266,"client":{"ip":null},"labels":["cid_a457abfc","job_edrsr_sync_docs","entityName__documents.csv"],"meta":{"doc_id":113753266,"cause_num":"554/3112/22","date_publ":"2023-09-29","content_length":198178,"visible_status":0,"stage":"proc_index","status":"ok","description":null},"state":{"entityName":"documents.csv","cid":"a457abfc"}}

What I see is that Filebeat does not correctly pass such log entries to Elasticsearch.
Is this doable at all, or should I have my logging rewritten?

- type: filestream

  # Unique ID among all inputs, an ID is required.
  id: dhimp-imports

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/ldb/log/app.filebeat.dev.log
  # Note: these json.* options come from the older log input and have no
  # effect on the filestream input; the ndjson parser below is the
  # filestream equivalent.
  json.keys_under_root: true
  json.overwrite_keys: true
  json.add_error_key: true
  json.expand_keys: true
  parsers:
    - ndjson:
        target: ""
        overwrite_keys: true
        add_error_key: true
        expand_keys: true

setup.template.overwrite: true

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~

Can you share an example of this? It is not clear what the issue is, as you didn't share the output you are getting or the expected output.

Of course.
Here is one of the messages from Filebeat:

Sep 29 19:22:04 sho-dhimp01 filebeat[1343909]: {"log.level":"warn","@timestamp":"2023-09-29T19:22:04.608+0300","log.logger":"elasticsearch","log.origin":{"file.name":"elasticsearch/client.go","file.line":429},"message":"Cannot index event publisher.Event{Content:beat.Event{Timestamp:time.Date(2023, time.September, 29, 16, 22, 4, 115000000, time.UTC), Meta:null, Fields:{\"agent\":{\"ephemeral_id\":\"3a23dcb0-34af-4f90-8ca0-75443e625f50\",\"id\":\"f72c0c66-9c35-4014-afbe-30bf7fb3844f\",\"name\":\"sho-dhimp01\",\"type\":\"filebeat\",\"version\":\"8.7.1\"},\"client\":{},\"container\":{\"id\":\"log\"},\"ecs\":{\"version\":\"8.0.0\"},\"host\":{\"architecture\":\"x86_64\",\"containerized\":false,\"hostname\":\"sho-dhimp01\",\"id\":\"64695261de8a47819a8e049b8ed02722\",\"ip\":[\"192.168.200.101\",\"fe80::603e:c5ff:feb9:36de\",\"172.17.0.1\",\"fe80::42:48ff:fed4:a05a\",\"172.18.0.1\",\"fe80::42:f1ff:fe0e:2c56\",\"fe80::30f4:7ff:fe62:d215\",\"fe80::49e:64ff:febd:9ef2\"],\"mac\":[\"02-42-48-D4-A0-5A\",\"02-42-F1-0E-2C-56\",\"06-9E-64-BD-9E-F2\",\"32-F4-07-62-D2-15\",\"62-3E-C5-B9-36-DE\"],\"name\":\"sho-dhimp01\",\"os\":{\"codename\":\"focal\",\"family\":\"debian\",\"kernel\":\"5.4.0-139-generic\",\"name\":\"Ubuntu\",\"platform\":\"ubuntu\",\"type\":\"linux\",\"version\":\"20.04.2 LTS (Focal Fossa)\"}},\"input\":{\"type\":\"filestream\"},\"labels\":[\"cid_ff8c0398\",\"job_queue_npas\"],\"log\":{\"file\":{\"path\":\"/var/ldb/nodejs-edrsr-importer/var/log/app.filebeat.dev.log\"},\"level\":\"info\",\"namespace\":\"CliLocker\",\"offset\":6228982453},\"message\":\"Successfully remove lock on stop application\",\"meta\":{},\"state\":{\"cid\":\"ff8c0398\"}}, Private:(*input_logfile.updateOp)(0xc002419470), TimeSeries:false}, Flags:0x1, Cache:publisher.EventCache{m:mapstr.M(nil)}} (status=400): {\"type\":\"document_parsing_exception\",\"reason\":\"[1:197] object mapping for [labels] tried to parse field [null] as object, but found a concrete value\"}, dropping event!","service.name":"filebeat","ecs.version":"1.6.0"}

That's from my systemctl status filebeat output.

And I expect Filebeat to send all messages from /var/ldb/log/app.filebeat.dev.log directly to Elasticsearch, as they are already in Elasticsearch format.

Sorry, but it is not clear where this message is from.

Is this from the Filebeat log? The issue is not clear.

What does the message look like in your source file, /var/ldb/log/app.filebeat.dev.log?

And what does the message look like in Elasticsearch? You can get it by going into Kibana, selecting a document, and looking at the JSON tab to copy and share the message.

OK, now it makes sense. Not sure if I missed this earlier or you added it later.

Your messages are being dropped by Elasticsearch because of a mapping conflict:

[1:197] object mapping for [labels] tried to parse field [null] as object, but found a concrete value

What does your Filebeat output configuration look like? You didn't share it.

If you didn't specify an index, it will use the default index/data stream and use the default template.

In the default template, the labels field is mapped as a JSON object, but in your event it is an array of plain strings (concrete values).

You will need to change it to be an object or rename the field.
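If you go the renaming route, Filebeat's rename processor can do it before the event is sent. A minimal sketch, assuming you rename labels to app_labels (the name app_labels is just an illustration; anything that doesn't clash with the built-in template works):

processors:
  - rename:
      fields:
        - from: "labels"
          to: "app_labels"   # illustrative target name, not a reserved field
      # don't drop or fail events that happen to have no labels field
      ignore_missing: true
      fail_on_error: false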

Here is my output section:

output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["https://elastic.domain.com:443"]
  
  # Protocol - either `http` (default) or `https`.
  # protocol: 

  # Authentication credentials - either API key or username/password.
  username: "filebeat_dhimp"
  password: "pass"

Yeah, it is using the default index with the default mapping, and as explained in the previous answer, you have a conflicting mapping.

Sorry, but what should I have then?

Filebeat uses a default data stream named filebeat-<version>; this default data stream uses a built-in template with some fields already mapped.

One of these fields is the labels field, which needs to be an object.

For example:

{ "labels": {"someField": "someValue" }

If your document has a labels field in this format, Elasticsearch will accept it, but your documents are sending the following labels field:

{ "labels": [ "someValue", "anotherValue" ] } 

In this case, the labels field is an array of strings, which results in a mapping conflict when indexing this data: Elasticsearch will reject the document and Filebeat will drop the event, which is exactly what the log you shared says.

To solve this you have two options.

Change your source to create the labels field as a JSON object, or use a different data stream.

To write your data to another index/data stream you need to follow this documentation, but you will also need to create a template for your new index.
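As a rough sketch of what that could look like in filebeat.yml (the index name dhimp-logs is just an example; creating a matching template whose labels mapping fits your data is still up to you):

output.elasticsearch:
  hosts: ["https://elastic.domain.com:443"]
  index: "dhimp-logs-%{+yyyy.MM.dd}"   # example name only

# Filebeat requires these two settings whenever the output index is changed
setup.template.name: "dhimp-logs"
setup.template.pattern: "dhimp-logs-*"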

It seems like you're encountering issues with Filebeat not correctly parsing the JSON logs for Elasticsearch. Your configuration appears to be correct for JSON logs, but it's possible that your log format might not align with the expected JSON structure. You may need to adjust your log format or consider custom parsing if necessary to ensure proper indexing into Elasticsearch.

Another solution would be to set a target field, so your JSON won't be decoded into the root of the document, which will avoid the mapping conflict.
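Something like this in the filestream input, where the target name app is only an example:

parsers:
  - ndjson:
      # decode the JSON under "app.*" instead of the event root, so the
      # decoded field becomes "app.labels" and no longer collides with
      # the built-in template's "labels" mapping
      target: "app"
      overwrite_keys: true
      add_error_key: true
      expand_keys: true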

The issue is not the parsing in this case, it is a mapping conflict on the labels field.
