Filebeat: processing JSON log files that do not need processing

Hello Everybody,

I hope somebody can answer my question.

I have an app that writes logs in JSON format, and they are ALREADY prepared as messages for Elasticsearch.
Here is one example of what I have in the logs:

{"@timestamp":"2023-09-29T07:02:48.361Z","log.level":"info","log.label":null,"log.namespace":"SudreyestrQueue","message":113753266,"client":{"ip":null},"labels":["cid_a457abfc","job_edrsr_sync_docs","entityName__documents.csv"],"meta":{"doc_id":113753266,"cause_num":"554/3112/22","date_publ":"2023-09-29","content_length":198178,"visible_status":0,"stage":"proc_index","status":"ok","description":null},"state":{"entityName":"documents.csv","cid":"a457abfc"}}

What I see is that Filebeat does not correctly pass such log entries to Elasticsearch.
Is this doable at all, or should I have my logging rewritten?

- type: filestream

  # Unique ID among all inputs, an ID is required.
  id: dhimp-imports

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/ldb/log/app.filebeat.dev.log
  # Note: these json.* options come from the older log input and have no
  # effect on the filestream input; the ndjson parser below is the
  # filestream equivalent.
  json.keys_under_root: true
  json.overwrite_keys: true
  json.add_error_key: true
  json.expand_keys: true
  parsers:
    - ndjson:
        target: ""
        overwrite_keys: true
        add_error_key: true
        expand_keys: true

setup.template.overwrite: true

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~

Can you share an example of this? It is not clear what the issue is, as you didn't share the output you are getting or the expected output.

Of course.
Here is one of the messages from Filebeat:

Sep 29 19:22:04 sho-dhimp01 filebeat[1343909]: {"log.level":"warn","@timestamp":"2023-09-29T19:22:04.608+0300","log.logger":"elasticsearch","log.origin":{"file.name":"elasticsearch/client.go","file.line":429},"message":"Cannot index event publisher.Event{Content:beat.Event{Timestamp:time.Date(2023, time.September, 29, 16, 22, 4, 115000000, time.UTC), Meta:null, Fields:{\"agent\":{\"ephemeral_id\":\"3a23dcb0-34af-4f90-8ca0-75443e625f50\",\"id\":\"f72c0c66-9c35-4014-afbe-30bf7fb3844f\",\"name\":\"sho-dhimp01\",\"type\":\"filebeat\",\"version\":\"8.7.1\"},\"client\":{},\"container\":{\"id\":\"log\"},\"ecs\":{\"version\":\"8.0.0\"},\"host\":{\"architecture\":\"x86_64\",\"containerized\":false,\"hostname\":\"sho-dhimp01\",\"id\":\"64695261de8a47819a8e049b8ed02722\",\"ip\":[\"192.168.200.101\",\"fe80::603e:c5ff:feb9:36de\",\"172.17.0.1\",\"fe80::42:48ff:fed4:a05a\",\"172.18.0.1\",\"fe80::42:f1ff:fe0e:2c56\",\"fe80::30f4:7ff:fe62:d215\",\"fe80::49e:64ff:febd:9ef2\"],\"mac\":[\"02-42-48-D4-A0-5A\",\"02-42-F1-0E-2C-56\",\"06-9E-64-BD-9E-F2\",\"32-F4-07-62-D2-15\",\"62-3E-C5-B9-36-DE\"],\"name\":\"sho-dhimp01\",\"os\":{\"codename\":\"focal\",\"family\":\"debian\",\"kernel\":\"5.4.0-139-generic\",\"name\":\"Ubuntu\",\"platform\":\"ubuntu\",\"type\":\"linux\",\"version\":\"20.04.2 LTS (Focal Fossa)\"}},\"input\":{\"type\":\"filestream\"},\"labels\":[\"cid_ff8c0398\",\"job_queue_npas\"],\"log\":{\"file\":{\"path\":\"/var/ldb/nodejs-edrsr-importer/var/log/app.filebeat.dev.log\"},\"level\":\"info\",\"namespace\":\"CliLocker\",\"offset\":6228982453},\"message\":\"Successfully remove lock on stop application\",\"meta\":{},\"state\":{\"cid\":\"ff8c0398\"}}, Private:(*input_logfile.updateOp)(0xc002419470), TimeSeries:false}, Flags:0x1, Cache:publisher.EventCache{m:mapstr.M(nil)}} (status=400): {\"type\":\"document_parsing_exception\",\"reason\":\"[1:197] object mapping for [labels] tried to parse field [null] as object, but found a concrete value\"}, dropping event!","service.name":"filebeat","ecs.version":"1.6.0"}

That's from my systemctl status filebeat output.

And I expect Filebeat to send all messages from /var/ldb/log/app.filebeat.dev.log directly to Elasticsearch, as they are already in Elasticsearch format.

Sorry, but it is not clear where this message is from.

Is this from the Filebeat log? The issue is not clear.

What does the message look like in your source file, /var/ldb/log/app.filebeat.dev.log?

And what does the message look like in Elasticsearch? You can get it by going into Kibana, selecting a document, and looking at the JSON tab to copy and share the message.

OK, now it makes sense. Not sure if I missed this earlier or you added it later.

Your messages are being dropped by Elasticsearch because of a mapping conflict:

[1:197] object mapping for [labels] tried to parse field [null] as object, but found a concrete value

What does your Filebeat output configuration look like? You didn't share it.

If you didn't specify an index, it will use the default index/data stream and use the default template.

In the default template, the labels field is mapped as a JSON object, but in your event it is an array of plain strings (concrete values).

You will need to change it to be an object or rename the field.
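If you go the renaming route, Filebeat's rename processor can do it before the event is sent. A minimal sketch, assuming you rename labels to app_labels (the name app_labels is just an illustration; anything that doesn't clash with the built-in template works):

processors:
  - rename:
      fields:
        - from: "labels"
          to: "app_labels"   # illustrative target name, not a reserved field
      # don't drop or fail events that happen to have no labels field
      ignore_missing: true
      fail_on_error: false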

Here is my output section:

output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["https://elastic.domain.com:443"]
  
  # Protocol - either `http` (default) or `https`.
  # protocol: 

  # Authentication credentials - either API key or username/password.
  username: "filebeat_dhimp"
  password: "pass"

Yeah, it is using the default index with the default mapping, and as explained in the previous answer, you have a conflicting mapping.

Sorry, but what should I have then?

Filebeat uses a default data stream named filebeat-<version>; this default data stream uses a built-in template with some fields already mapped.

One of these fields is the labels field, which needs to be an object.

For example:

{ "labels": {"someField": "someValue" }

If your document has a labels field in this format, Elasticsearch will accept it, but your documents are sending the following labels field:

{ "labels": [ "someValue", "anotherValue" ] } 

In this case, the labels field is an array of strings, which results in a mapping conflict when indexing this data: Elasticsearch will reject the document and Filebeat will drop the event, which is exactly what the log you shared says.

To solve this you have two options.

Change your source to create the labels field as a JSON object, or use a different data stream.

To write your data to another index/data stream you need to follow this documentation, but you will also need to create a template for your new index.
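As a rough sketch of what that could look like in filebeat.yml (the index name dhimp-logs is just an example; creating a matching template whose labels mapping fits your data is still up to you):

output.elasticsearch:
  hosts: ["https://elastic.domain.com:443"]
  index: "dhimp-logs-%{+yyyy.MM.dd}"   # example name only

# Filebeat requires these two settings whenever the output index is changed
setup.template.name: "dhimp-logs"
setup.template.pattern: "dhimp-logs-*"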

It seems like you're encountering issues with Filebeat not correctly parsing the JSON logs for Elasticsearch. Your configuration appears to be correct for JSON logs, but it's possible that your log format might not align with the expected JSON structure. You may need to adjust your log format or consider custom parsing if necessary to ensure proper indexing into Elasticsearch.

Another solution would be to set a target field, so your JSON won't be decoded into the root of the document, which will avoid the mapping conflict.
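Something like this in the filestream input, where the target name app is only an example:

parsers:
  - ndjson:
      # decode the JSON under "app.*" instead of the event root, so the
      # decoded field becomes "app.labels" and no longer collides with
      # the built-in template's "labels" mapping
      target: "app"
      overwrite_keys: true
      add_error_key: true
      expand_keys: true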

The issue is not the parsing in this case, it is a mapping conflict on the labels field.
