Filebeat splits message after 16k

paltryeffort · March 13, 2018, 10:43am

# filebeat 6.2.2
filebeat.prospectors:
-
  paths: [ "/var/lib/docker/containers/*/*-json.log" ]
  harvester_buffer_size: 65536
  json.message_key: log
  json.keys_under_root: true
  json.add_error_key: true
  fields_under_root: true
  processors:
    - add_docker_metadata: ~

harvester_buffer_size: 65536

output.file:
  path: "/tmp/filebeat"
  filename: filebeat

The docker container outputs a single line in valid json (about 18k)
The value of the "log" message key is also a single line in valid json
The message seems to be cut off at about 16k or a bit above (depends if you count the backslashes for escaping)
A second message gets created with the remaining part of the message including full decoration (docker meta data, additional fields etc)
Looks like filebeat splits the message into 2 separate ones
harvester_buffer_size has no effect
removing the json.* options has no effect

exekias · March 13, 2018, 11:12am

Hi @paltryeffort,

We introduced the docker prospector, it handles JSON decoding and timestamp retrieval for you.

Could you confirm if this is still happening with it? Conf would look like this:

filebeat.prospectors:
- type: docker
  containers.ids:
    - "*"
  processors:
    - add_docker_metadata: ~

paltryeffort · March 13, 2018, 11:50am

Hi @exekias ,
thanks for the fast response.
Using your config the message gets cut off at the exact same position.
I tried it with and without the harvester_buffer_size option.

jsamphillips · March 15, 2018, 2:51pm

We're having this exact same issue. Nothing I have tweaked seems to change this behavior. Interested to see what the solution is

paltryeffort · March 15, 2018, 4:32pm

How to reproduce:
use my config, start a container and run the following:

#!/bin/bash
key="somerandomkey_"
value="somerandomvalue_"
echo -n '{'
for i in $(seq 420); do
  echo -n "\"${key}${i}\":\"${value}${i}\","
done
echo '"lastkey":"end"}'

This produces a valid json output. For me it cuts off at "somerand" and a new message continues with "omkey_396".

jsamphillips · March 20, 2018, 6:25pm

Bump on this. @exekias is there anything else we can try to do to resolve this issue?

bitsofinfo · March 20, 2018, 7:24pm

issued added here, please comment on it with your experiences to get some traction going, its definitely a deal breaker!

ruflin · March 22, 2018, 1:48pm

Thanks for opening the issue. We will try to find some time to look into it.

bitsofinfo · March 22, 2018, 2:12pm

Pretty urgent for us, may have to use else due to this, for example apps that generate stack traces or if you are consuming modsecurity audit logs, sizes of entries like this are pretty typical and w/ fbeats we are not sure how to re-correlate this data in logstash?

any suggestions from elastic for us and others dealing with this?

ruflin · March 26, 2018, 9:01am

For everyone coming to this thread, discussion continues in https://github.com/elastic/beats/issues/6605

system · April 23, 2018, 9:01am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Filebeat harvester cannot collect logs across files Beats docker , filebeat	1	339	June 26, 2020
Filebeat breaking up docker log lines Beats filebeat	3	851	September 26, 2018
Filebeat inptu container rotation json parser error Beats docker , filebeat	6	595	June 23, 2020
Filebeat decoding Docker logs error Beats filebeat	5	1048	July 21, 2018
Filebeat lost lines from docker json log file Beats docker , filebeat	1	407	July 24, 2020

Filebeat splits message after 16k

Related topics