Architecturally should filebeat handle nested json parsing or should it be moved to logstash?

surajs · August 31, 2016, 4:55am

We're trying to parse logs from Docker's json-file logging driver, which produces lines in the following format:
{"log":"{\"msg\":\"Event started\", \"ts\":\"2016-08-30:01:02:03.000\"}"}

If you have thousands of Docker containers all writing logs in a similar (but not exactly the same) format, would it be sound to unmarshal the string-ified JSON and convert it to a JSON object in Filebeat, so that the Logstash configuration can be kept as simple as possible?

Or should something like this be implemented in Logstash?
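
To make the question concrete, here is a minimal Python sketch of the unwrapping step, wherever it ends up living. It assumes the inner quotes are backslash-escaped in the file on disk (as valid JSON requires) and that the wrapped payload sits under the `log` key; the function name `unwrap_docker_line` and the `message_key` parameter are made up for illustration and are not part of Filebeat or Logstash.

```python
import json


def unwrap_docker_line(raw_line, message_key="log"):
    """Decode one json-file line, then decode the string-ified JSON inside it."""
    event = json.loads(raw_line)  # first level: the Docker wrapper object
    inner = event.get(message_key)
    if isinstance(inner, str):
        try:
            parsed = json.loads(inner)  # second level: the application's JSON
        except ValueError:
            return event  # plain-text log line: leave the string untouched
        if isinstance(parsed, dict):
            event[message_key] = parsed  # replace the string with the object
    return event


# Hypothetical sample line, escaped the way it would appear on disk.
line = r'{"log":"{\"msg\":\"Event started\", \"ts\":\"2016-08-30:01:02:03.000\"}"}'
print(unwrap_docker_line(line))
```

Falling back to the original string when the second decode fails matters here, because services that log in a similar but not identical format may emit plain text on some lines.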

magnusbaeck · August 31, 2016, 6:51am

At some point we have to stop with the feature bloat of Filebeat. Parsing the first level of JSON is fine, but IMHO supporting nested JSON is too much.

steffens · August 31, 2016, 10:44am

filebeat is mostly a shipper. If you need more customized processing, logstash is the way to go.

Personally I'm no big fan of Docker-based logging, especially the json-file logger. The problems with Docker logging are: by default the JSON log file grows infinitely (it is not bounded); the log file is deleted if the container gets deleted (did you forward all logs yet?); JSON in JSON(?); how about multiline JSON in JSON(?); and all logs are captured and forwarded through workers in the Docker daemon itself (do not start hundreds of containers, or watch memory usage!).

surajs · August 31, 2016, 5:15pm

@steffens, @magnusbaeck,

Fortunately or unfortunately, depending on how you see it, we already use Docker-based JSON logging, and changing that would require significant operational overhead. Also, with regard to keeping Filebeat feature-light, I see your concern: you don't want Logstash-like functionality implemented in the shipper itself. However, since Docker's json-file format, which wraps JSON as strings (something I'm not extremely fond of), is fairly common, I wouldn't completely rule out the shipper owning some nested JSON parsing.

Given that we already have logs from ~200 different services being pushed into ES, implementing this JSON unwrapping in Filebeat is a trade-off that we may have to make in order to avoid changing all the Logstash filters.

I'll keep you posted on the approach I take for this.

steffens · September 1, 2016, 11:19am

By the way, Elasticsearch 5.0 gains support for ingest nodes (which offer a subset of the Logstash filters). While a JSON filter is not yet implemented, you might want to watch this ticket.

system · September 21, 2016, 4:56am

This topic was automatically closed after 21 days. New replies are no longer allowed.
