Using tshark -e <field> with -T ek switch, parsing json, filebeat

dataGuy · May 30, 2019, 11:12pm

Thanks for allowing me in the forums. I'm new to the Elastic Stack and this is my first post, so please be gentle.

Tools:
Windows 10.1804
(Admin)PowerShell
Elastic Version 7.0.1

Problem:
When harvesting a json file generated by using
.\tshark -i eth0 -T ek -e ip.src > somefile.json

I get an error from Filebeat:
ERROR readjson/json.go:52 Error decoding JSON: invalid character 'ÿ' looking for beginning value
followed by:
ERROR readjson/json.go:52 Error decoding JSON: invalid character '\x00' looking for beginning value

The latter repeats in a hanging loop. These are caused by the json.keys_under_root setting.
Despite the error, data is sent to Elasticsearch, but that data contains what look like whitespace characters between each character, e.g.:
"192.168.0.1" becomes ·[·"·1·9·2·.·1·6·8·.·0·.·1·"·]·
"timestamp" becomes ·"·t·i·m·e·s·t·a·m·p·"·
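
The 'ÿ' and '\x00' characters in these errors are consistent with the file being UTF-16LE rather than UTF-8. Windows PowerShell's `>` redirection writes UTF-16LE with a byte-order mark by default, which would also explain the apparent whitespace between characters. A minimal Python sketch of that effect, assuming that encoding (the sample line is made up for illustration and is not taken from the actual file):

```python
import codecs

# Hypothetical sample line, shaped like the output described in the post.
line = '{"ip_src": ["192.168.0.1"]}\n'

# Assumption: the file was written as UTF-16LE with a byte-order mark (BOM).
raw = codecs.BOM_UTF16_LE + line.encode("utf-16-le")

# The first BOM byte is 0xFF, which displays as 'ÿ' when read as Latin-1.
first_char = raw[:1].decode("latin-1")

# In UTF-16LE every ASCII character is followed by a NUL (0x00) byte.
nul_count = raw.count(b"\x00")

# Decoding with the matching codec recovers the clean text
# (the "utf-16" codec consumes the BOM).
recovered = raw.decode("utf-16")

print(repr(first_char), nul_count, recovered == line)
```

If this is the cause, the things to check would be how the file is written (for example, producing UTF-8 output instead of relying on `>`) or the `encoding` option of the Filebeat log input.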

The JSON file appears to be perfect and exactly matches what you get from:
tshark -i eth0 -T ek > somefile.json

except that brackets surround the fields I asked to collect, which makes me think Elasticsearch may be treating the values as arrays. That doesn't explain the timestamp field, though. This happens with json.keys_under_root regardless, and the content is put in a message field that shouldn't exist.
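
On the brackets: Elasticsearch has no dedicated array type, and any field can hold one or more values, so a single-element array should normally be indexed like the plain value. If scalar values are still wanted, a small post-processing step could unwrap them. A rough Python sketch, assuming each line looks like the hypothetical sample below (the `layers` key, the timestamp value, and the helper name `unwrap_singletons` are illustrative assumptions, not taken from the actual file):

```python
import json


def unwrap_singletons(value):
    """Recursively replace one-element lists with their single element."""
    if isinstance(value, dict):
        return {key: unwrap_singletons(item) for key, item in value.items()}
    if isinstance(value, list):
        items = [unwrap_singletons(item) for item in value]
        return items[0] if len(items) == 1 else items
    return value


# Hypothetical line in the shape described in the post.
sample = '{"timestamp": "1559257920000", "layers": {"ip_src": ["192.168.0.1"]}}'

flat = unwrap_singletons(json.loads(sample))
print(json.dumps(flat))
```

Lists with more than one element are left as lists, so fields that genuinely repeat within a packet keep all their values.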

In my filebeat.yml I use:
input: -type: log json.keys_under_root: true

or I use the following, which produces the other problem:

processors:
- decode_json_fields:
["timestamp"], ["ip_src"]

I'm open to suggestions. This seems like a relevant issue: the data is being generated properly, except that the values may be getting treated as arrays. Getting tshark not to generate the brackets around the fields would be best; failing that, some way for Elasticsearch or Filebeat to handle them would do. Ultimately my goal is to use the -e <field> switch together with the -T ek switch. Anything constructive would be appreciated. It is crucial, however, that I keep pre- or post-processing in the pipeline to a minimum, and I would prefer not to write a parser to handle this. For the sake of the concept I am trying not to use Logstash at the moment, though I'm sure Grok could help with this.

Regards,
dataGuy

kvch · June 3, 2019, 7:55am

Your configuration seems incorrect. (Also, when you are sharing config snippets, please format them using </>.) You don't need to set json.keys_under_root if you are using the decode_json_fields processor.

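filebeat.inputs:
- type: log
  enabled: true
  paths:
    - somefile.json  # placeholder: path to the tshark output file
processors:
- decode_json_fields:
    fields: ["message"]  # field that holds the raw JSON line
    target: ""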

Why do you have "timestamp" and "ip_src" in your configuration? Do you want to process those fields? Could you please share an example log?

kvch · June 5, 2019, 9:28am

Could you also share your debug logs when reading the input file from the beginning? (./filebeat -e -d "*")

dataGuy · June 11, 2019, 2:50am

I've removed most of this message, as it's being ignored and I won't be monitoring it anymore. If you would like to continue, let me know @kvch.

system · July 9, 2019, 2:54am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
