Using tshark -e <field> with the -T ek switch, parsing JSON, Filebeat

Thanks for allowing me into the forums. I'm new to the Elastic Stack and this is my first post, so please be gentle.

Tools:
Windows 10 (1804)
PowerShell (Admin)
Elastic Version 7.0.1

Problem:
When harvesting a JSON file generated with:
.\tshark -i eth0 -T ek -e ip.src > somefile.json

  • I get an error from Filebeat:
    ERROR readjson/json.go:52 Error decoding JSON: invalid character 'ÿ' looking for beginning of value

  • followed by:
    ERROR readjson/json.go:52 Error decoding JSON: invalid character '\x00' looking for beginning of value

The latter repeats in a hanging loop. These errors appear to be triggered by json.keys_under_root.
Despite the errors, data is sent to Elasticsearch; however, that data contains what look like whitespace characters between each character, e.g.:
"192.168.0.1" becomes ·[·"·1·9·2·.1·6·9·.0·.1·."·]·
"timestamp" becomes ·"·t·i·m·e·s·t·a·m·p·"·

The JSON file appears to be well-formed and is an exact match for the output of:
tshark -i eth0 -T ek > somefile.json

except that brackets surround the fields I asked to collect, which makes me think Elasticsearch may be treating the values as arrays, though that doesn't explain the timestamp field. This happens regardless of json.keys_under_root, and the data is put in a message field that shouldn't exist.
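For reference, a single packet in the -e version of the file is a pair of NDJSON lines which, sketched from memory (the index name and values here are illustrative), look roughly like:

{"index": {"_index": "packets-2019-05-20", "_type": "doc"}}
{"timestamp": "1558363686925", "layers": {"ip_src": ["192.168.0.1"]}}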

In my filebeat.yml I use either:

input:
- type: log
  json.keys_under_root: true

or the following, which produces the other problem:

processors:
- decode_json_fields:
    fields: ["timestamp", "ip_src"]

I'm open to suggestions. This seems like a relevant issue: the data is being generated properly, except that the values appear to be treated as arrays. Getting tshark to stop generating the brackets around the fields would be best, but some way to get Elasticsearch or Filebeat to handle them would also work. Ultimately my goal is to use the
-e <field> switch together with the -T ek switch. Anything constructive would be appreciated. It is crucial, however, that I keep pre- or post-processing in the pipeline to a minimum; I would prefer not to write a parser to handle this. For the sake of the proof of concept I am avoiding Logstash for now, though I'm sure Grok could help with this.

Regards,
dataGuy

Your configuration seems incorrect. (Also, when you are sharing config snippets, please format them using </>.) You don't need to set json.keys_under_root if you are using the decode_json_fields processor.

filebeat.inputs:
- type: log
  enabled: true
processors:
- decode_json_fields:
    target: ""
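Note that decode_json_fields also needs a fields setting naming where the raw JSON lives; with the log input each line arrives in the message field, so a fuller sketch would be something like:

processors:
- decode_json_fields:
    fields: ["message"]
    target: ""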

Why do you have "timestamp" and "ip_src" in your configuration? Do you want to process those fields? Could you please share an example log?

Could you also share your debug logs when reading the input file from the beginning? (./filebeat -e -d "*")
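(On Windows/PowerShell that would be roughly .\filebeat.exe -e -d "*", run from the Filebeat install directory.)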

I've removed most of this message as it's being ignored and I won't be monitoring it anymore. If you would like to continue, let me know @kvch

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.