I'm using Filebeat 7.4.2 to read files from a directory. Each file contains one valid JSON document per line. Filebeat reads each JSON row and sends it to a Logstash (v7.4.2) instance, which forwards the data to Elasticsearch (v7.4.2).
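For context, the relevant part of my Logstash pipeline is essentially the following (reconstructed: the beats port matches my filebeat.yml below and the json filter is the one named in the warning; the elasticsearch output and index name are simplified and illustrative):

input {
  beats {
    port => 5043
  }
}
filter {
  json {
    source => "message"   # parse each line shipped by Filebeat
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]   # illustrative
    index => "stalker"            # illustrative
  }
}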
Every now and then, Logstash prints an error like the following:
[2019-12-05T09:05:16,378][WARN][logstash.filters.json][main] Error parsing json {:source=>"message", :raw=>"/www.super-bike.ch/it/caschi-modulari/21111-165734-ls2-ff313-vortex-solid-matt-carbon.html#/2-taglia-s?SubmitCurrency=1&id_currency=1\",\"image_path_locale\":\"wasabi_dragon/images_normalized/9430/73be873540f797cf0855ccb2c03b23575f46762df5e9b0cfb7b0f832fd2bd1a8.jpg\",\"image_path_cdn\":\"wasabi_dragon/images/9430/73be873540f797cf0855ccb2c03b23575f46762df5e9b0cfb7b0f832fd2bd1a8.jpg\"}", :exception=>#<LogStash::Json::ParserError: Unexpected character ('/' (code 47)): maybe a (non-standard) comment? (not recognized as one since Feature 'ALLOW_COMMENTS' not enabled for parser)
It seems the JSON is not valid because it has been truncated somewhere: the raw string begins mid-URL with a '/', which the parser misreads as the start of a (non-standard) comment, hence the ALLOW_COMMENTS hint in the exception.
Investigating, I found the original file containing the line that causes the error. The file contains these two rows:
{"url":"https://www.demon-tweeks.com/uk/mishimoto-weighted-shift-knob-1147808/","image_path_locale":"wasabi_dragon/images_normalized/330763/f05af6261cf450c191e8740c677007e41d290d162eec3bf25ca0ce607b149603.jpg","image_path_cdn":"wasabi_dragon/images/330763/f05af6261cf450c191e8740c677007e41d290d162eec3bf25ca0ce607b149603.jpg"}
{"url":"https://www.super-bike.ch/it/caschi-modulari/21111-165734-ls2-ff313-vortex-solid-matt-carbon.html#/2-taglia-s?SubmitCurrency=1&id_currency=1","image_path_locale":"wasabi_dragon/images_normalized/9430/73be873540f797cf0855ccb2c03b23575f46762df5e9b0cfb7b0f832fd2bd1a8.jpg","image_path_cdn":"wasabi_dragon/images/9430/73be873540f797cf0855ccb2c03b23575f46762df5e9b0cfb7b0f832fd2bd1a8.jpg"}
Those rows are well formed, and the strange part is that if I now rename the file and put it back in the Filebeat directory, everything works perfectly and the data reaches Elasticsearch.
I enabled debug logging for Filebeat and found that the published message is already wrong at that stage:
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: 2019-12-05T09:05:15.335+0100#011DEBUG#011[processors]#011processing/processors.go:183#011Publish event: {
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "@timestamp": "2019-12-05T08:05:15.335Z",
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "@metadata": {
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "beat": "filebeat",
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "type": "_doc",
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "version": "7.4.2"
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: },
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "message": "/www.super-bike.ch/it/caschi-modulari/21111-165734-ls2-ff313-vortex-solid-matt-carbon.html#/2-taglia-s?SubmitCurrency=1&id_currency=1\",\"image_path_locale\":\"wasabi_dragon/images_normalized/9430/73be873540f797cf0855ccb2c03b23575f46762df5e9b0cfb7b0f832fd2bd1a8.jpg\",\"image_path_cdn\":\"wasabi_dragon/images/9430/73be873540f797cf0855ccb2c03b23575f46762df5e9b0cfb7b0f832fd2bd1a8.jpg\"}",
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "log": {
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "offset": 367,
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "file": {
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "path": "/path/to/json/2019_12_5_9_5_25624.json"
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: }
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: },
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "fields": {
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "log_type": "stalker"
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: },
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "input": {
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "type": "log"
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: },
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "ecs": {
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "version": "1.1.0"
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: },
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "host": {
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "name": "ES-NVMe1"
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: },
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "agent": {
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "id": "651bec87-4a0e-4edb-b5e3-795d8fd54031",
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "version": "7.4.2",
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "type": "filebeat",
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "ephemeral_id": "92a2cfde-bb87-4cce-8ae6-8642e183b427",
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: "hostname": "ES-NVMe1"
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: }
Dec 5 09:05:15 ES-NVMe1 filebeat[1327]: }
So something is going wrong inside Filebeat itself, but it is not deterministic.
Here is my filebeat.yml:
filebeat.inputs:
- type: log
  close_eof: true
  paths:
    - /path/to/json/*
  fields: {log_type: stalker}

output.logstash:
  hosts: ["127.0.0.1:5043"]

logging.level: debug
logging.to_files: true
logging.to_syslog: false
logging.files:
  path: /var/log/filebeat
  name: mybeat.log
  keepfiles: 7
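One test I am considering is letting Filebeat decode the JSON itself instead of shipping raw lines, using the log input's json options (a sketch based on the 7.x docs; it keeps my existing fields):

filebeat.inputs:
- type: log
  close_eof: true
  paths:
    - /path/to/json/*
  fields: {log_type: stalker}
  json.keys_under_root: true   # decode each line in Filebeat itself
  json.add_error_key: true     # add error.message to the event if decoding fails

That way a truncated line should at least surface as a decode error on the Filebeat side rather than only in Logstash.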
I cannot reproduce the error on demand; it happens every now and then and I don't know why.
Do you have any idea that could help me?
Are there any other tests I should run?
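In the meantime I am thinking of diverting events that fail the json filter to a file so I can inspect the raw payloads (a sketch; the json filter tags failures with _jsonparsefailure by default, and the output path is just an example):

output {
  if "_jsonparsefailure" in [tags] {
    file {
      path => "/var/log/logstash/json_failures.log"   # example path
      codec => line { format => "%{message}" }        # dump the raw message
    }
  } else {
    elasticsearch {
      hosts => ["localhost:9200"]   # illustrative, as above
      index => "stalker"
    }
  }
}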
Thanks
Marco