Importing metricbeat events from a JSON file to Elasticsearch using filebeat

Hi,

I have a case where metricbeat can't deliver messages directly to Elasticsearch; instead, it writes JSON events to a file and filebeat later delivers them to Elasticsearch. Unfortunately, I can't use "json.keys_under_root" in filebeat if the JSON already contains "@metadata" fields. Filebeat fails with this error:

2019-12-18T23:42:33.354+0200 DEBUG [publish] pipeline/client.go:193 Pipeline client receives callback 'onFilteredOut' for event: %+v{0001-01-01 00:00:00 +0000 UTC null null { true 0xc42054c750 /tmp/metricbeat 613 2019-12-18 23:42:33.350883336 +0200 EET m=+0.026620145 -1ns log map 1483843-2050}}

Steps to reproduce:

Create a JSON file with system metrics in it by running metricbeat with this config:

metricbeat.modules:
- module: system
  period: 30s
  metricsets:
    - cpu
    - load
    - memory
    - network
    - process
    - process_summary
    - diskio
- module: system
  period: 1m
  metricsets:
    - filesystem
    - fsstat
  processors:
  - drop_event.when.regexp:
      system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib)($|/)'
output.file:
  path: "/tmp"
  filename: metricbeat.json

After that, try to deliver this file to Elasticsearch with filebeat using the following config:

filebeat.inputs:
- type: log
  paths: ["/tmp/metricbeat.json"]
  json.keys_under_root: true
  json.overwrite_keys: true
output.logstash:
  hosts: ["my.server.com:5555"]
logging.level: debug

Filebeat refuses to process the JSON if it already contains "@metadata" (with beat, type and version fields). Is there any workaround for this? I already tried using "processors" to remove these fields on both sides (metricbeat and filebeat), but it looks like you can't remove system fields.
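For reference, this is a sketch of the processor configuration I tried (on both the metricbeat and filebeat side); it has no effect on "@metadata":

```yaml
processors:
  - drop_fields:
      fields: ["@metadata"]
```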

Hi @dimuskin,

This seems to be a known issue: https://github.com/elastic/beats/issues/6381 But I am afraid that I don't know of any workaround :frowning:

As you are also using Logstash, one thing you could try is to remove the json options from filebeat and do the JSON parsing in Logstash.
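A minimal sketch of that approach, assuming the raw JSON line arrives in the default `message` field:

```
filter {
  json {
    # Parse the raw metricbeat event into top-level fields
    source => "message"
    # Drop the original raw line once parsed
    remove_field => ["message"]
  }
}
```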

Thanks for the detailed report!

@jsoriano thank you for the fast response. I made a little workaround:

  1. Disabled "json.keys_under_root" on the filebeat side, which means the JSON is placed under a "json" key in the output document.

  2. Added an additional filter stage to the Logstash pipeline:

         # The ruby filter below can't set @timestamp directly,
         # so parse it with the date filter first
         date {
             match => [ "[json][@timestamp]", "ISO8601" ]
             timezone => "Etc/UTC"
             remove_field => "[json][@timestamp]"
         }
         ruby {
             code => '
                 event.get("json").each { |k, v|
                     event.set(k,v)
                 }
                 event.remove("json")
             '
         }
    

but this is a terrible solution: performance drops drastically, and it requires a Logstash layer between filebeat and Elasticsearch.

It would be nice to get around this problem :slight_smile:

@dimuskin if you want to remove Logstash from the equation, you can also use ingest pipelines in Elasticsearch. They also provide a JSON processor, so you could send the raw logs with filebeat and do the JSON parsing in Elasticsearch.
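A minimal sketch, assuming a hypothetical pipeline name `metricbeat-json` and that the raw JSON line is in the `message` field:

```
PUT _ingest/pipeline/metricbeat-json
{
  "description": "Parse metricbeat events shipped as JSON lines",
  "processors": [
    {
      "json": {
        "field": "message",
        "add_to_root": true
      }
    }
  ]
}
```

Then point filebeat's Elasticsearch output at that pipeline (host is a placeholder):

```yaml
output.elasticsearch:
  hosts: ["my.server.com:9200"]
  pipeline: metricbeat-json
```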