Importing Metricbeat events from a JSON file to Elasticsearch using Filebeat

Hi,

I have a case where Metricbeat can't deliver messages directly to Elasticsearch; instead, it writes JSON events to a file, and Filebeat later delivers them to Elasticsearch. Unfortunately, I can't use "json.keys_under_root" in Filebeat if the JSON already contains "@metadata" fields. Filebeat fails with this error:

2019-12-18T23:42:33.354+0200 DEBUG [publish] pipeline/client.go:193 Pipeline client receives callback 'onFilteredOut' for event: %+v{0001-01-01 00:00:00 +0000 UTC null null { true 0xc42054c750 /tmp/metricbeat 613 2019-12-18 23:42:33.350883336 +0200 EET m=+0.026620145 -1ns log map 1483843-2050}}

Steps to reproduce:

Create a JSON file with system metrics inside by running Metricbeat with the following configuration:

metricbeat.modules:
- module: system
  period: 30s
  metricsets:
    - cpu
    - load
    - memory
    - network
    - process
    - process_summary
    - diskio
- module: system
  period: 1m
  metricsets:
    - filesystem
    - fsstat
  processors:
  - drop_event.when.regexp:
      system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib)($|/)'
output.file:
  path: "/tmp"
  filename: metricbeat.json

After that, try to deliver this file to the ELK stack with Filebeat using the following config:

filebeat.inputs:
- type: log
  paths: ["/tmp/metricbeat.json"]
  json.keys_under_root: true
  json.overwrite_keys: true
output.logstash:
  hosts: ["my.server.com:5555"]
logging.level: debug

Filebeat refuses to process the JSON if it already contains "@metadata" (with beat, type and version fields). Is there any workaround for this? I already tried to use "processors" to remove these fields on both sides (Metricbeat and Filebeat), but it looks like you can't remove system fields.
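For reference, the processors configuration I tried looked roughly like the sketch below; it seems to have no effect, presumably because "@metadata" is handled as a system field:

processors:
  # Attempt to drop the metadata produced by Metricbeat (does not work for system fields)
  - drop_fields:
      fields: ["@metadata"]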

Hi @dimuskin,

This seems to be a known issue (https://github.com/elastic/beats/issues/6381), but I am afraid that I don't know of any workaround :frowning:

As you are also using Logstash, one thing you could try is to remove the JSON options from Filebeat and do the JSON parsing in Logstash.
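For example, a minimal sketch of such a Logstash filter (assuming the raw JSON line arrives in the "message" field) could be:

filter {
  # Parse the raw Metricbeat JSON line into fields at the root of the event
  json {
    source => "message"
  }
}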

Thanks for the detailed report!

@jsoriano thank you for the fast response. I made a little workaround:

  1. Disabled "json.keys_under_root" on the Filebeat side, which means the JSON is placed under a "json" key in the output document (see the config sketch after the Logstash snippet below).

  2. Added an additional filter to the Logstash pipeline:

         filter {
             # Ruby can't set @timestamp from a plain string, so parse it with
             # the date filter first and remove it from [json].
             date {
                 match => [ "[json][@timestamp]", "ISO8601" ]
                 timezone => "Etc/UTC"
                 remove_field => "[json][@timestamp]"
             }
             # Copy the remaining decoded fields from [json] to the top level.
             ruby {
                 code => '
                     event.get("json").each { |k, v|
                         event.set(k, v)
                     }
                     event.remove("json")
                 '
             }
         }
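For completeness, the Filebeat input for step 1 is roughly the original one with "json.keys_under_root" disabled (just a sketch):

filebeat.inputs:
- type: log
  paths: ["/tmp/metricbeat.json"]
  # JSON decoding stays enabled, but the parsed object is kept under the "json" key
  json.keys_under_root: false
  json.add_error_key: true
output.logstash:
  hosts: ["my.server.com:5555"]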
    

But this is a terrible solution: performance drops drastically, and it requires a Logstash layer between Filebeat and Elasticsearch.

It would be nice to get around this problem :slight_smile:

@dimuskin if you want to remove Logstash from the equation, you can also use ingest pipelines in Elasticsearch. They also include a JSON processor, so you could send the raw logs with Filebeat and do the JSON parsing in Elasticsearch.
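For example, a rough sketch (the pipeline name "metricbeat-json" is arbitrary) would be an ingest pipeline with a json processor:

PUT _ingest/pipeline/metricbeat-json
{
  "description": "Parse Metricbeat events shipped as raw JSON lines by Filebeat",
  "processors": [
    {
      "json": {
        "field": "message",
        "add_to_root": true
      }
    }
  ]
}

and a Filebeat output that points directly at Elasticsearch and references that pipeline:

output.elasticsearch:
  hosts: ["localhost:9200"]   # placeholder host
  pipeline: metricbeat-json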
