How do I output metadata when exporting data from Elasticsearch with Logstash?

I am using the following pipeline in Logstash:

    - pipeline.id: export-process
      pipeline.workers: 4
      config.string: |
        input {
          elasticsearch {
            hosts => "http://localhost:9200"
            ssl => "false"
            index => "metricbeat-*"
            docinfo => true
            query => '{
              "query": {
                "bool": {
                  "filter": {
                    "range": {
                      "@timestamp": {
                        "gte": "now-30m",
                        "lte": "now",
                        "format": "strict_date_optional_time||epoch_millis"
                      }
                    }
                  }
                }
              }
            }'
          }
        }

        output {
          file {
            gzip => "true"
            path => "/usr/share/logstash/export/export_%{[@metadata][_index]}.json.gz"
          }
        }  

This works as expected; however, each line of the output looks like this:

{
  "@version": "1",
  "host": {
    "name": "monitoring-fb-beat-filebeat-2h7t4"
  },
  "agent": {
    "hostname": "monitoring-fb-beat-filebeat-2h7t4",
    "name": "monitoring-fb-beat-filebeat-2h7t4",
    "id": "44f948e3-5eb6-4afa-95b5-53b39bc661df",
    "type": "filebeat",
    "ephemeral_id": "6ad248a8-028d-4914-9a54-65b8e9222ace",
    "version": "7.17.7"
  },
  "log": {
    "offset": 290242,
    "file": {
      "path": "/var/log/containers/monitoring-mb-beat-metricbeat-gchp5_monitoring_metricbeat-b63cb08adec937a56e887472e2a308839203005eb4f8ea974c284100babf9c4f.log"
    }
  },
  "kubernetes": {
    "namespace_labels": {
      "kubernetes_io/metadata_name": "monitoring"
    },
    "namespace": "monitoring",
    "node": {
      "labels": {
        "topology_kubernetes_io/zone": "eu-west-1b",
        "node_kubernetes_io/instance-type": "m6i.12xlarge",
        "beta_kubernetes_io/instance-type": "m6i.12xlarge",
        "k8s_io/cloud-provider-aws": "d40d2d858cc11fa6d9e2ec8bccacd30f",
        "topology_ebs_csi_aws_com/zone": "eu-west-1b",
        "beta_kubernetes_io/os": "linux",
        "kubernetes_io/os": "linux",
        "kubernetes_io/arch": "amd64",
        "beta_kubernetes_io/arch": "amd64"
      },
      "uid": "adb67c03-193f-4430-8bdc-071bcdac0826"
    },
    "daemonset": {
      "name": "monitoring-mb-beat-metricbeat"
    },
    "labels": {
      "controller-revision-hash": "58b65ddf9b",
      "chart": "monitoring-4.3.0",
      "common_k8s_elastic_co/type": "beat",
      "heritage": "Helm",
      "beat_k8s_elastic_co/name": "monitoring-mb",
      "stack-monitoring_elastic_co/type": "beat",
      "app": "monitoring-metricbeat",
      "release": "monitoring",
      "beat_k8s_elastic_co/version": "7.17.7",
      "pod-template-generation": "1"
    },
    "namespace_uid": "6beaefed-9f0e-4367-9dfb-2b2598db6f32",
    "container": {
      "name": "metricbeat"
    },
    "pod": {
      "name": "monitoring-mb-beat-metricbeat-gchp5",
      "uid": "3bc401f3-cf13-4a60-9a66-b25384fdd575"
    }
  },
  "input": {
    "type": "container"
  },
  "container": {
    "runtime": "docker",
    "image": {
      "name": "10.2.1.2:5000/beats/metricbeat:7.17.7"
    },
    "id": "b63cb08adec937a56e887472e2a308839203005eb4f8ea974c284100babf9c4f"
  },
  "@timestamp": "2023-03-13T08:04:17.013Z",
  "message": "2023-03-13T08:04:17.013Z\tWARN\t[transport]\ttransport/tcp.go:52\tDNS lookup failure \"odf-cluster-zookeeper-metrics.odf\": lookup odf-cluster-zookeeper-metrics.odf on 172.20.0.10:53: no such host",
  "stream": "stderr",
  "ecs": {
    "version": "1.12.0"
  }
}

This, however, has no metadata. It doesn't say which index the document came from, and it is missing fields like:

        "_index" : "metricbeat-12-22-23",
        "_type" : "_doc",
        "_id" : "gE6t4IYB7RMHyt2kHhRw",
        "_score" : 1.0,

So when I try to import the same JSON file into another Elasticsearch instance, it doesn't get indexed properly. It just goes in as a blob, with all of the fields bundled under the same object. From reading the documentation, I need to add the following:

            docinfo_target => "[@metadata][doc]"
            add_field => {
              identifier => "%{[@metadata][doc][_index]}:%{[@metadata][doc][_type]}:%{[@metadata][doc][_id]}"
            }

to the input section. However, this just breaks the export, and the output file is corrupted. What is the correct way to do this?
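
One possible direction, offered as a sketch rather than a confirmed fix: Logstash does not write `@metadata` to outputs, so the docinfo values would have to be copied into regular event fields before the `file` output sees them. The fragment below assumes `docinfo_target => "[@metadata][doc]"` is set on the input, and the `es_meta` field name is hypothetical. Once `docinfo_target` is changed, the output path also has to reference `[@metadata][doc][_index]`, because `[@metadata][_index]` is no longer populated.

    filter {
      mutate {
        # Assumption: docinfo_target => "[@metadata][doc]" on the input.
        # "es_meta" is a hypothetical field name chosen for illustration.
        add_field => {
          "[es_meta][index]" => "%{[@metadata][doc][_index]}"
          "[es_meta][id]"    => "%{[@metadata][doc][_id]}"
        }
      }
    }

    output {
      file {
        gzip => "true"
        # The path follows the new docinfo_target location.
        path => "/usr/share/logstash/export/export_%{[@metadata][doc][_index]}.json.gz"
      }
    }

With something like this, each exported line should carry its source index and document id under `es_meta`, which an import pipeline could then use to route the document.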
