I am using the following pipeline in Logstash:

- pipeline.id: export-process
  pipeline.workers: 4
  config.string: |
    input {
      elasticsearch {
        hosts => "http://localhost:9200"
        ssl => "false"
        index => "metricbeat-*"
        docinfo => true
        query => '{
          "query": {
            "bool": {
              "filter": {
                "range": {
                  "@timestamp": {
                    "gte": "now-30m",
                    "lte": "now",
                    "format": "strict_date_optional_time||epoch_millis"
                  }
                }
              }
            }
          }
        }'
      }
    }
    output {
      file {
        gzip => "true"
        path => "/usr/share/logstash/export/export_%{[@metadata][_index]}.json.gz"
      }
    }
This works as expected; however, each line of the output comes out like this:
{
  "@version": "1",
  "host": {
    "name": "monitoring-fb-beat-filebeat-2h7t4"
  },
  "agent": {
    "hostname": "monitoring-fb-beat-filebeat-2h7t4",
    "name": "monitoring-fb-beat-filebeat-2h7t4",
    "id": "44f948e3-5eb6-4afa-95b5-53b39bc661df",
    "type": "filebeat",
    "ephemeral_id": "6ad248a8-028d-4914-9a54-65b8e9222ace",
    "version": "7.17.7"
  },
  "log": {
    "offset": 290242,
    "file": {
      "path": "/var/log/containers/monitoring-mb-beat-metricbeat-gchp5_monitoring_metricbeat-b63cb08adec937a56e887472e2a308839203005eb4f8ea974c284100babf9c4f.log"
    }
  },
  "kubernetes": {
    "namespace_labels": {
      "kubernetes_io/metadata_name": "monitoring"
    },
    "namespace": "monitoring",
    "node": {
      "labels": {
        "topology_kubernetes_io/zone": "eu-west-1b",
        "node_kubernetes_io/instance-type": "m6i.12xlarge",
        "beta_kubernetes_io/instance-type": "m6i.12xlarge",
        "k8s_io/cloud-provider-aws": "d40d2d858cc11fa6d9e2ec8bccacd30f",
        "topology_ebs_csi_aws_com/zone": "eu-west-1b",
        "beta_kubernetes_io/os": "linux",
        "kubernetes_io/os": "linux",
        "kubernetes_io/arch": "amd64",
        "beta_kubernetes_io/arch": "amd64"
      },
      "uid": "adb67c03-193f-4430-8bdc-071bcdac0826"
    },
    "daemonset": {
      "name": "monitoring-mb-beat-metricbeat"
    },
    "labels": {
      "controller-revision-hash": "58b65ddf9b",
      "chart": "monitoring-4.3.0",
      "common_k8s_elastic_co/type": "beat",
      "heritage": "Helm",
      "beat_k8s_elastic_co/name": "monitoring-mb",
      "stack-monitoring_elastic_co/type": "beat",
      "app": "monitoring-metricbeat",
      "release": "monitoring",
      "beat_k8s_elastic_co/version": "7.17.7",
      "pod-template-generation": "1"
    },
    "namespace_uid": "6beaefed-9f0e-4367-9dfb-2b2598db6f32",
    "container": {
      "name": "metricbeat"
    },
    "pod": {
      "name": "monitoring-mb-beat-metricbeat-gchp5",
      "uid": "3bc401f3-cf13-4a60-9a66-b25384fdd575"
    }
  },
  "input": {
    "type": "container"
  },
  "container": {
    "runtime": "docker",
    "image": {
      "name": "10.2.1.2:5000/beats/metricbeat:7.17.7"
    },
    "id": "b63cb08adec937a56e887472e2a308839203005eb4f8ea974c284100babf9c4f"
  },
  "@timestamp": "2023-03-13T08:04:17.013Z",
  "message": "2023-03-13T08:04:17.013Z\tWARN\t[transport]\ttransport/tcp.go:52\tDNS lookup failure \"odf-cluster-zookeeper-metrics.odf\": lookup odf-cluster-zookeeper-metrics.odf on 172.20.0.10:53: no such host",
  "stream": "stderr",
  "ecs": {
    "version": "1.12.0"
  }
}
However, this has no metadata: it doesn't contain which index each document came from, i.e. fields like:
"_index" : "metricbeat-12-22-23",
"_type" : "_doc",
"_id" : "gE6t4IYB7RMHyt2kHhRw",
"_score" : 1.0,
So when I try to import the same JSON file into another Elasticsearch instance, it doesn't get indexed properly; it just goes in as a blob, with a bunch of fields all bundled under the same object. From reading the documentation, I gather I need to add the following to the input section:
docinfo_target => "[@metadata][doc]"
add_field => {
  identifier => "%{[@metadata][doc][_index]}:%{[@metadata][doc][_type]}:%{[@metadata][doc][_id]}"
}
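In context, my understanding is that these options go inside the elasticsearch block, so the input section would look roughly like this (a sketch of what I am trying; the query is the same range query as above):

input {
  elasticsearch {
    hosts => "http://localhost:9200"
    ssl => "false"
    index => "metricbeat-*"
    docinfo => true
    docinfo_target => "[@metadata][doc]"
    add_field => {
      identifier => "%{[@metadata][doc][_index]}:%{[@metadata][doc][_type]}:%{[@metadata][doc][_id]}"
    }
    # same range query as above, omitted here for brevity
    query => '...'
  }
}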
This just breaks the export, though, and the output file is corrupted. What is the correct way to do this?