I have experienced a data loss issue with Filebeat -> Elasticsearch.
When one pod of the Elasticsearch cluster is down, I found that Filebeat had lost log data.
The related configuration is as follows:
Elasticsearch version: 6.1.2
Filebeat version: 6.1.1
filebeat.yml configuration:
filebeat.prospectors:
- type: log
  enabled: true
  paths:
    - "/var/log/containers/${_SYSTEM_POD_NAME}_${_SYSTEM_NAMESPACE}_${_SYSTEM_APP_NAME}-*.log"
  symlinks: true
  scan_frequency: 1s
  fields:
    source: ${_SYSTEM_APP_NAME:default}
    host: ${HOSTNAME:default}
  tags:
    - ${_SYSTEM_NAMESPACE:default}
  json.keys_under_root: true
  json.add_error_key: true
  json.message_key: log
  multiline.pattern: ${_SYSTEM_MULTILINE_PATTERN}
  multiline.negate: true
  multiline.match: after
  fields_under_root: true
  pipeline: ${_SYSTEM_PIPELINE}

output.elasticsearch:
  hosts: ["elasticsearch-logging.dopware:9200"]
  worker: 4
  enabled: true
  timeout: 300
  bulk_max_size: 100
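For context, the files under /var/log/containers are symlinks to Docker json-file logs, so each line that json.message_key: log refers to looks roughly like this (an illustrative entry, not copied from my real logs):

{"log":"2018-06-07 02:51:40 INFO some application message\n","stream":"stdout","time":"2018-06-07T02:51:40.123456789Z"}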
and the Filebeat log is below:
2018/06/07 02:51:40.758011 harvester.go:215: INFO Harvester started for file: /var/log/containers/p1-5c88559c4f-tcsjr_bill-pre_p1-0bb9d1d0bfb2d014c477bb5960c7667e29648d09673f6b5dea3528a229178a45.log
2018/06/07 02:51:41.767124 harvester.go:215: INFO Harvester started for file: /var/log/containers/p1-5c88559c4f-tcsjr_bill-pre_p1-0bb9d1d0bfb2d014c477bb5960c7667e29648d09673f6b5dea3528a229178a45.log
2018/06/07 02:51:42.797308 harvester.go:215: INFO Harvester started for file: /var/log/containers/p1-5c88559c4f-tcsjr_bill-pre_p1-0bb9d1d0bfb2d014c477bb5960c7667e29648d09673f6b5dea3528a229178a45.log
2018/06/07 02:51:44.388487 client.go:276: ERR Failed to perform any bulk index operations: Post http://elasticsearch-logging.dopware:9200/_bulk: EOF
2018/06/07 02:51:44.390459 client.go:276: ERR Failed to perform any bulk index operations: Post http://elasticsearch-logging.dopware:9200/_bulk: EOF
2018/06/07 02:51:45.388701 output.go:92: ERR Failed to publish events: Post http://elasticsearch-logging.dopware:9200/_bulk: EOF
2018/06/07 02:51:45.390669 output.go:92: ERR Failed to publish events: Post http://elasticsearch-logging.dopware:9200/_bulk: EOF
2018/06/07 02:51:47.632400 output.go:74: ERR Failed to connect: 503 Service Unavailable: {
"name" : "elasticsearch-logging-v1-mtgnl",
"cluster_name" : "kubernetes-logging",
"cluster_uuid" : "xI_7pWsuTaCuVuWjjTyurg",
"version" : {
"number" : "6.1.2",
"build_hash" : "5b1fea5",
"build_date" : "2018-01-10T02:35:59.208Z",
"build_snapshot" : false,
"lucene_version" : "7.1.0",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
}
2018/06/07 02:51:47.635750 output.go:74: ERR Failed to connect: 503 Service Unavailable: {
"name" : "elasticsearch-logging-v1-mtgnl",
"cluster_name" : "kubernetes-logging",
"cluster_uuid" : "xI_7pWsuTaCuVuWjjTyurg",
"version" : {
"number" : "6.1.2",
"build_hash" : "5b1fea5",
"build_date" : "2018-01-10T02:35:59.208Z",
"build_snapshot" : false,
"lucene_version" : "7.1.0",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
}
2018/06/07 02:51:51.478984 metrics.go:39: INFO Non-zero metrics in the last 30s: beat.info.uptime.ms=30000 beat.memstats.gc_next=34604992 beat.memstats.memory_alloc=18190248 beat.memstats.memory_total=14279169192 filebeat.events.active=2 filebeat.events.added=17690 filebeat.events.done=17688 filebeat.harvester.open_files=39 filebeat.harvester.running=38 filebeat.harvester.started=3 libbeat.config.module.running=0 libbeat.output.read.bytes=217078 libbeat.output.read.errors=2 libbeat.output.write.bytes=13305478 libbeat.pipeline.clients=1 libbeat.pipeline.events.active=4117 libbeat.pipeline.events.filtered=3 libbeat.pipeline.events.published=17684 libbeat.pipeline.events.retry=500 libbeat.pipeline.events.total=17687 libbeat.pipeline.queue.acked=17684 registrar.states.current=38 registrar.states.update=17688 registrar.writes=88
2018/06/07 02:52:15.649178 output.go:74: ERR Failed to connect: 503 Service Unavailable: {
"name" : "elasticsearch-logging-v1-mtgnl",
"cluster_name" : "kubernetes-logging",
"cluster_uuid" : "xI_7pWsuTaCuVuWjjTyurg",
"version" : {
"number" : "6.1.2",
"build_hash" : "5b1fea5",
"build_date" : "2018-01-10T02:35:59.208Z",
"build_snapshot" : false,
"lucene_version" : "7.1.0",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
}
2018/06/07 02:52:15.650109 client.go:651: INFO Connected to Elasticsearch version 6.1.2
2018/06/07 02:52:15.656559 client.go:651: INFO Connected to Elasticsearch version 6.1.2
2018/06/07 02:52:15.661142 load.go:73: INFO Template already exists and will not be overwritten.
2018/06/07 02:52:15.679937 load.go:73: INFO Template already exists and will not be overwritten.
2018/06/07 02:52:16.110273 harvester.go:215: INFO Harvester started for file: /var/log/containers/p1-5c88559c4f-tcsjr_bill-pre_p1-0bb9d1d0bfb2d014c477bb5960c7667e29648d09673f6b5dea3528a229178a45.log
What is the retry policy of Filebeat? I know it provides at-least-once delivery semantics, but they do not seem to take effect in this case.
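For reference, my understanding is that the retry behaviour is governed by the output and queue settings below. I have not set any of them, so the values shown are what I believe to be the documented defaults (please correct me if this is wrong):

output.elasticsearch:
  # Number of times a failed bulk request is retried before the affected
  # events are dropped (documented default is 3; a value below 0 is supposed
  # to retry indefinitely).
  max_retries: 3

# In-memory queue that buffers events between the harvesters and the output
# while Elasticsearch is unreachable.
queue.mem:
  events: 4096
  flush.min_events: 2048
  flush.timeout: 1s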
Thank you very much!