With the following config file:
name: "filebeat log harvester"
logging.level: debug
filebeat.prospectors:
- input_type: log
paths:
- /var/log/nginx/access_log.json
encoding: utf-8
document_type: access_log
output.logstash:
enabled: true
hosts: ["elastic1:5044"]
worker: 1
pipelining: 10
ssl.enabled: false
ssl.verification_mode: "none"
index: "cdn_log"
IF the 'pipelining' option above is enabled (the value doesn't seem to matter, it can even be 1), I get duplicated events sent to logstash (and from there duplicates sent to elastic and influx). This is what it looks like in the logs (we log directly to JSON from nginx; for comparison, a snippet with pipelining disabled follows the trace):
2017/03/31 13:36:23.334528 client.go:184: DBG Publish: {
  "@timestamp": "2017-03-31T13:36:18.334Z",
  "beat": {
    "hostname": "tr",
    "name": "filebeat log harvester",
    "version": "5.2.2"
  },
  "input_type": "log",
  "message": "{\"@timestamp\":\"2017-03-31T13:36:13+00:00\",\"host\":\"test-headers.cdn.local\",\"remote_addr\":\"108.85.255.46\",\"body_bytes_sent\":\"738\",\"status\":\"206\",\"request_method\":\"GET\",\"request_scheme\":\"http\",\"request_uri\":\"/alpha/beta/gamma/baka.txt\",\"request_protocol\":\"HTTP/1.1\",\"request_time\":\"0.008\",\"request_length\":\"198\",\"http_referrer\":\"-\",\"http_user_agent\":\"curl/7.53.1\",\"tenant_id\":\"1\",\"vhost_id\":\"4\",\"upstream_cache_status\":\"HIT\",\"upstream_addr\":\"-\",\"upstream_connect_time\":\"-\",\"upstream_header_time\":\"-\",\"upstream_response_time\":\"-\",\"upstream_status\":\"-\",\"geoip_country_code\":\"US\",\"access_filter\":\"\"}",
  "offset": 16193,
  "source": "/var/log/nginx/access_log.json",
  "type": "access_log"
}
2017/03/31 13:36:23.334567 output.go:109: DBG output worker: publish 1 events
2017/03/31 13:36:23.334581 context.go:93: DBG forwards msg with attempts=-1
2017/03/31 13:36:23.334637 context.go:98: DBG message forwarded
2017/03/31 13:36:23.334661 context.go:138: DBG events from worker worker queue
2017/03/31 13:36:23.334680 async.go:134: DBG Try to publish 1 events to logstash with window size 10
2017/03/31 13:36:23.335154 client.go:194: DBG handle error: write tcp 172.19.0.17:47042->172.19.0.11:5044: write: connection reset by peer
2017/03/31 13:36:23.335188 async.go:115: DBG 0 events out of 1 events sent to logstash. Continue sending
2017/03/31 13:36:23.335202 async.go:68: DBG close connection
2017/03/31 13:36:23.335214 client.go:110: DBG closing
2017/03/31 13:36:23.335278 async.go:190: ERR Failed to publish events caused by: write tcp 172.19.0.17:47042->172.19.0.11:5044: write: connection reset by peer
2017/03/31 13:36:23.335295 async_worker.go:152: DBG handleResults
2017/03/31 13:36:23.335310 async_worker.go:155: DBG handle publish error: write tcp 172.19.0.17:47042->172.19.0.11:5044: write: connection reset by peer
2017/03/31 13:36:24.335463 async_worker.go:74: DBG close client (done=false)
2017/03/31 13:36:24.335486 async.go:68: DBG close connection
2017/03/31 13:36:24.335495 async.go:63: DBG connect
2017/03/31 13:36:24.336159 context.go:126: DBG events from retries queue
2017/03/31 13:36:24.336194 async.go:134: DBG Try to publish 1 events to logstash with window size 5
2017/03/31 13:36:24.337089 async.go:115: DBG 1 events out of 1 events sent to logstash. Continue sending
2017/03/31 13:36:24.337121 context.go:126: DBG events from retries queue
2017/03/31 13:36:24.337134 async.go:134: DBG Try to publish 1 events to logstash with window size 5
2017/03/31 13:36:24.343035 async_worker.go:152: DBG handleResults
2017/03/31 13:36:24.343111 async_worker.go:196: DBG async bulk publish success
2017/03/31 13:36:24.343163 async.go:115: DBG 1 events out of 1 events sent to logstash. Continue sending
2017/03/31 13:36:24.343203 sync.go:68: DBG Events sent: 2
2017/03/31 13:36:24.343232 registrar.go:275: DBG Processing 2 events
2017/03/31 13:36:24.343244 registrar.go:261: DBG Registrar states cleaned up. Before: 2, After: 2
2017/03/31 13:36:24.343255 registrar.go:298: DBG Write registry file: /var/lib/filebeat/registry
2017/03/31 13:36:24.343500 registrar.go:323: DBG Registry file updated. 2 states written.
2017/03/31 13:36:24.344404 async_worker.go:152: DBG handleResults
2017/03/31 13:36:24.344424 async_worker.go:196: DBG async bulk publish success
2017/03/31 13:36:25.335549 log_file.go:84: DBG End of file reached: /var/log/nginx/access_log.json; Backoff now.
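
For comparison, this is roughly the output section that does not produce duplicates (a minimal sketch; I'm assuming that setting pipelining to 0 is equivalent to leaving the option out, since any value >= 1 triggers the problem):

output.logstash:
  enabled: true
  hosts: ["elastic1:5044"]
  worker: 1
  # pipelining: 10   # removing this line (or, I assume, setting it to 0) makes the duplicates go away
  ssl.enabled: false
  ssl.verification_mode: "none"
  index: "cdn_log"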
Is this expected behaviour? With pipelining disabled everything works fine.
From the trace above it looks like the first attempt fails with a connection reset, the batch gets retried, and the single event ends up delivered twice ("Events sent: 2", and the registrar processes 2 events).
At this point I'm lost as to whether I'm misusing this option, or whether it's perhaps experimental?
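
If it helps to reproduce: a bare-bones Logstash pipeline along these lines should be enough to watch the same event arrive twice on stdout (a hypothetical minimal repro sketch; our real pipeline forwards to elastic and influx, as mentioned above):

input {
  beats {
    # same port filebeat publishes to in the config above
    port => 5044
  }
}
output {
  # print each received event so duplicates are easy to spot
  stdout { codec => rubydebug }
}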