Filebeat - How to always retry publishing events?

maxozerov · June 11, 2019, 1:47pm

Please help.
I have been trying to change the configuration (filebeat.yml), without success. Is it possible to make Filebeat always retry publishing events?

2019-06-11T16:31:43+03:00 DBG  [publish] Publish event: {
  "@timestamp": "2019-06-11T13:31:43.219Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.1.1"
  },
  "message": "2019-06-11 15:29:57 status CHECK MSG UNV-14. Elastic",
  "source": "/var/log/test/test-log.log",
  "offset": 1860973,
  "tags": [
    "service-X",
    "web-tier"
  ],
  "prospector": {
    "type": "log"
  },
  "beat": {
    "name": "f2d54ccd2fdd",
    "hostname": "f2d54ccd2fdd",
    "version": "6.1.1"
  }
}
2019-06-11T16:31:43+03:00 DBG  [harvester] End of file reached: /var/log/test/test-log.log; Backoff now.
2019-06-11T16:31:44+03:00 DBG  [harvester] End of file reached: /var/log/test/test-log.log; Backoff now.
2019-06-11T16:31:44+03:00 DBG  [elasticsearch] PublishEvents: 1 events have been  published to elasticsearch in 3.965201ms.
2019-06-11T16:31:44+03:00 WARN Can not index event (status=403): {"type":"cluster_block_exception","reason":"blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"}
2019-06-11T16:31:44+03:00 DBG  [memqueue] ackloop: receive ack [1: 0, 1]
2019-06-11T16:31:44+03:00 DBG  [memqueue] broker ACK events: count=1, start-seq=2, end-seq=2
2019-06-11T16:31:44+03:00 DBG  [memqueue] ackloop: return ack to broker loop:1
2019-06-11T16:31:44+03:00 DBG  [memqueue] ackloop:  done send ack
2019-06-11T16:31:44+03:00 DBG  [registrar] Processing 1 events
2019-06-11T16:31:44+03:00 DBG  [registrar] Registrar states cleaned up. Before: 1, After: 1
2019-06-11T16:31:44+03:00 DBG  [registrar] Write registry file: /var/lib/filebeat/registry
2019-06-11T16:31:44+03:00 DBG  [registrar] Registry file updated. 1 states written.
2019-06-11T16:31:46+03:00 DBG  [harvester] End of file reached: /var/log/test/test-log.log; Backoff now.

That is to say: once Elasticsearch has returned to its normal state (for example, after setting "index.blocks.read_only_allow_delete": null), how can I force Filebeat to re-send the data?
If status=403, can Filebeat retry again and again and keep trying to send the data?
What are the configuration options? Is it possible to always retry sending the data when Elasticsearch returns any error HTTP response?
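
For reference, the "index.blocks.read_only_allow_delete": null setting mentioned above is applied through the Elasticsearch index settings API. Below is a minimal Python sketch that only builds the request and does not send it. The host is the one from the logs in this thread; the index pattern filebeat-* is an assumption, so adjust it to the indices that are actually blocked. Clearing the block restores indexing for new events; by itself it does not make Filebeat re-send events that were already rejected.

```python
import json
import urllib.request

ES_HOST = "http://172.27.247.89:9200"  # host taken from the logs in this thread
INDEX_PATTERN = "filebeat-*"           # assumption: adjust to the blocked indices


def build_unblock_request(host=ES_HOST, index_pattern=INDEX_PATTERN):
    """Build a PUT <index>/_settings request that resets the read-only block."""
    # A JSON null resets the setting to its default, which removes the block.
    body = json.dumps({"index.blocks.read_only_allow_delete": None}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host}/{index_pattern}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    req = build_unblock_request()
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req, timeout=10)
```

If free disk space is still below the flood-stage watermark, Elasticsearch may set the block again, so the disk space issue needs to be resolved first.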

maxozerov · June 11, 2019, 4:04pm

I tried blocking the port:
iptables -A INPUT -p tcp --destination-port 9200 -j DROP
Then I modified the log file (so the harvester picks up a new line) and re-opened port 9200 (ACCEPT).
That works fine:
the events are published (retried) successfully.

2019-06-11T18:57:29+03:00 DBG  [elasticsearch] ES Ping(url=http://172.27.247.89:9200)
2019-06-11T18:57:29+03:00 DBG  [elasticsearch] Ping status code: 200
2019-06-11T18:57:29+03:00 INFO Connected to Elasticsearch version 6.6.0
2019-06-11T18:57:29+03:00 DBG  [elasticsearch] HEAD http://172.27.247.89:9200/_template/filebeat-6.1.1  <nil>
2019-06-11T18:57:29+03:00 INFO Template already exists and will not be overwritten.
2019-06-11T18:57:30+03:00 DBG  [elasticsearch] PublishEvents: 1 events have been  published to elasticsearch in 264.609789ms.
2019-06-11T18:57:30+03:00 DBG  [memqueue] ackloop: receive ack [0: 0, 1]
2019-06-11T18:57:30+03:00 DBG  [memqueue] broker ACK events: count=1, start-seq=1, end-seq=1

2019-06-11T18:57:30+03:00 DBG  [memqueue] ackloop: return ack to broker loop:1
2019-06-11T18:57:30+03:00 DBG  [memqueue] ackloop:  done send ack
2019-06-11T18:57:30+03:00 DBG  [registrar] Processing 1 events
2019-06-11T18:57:30+03:00 DBG  [registrar] Registrar states cleaned up. Before: 1, After: 1
2019-06-11T18:57:30+03:00 DBG  [registrar] Write registry file: /var/lib/filebeat/registry
2019-06-11T18:57:30+03:00 DBG  [registrar] Registry file updated. 1 states written.
2019-06-11T18:57:30+03:00 DBG  [elasticsearch] PublishEvents: 1 events have been  published to elasticsearch in 232.612751ms.
2019-06-11T18:57:30+03:00 DBG  [memqueue] ackloop: receive ack [1: 0, 1]
2019-06-11T18:57:30+03:00 DBG  [memqueue] broker ACK events: count=1, start-seq=2, end-seq=2

2019-06-11T18:57:30+03:00 DBG  [memqueue] ackloop: return ack to broker loop:1
2019-06-11T18:57:30+03:00 DBG  [memqueue] ackloop:  done send ack
2019-06-11T18:57:30+03:00 DBG  [registrar] Processing 1 events
2019-06-11T18:57:30+03:00 DBG  [registrar] Registrar states cleaned up. Before: 1, After: 1
2019-06-11T18:57:30+03:00 DBG  [registrar] Write registry file: /var/lib/filebeat/registry
2019-06-11T18:57:30+03:00 DBG  [registrar] Registry file updated. 1 states written.
2019-06-11T18:57:31+03:00 DBG  [harvester] End of file reached: /var/log/test/test-log.log; Backoff now.

But it doesn't work (no events are re-sent) if Elasticsearch returns HTTP status 403.

maxozerov · June 13, 2019, 10:50am

Hello everyone, any help or advice?
Or is re-sending supported only for TCP-level failures?
I'll try to check again.

shaunak · June 13, 2019, 10:35pm

Can you post the contents of /var/lib/filebeat/registry at these 3 points in time?

1. Before you start Filebeat.
2. After Filebeat is running and you get the 403 from Elasticsearch.
3. After the 403 from Elasticsearch is resolved.

Thanks,

Shaunak

maxozerov · June 14, 2019, 12:01pm

@shaunak - thanks for the answer.
Steps:

Before you start Filebeat.

Filebeat is running and the log file has not been modified.

2019-06-14T14:38:05+03:00 DBG  [prospector] Start next scan
2019-06-14T14:38:05+03:00 DBG  [prospector] Check file for harvesting: /var/log/test/test-log.log
2019-06-14T14:38:05+03:00 DBG  [prospector] Update existing file for harvesting: /var/log/test/test-log.log, offset: 1861344
2019-06-14T14:38:05+03:00 DBG  [prospector] File didn't change: /var/log/test/test-log.log

registry file

cat /var/lib/filebeat/registry
[{"source":"/var/log/test/test-log.log","offset":1861344,"timestamp":"2019-06-14T14:13:32.524660399+03:00","ttl":10800000000000,"type":"log","FileStateOS":{"inode":3301,"device":53}}]

After Filebeat is running and you get the 403 from Elasticsearch.

Now, get a 403 from Elasticsearch (I just set the disk watermarks to very low values):

echo "2019-06-14 13:46:57 status CHECK MSG UNV-22. Elastic" >> /var/log/test/test-log.log
...
2019-06-14T14:41:42+03:00 DBG  [elasticsearch] PublishEvents: 1 events have been  published to elasticsearch in 49.89973ms.
2019-06-14T14:41:42+03:00 WARN Can not index event (status=403): {"type":"cluster_block_exception","reason":"blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"}
...
2019-06-14T14:41:42+03:00 DBG  [registrar] Write registry file: /var/lib/filebeat/registry
...
2019-06-14T14:41:53+03:00 DBG  [prospector] Update existing file for harvesting: /var/log/test/test-log.log, offset: 1861397
...

registry file

cat /var/lib/filebeat/registry
[{"source":"/var/log/test/test-log.log","offset":1861397,"timestamp":"2019-06-14T14:41:42.296175844+03:00","ttl":10800000000000,"type":"log","FileStateOS":{"inode":3301,"device":53}}]

After the 403 from Elasticsearch is resolved.

Now resolve the 403 from Elasticsearch
and try to send a new event (like: echo "2019-06-14 14:46:57 status CHECK MSG UNV-23. Elastic" >> /var/log/test/test-log.log):

2019-06-14T14:47:27+03:00 DBG  [elasticsearch] PublishEvents: 1 events have been  published to elasticsearch in 62.280662ms.
2019-06-14T14:47:27+03:00 DBG  [memqueue] ackloop: receive ack [4: 0, 1]
2019-06-14T14:47:27+03:00 DBG  [memqueue] broker ACK events: count=1, start-seq=5, end-seq=5
2019-06-14T14:47:27+03:00 DBG  [memqueue] ackloop: return ack to broker loop:1
2019-06-14T14:47:27+03:00 DBG  [memqueue] ackloop:  done send ack
2019-06-14T14:47:27+03:00 DBG  [registrar] Processing 1 events
2019-06-14T14:47:27+03:00 DBG  [registrar] Registrar states cleaned up. Before: 1, After: 1
2019-06-14T14:47:27+03:00 DBG  [registrar] Write registry file: /var/lib/filebeat/registry
...
2019-06-14T14:47:29+03:00 DBG  [prospector] Update existing file for harvesting: /var/log/test/test-log.log, offset: 1861450
...

registry file

cat /var/lib/filebeat/registry
[{"source":"/var/log/test/test-log.log","offset":1861450,"timestamp":"2019-06-14T14:47:27.326214001+03:00","ttl":10800000000000,"type":"log","FileStateOS":{"inode":3301,"device":53}}]

In Kibana

Time                                                message                         
June 14th 2019, 14:47:26.262    2019-06-14 14:46:57 status CHECK MSG UNV-23. Elastic
June 14th 2019, 14:03:27.484    2019-06-14 13:46:57 status CHECK MSG UNV-21. Elastic

As you can see, the line with the event "CHECK MSG UNV-22" is missing.
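
The three registry snapshots above show the same thing in numbers. The small Python sketch below uses the registry contents quoted in this post, trimmed to the source and offset fields. It shows that the offset advanced by 53 bytes right after the 403. That is exactly the length of the UNV-22 line plus one newline byte (assuming echo appended a single "\n"), so Filebeat recorded the rejected line as processed.

```python
import json

# Registry snapshots from the three `cat /var/lib/filebeat/registry` outputs above,
# trimmed to the fields needed here.
SNAPSHOTS = [
    '[{"source":"/var/log/test/test-log.log","offset":1861344}]',  # before the test
    '[{"source":"/var/log/test/test-log.log","offset":1861397}]',  # after the 403
    '[{"source":"/var/log/test/test-log.log","offset":1861450}]',  # after the 403 was resolved
]

LINE_22 = "2019-06-14 13:46:57 status CHECK MSG UNV-22. Elastic"  # rejected with 403
LINE_23 = "2019-06-14 14:46:57 status CHECK MSG UNV-23. Elastic"  # indexed normally


def offsets(snapshots):
    """Extract the offset of the first (only) file state from each snapshot."""
    return [json.loads(s)[0]["offset"] for s in snapshots]


def deltas(values):
    """Differences between consecutive offsets."""
    return [b - a for a, b in zip(values, values[1:])]


if __name__ == "__main__":
    print(deltas(offsets(SNAPSHOTS)))         # [53, 53]
    print(len(LINE_22.encode("utf-8")) + 1)   # 53 (line plus newline)
    print(len(LINE_23.encode("utf-8")) + 1)   # 53
```

Both steps are 53 bytes, so the registry moved past UNV-22 in the same way it moved past UNV-23, even though only UNV-23 reached Elasticsearch.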

maxozerov · June 14, 2019, 1:10pm

Part of the config (filebeat.yml) used in this test:
#-------------------------- Elasticsearch output -------------------------------

output:
    elasticsearch:
        backoff.init: 1s
        backoff.max: 60s
        bulk_max_size: 2048
        compression_level: 3
        enabled: true
        hosts:
        - 172.27.247.89:9200
        max_retries: 100
        ssl.enabled: false
        timeout: 30s
        ttl: 30s
        worker: 1
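
To make the backoff.init and backoff.max values above concrete: the Filebeat documentation describes them as the initial and maximum wait before reconnecting to Elasticsearch after a network error, with the wait growing exponentially in between. The Python sketch below assumes plain doubling with a cap. That is an assumption for illustration; the real implementation may differ in details (for example, by adding jitter). The function name backoff_schedule is hypothetical.

```python
def backoff_schedule(init_s, max_s, attempts):
    """Return the wait (in seconds) before each reconnect attempt.

    Assumption: the wait doubles after every failed attempt and is capped at max_s.
    """
    delays = []
    delay = init_s
    for _ in range(attempts):
        delays.append(delay)
        delay = min(delay * 2, max_s)
    return delays


if __name__ == "__main__":
    # backoff.init: 1s and backoff.max: 60s, as in the config above
    print(backoff_schedule(1, 60, 8))  # [1, 2, 4, 8, 16, 32, 60, 60]
```

These settings concern reconnecting after a connection failure, which matches the iptables test earlier in this thread where the retry worked.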

Version:

filebeat: 6.1.1 (amd64), libbeat 6.1.1
Elastic: 6.6.0, Build: default/tar/a9861f4/2019-01-24T11:27:09.439740Z, JVM: 11.0.1

shaunak · June 15, 2019, 1:09am

Hi, I'm able to reproduce your scenario locally. I also looked at the code for Filebeat (libbeat) where this error is coming from. Basically, a 403 response from Elasticsearch is treated as a "hard" failure, that is, not as a transient or retryable failure, and therefore those events are not retried.

There is no configuration to tell Filebeat to always retry events, no matter the type of failure.
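
The distinction between "hard" and retryable failures can be sketched roughly as below. This is an illustrative Python approximation, not a copy of the libbeat code. The exact thresholds are an assumption (success below 300, retry on 429 and 5xx, drop on other 4xx), and the names classify_bulk_item and split_batch are hypothetical.

```python
def classify_bulk_item(status):
    """Classify one item of a bulk response by its HTTP-style status code.

    Assumed rule: <300 is success, 429 and 5xx are transient (retry),
    any other 4xx is a hard failure (the event is dropped).
    """
    if status < 300:
        return "ok"
    if status == 429 or status >= 500:
        return "retry"
    return "drop"


def split_batch(items):
    """Split (event, status) pairs into events to retry and events to drop."""
    retry = [event for event, status in items if classify_bulk_item(status) == "retry"]
    dropped = [event for event, status in items if classify_bulk_item(status) == "drop"]
    return retry, dropped


if __name__ == "__main__":
    batch = [("UNV-21", 201), ("UNV-22", 403), ("other", 503)]
    retry, dropped = split_batch(batch)
    print(retry)    # ['other']
    print(dropped)  # ['UNV-22']
```

Under this rule, the cluster_block_exception with status=403 from the logs falls into the dropped group, while a refused connection never produces a per-event status at all and is retried at the connection level.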

If you have a GitHub account, please feel free to open an issue about this here: https://github.com/elastic/beats/issues/new/choose. That way you can explain your use case, participate in the discussion, and follow the progress of the issue. If you don't have a GitHub account, let me know and I'll open the issue for you.

Thanks,

Shaunak

maxozerov · June 17, 2019, 1:37pm

Thanks, Shaunak!

This may be useful for those who read this discussion.
Opened issue: https://github.com/elastic/beats/issues/12572

system · July 15, 2019, 1:37pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
