Can Filebeat halt on a failed published event?

Hi,

Is there an option to stop publishing (anything new) when the output fails, for whatever reason?

To give an example, my minimal Filebeat config looks like this:

filebeat.prospectors:
- type: log
  json.keys_under_root: true
  paths:
    - /path/to/file.log
output.elasticsearch:
  hosts: ["127.0.0.1:9200"]
  template.enabled: false
  index: "log-%{[index]}"

My log file is already in JSON format (one JSON object per line).
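
To illustrate, a line in that file looks roughly like this (reconstructed from the fields of the event in the WARN message below, so treat it as an approximation):

{"index": "2018-08-31", "message": "test log for demo", "lines": [{"line": 0, "message": "message1"}, {"line": 1, "message": "message2"}]}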

For various reasons (e.g. field type mismatch, missing mappings, Elasticsearch being down, etc.) the publish action may fail. I don't want to lose that log line. I'd like Filebeat to stop harvesting and publishing and alert me somehow (lock file, email, smoke signals, anything I can monitor). Then I can fix the issue and restart Filebeat.

All I have now is a WARN in the log like this (I know what it is: a missing mapping, which is intentional for this test):

2018-09-25T23:21:05.999+0300	WARN	elasticsearch/client.go:502	Cannot index event publisher.Event{Content:beat.Event{Timestamp:time.Time{wall:0xbee2c40c3a701471, ext:15139988, loc:(*time.Location)(0x52ca960)}, Meta:common.MapStr(nil), Fields:common.MapStr{"beat":common.MapStr{"name":"localhost", "hostname":"localhost", "version":"6.2.4"}, "source":"/path/to/file.log", "offset":127, "index":"2018-08-31", "message":"test log for demo", "lines":[]interface {}{common.MapStr{"line":0, "message":"message1"}, common.MapStr{"line":1, "message":"message2"}}, "prospector":common.MapStr{"type":"log"}}, Private:file.State{Id:"", Finished:false, Fileinfo:(*os.fileStat)(0xc4200bdd40), Source:"/path/to/file.log", Offset:127, Timestamp:time.Time{wall:0xbee2c40c3a60e101, ext:14143782, loc:(*time.Location)(0x52ca960)}, TTL:-1, Type:"log", FileStateOS:file.StateOS{Inode:0x23d3f8, Device:0x100000a}}}, Flags:0x1} (status=400): {"type":"illegal_argument_exception","reason":"object mapping [lines] can't be changed from nested to non-nested"}

Thank you

@tunder Hello, we don't have a mechanism inside Filebeat to notify on specific problems, other than logs and metrics. Depending on the behavior, say Filebeat completely stops sending events to ES, this will affect the metrics that Filebeat collects.

These metrics are sent to an Elasticsearch cluster, and since metrics are just documents in an index, we could possibly configure a Watcher job to check whether something is wrong and send an email.
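
A rough sketch of what such a watch could look like (the monitoring index pattern, the field path for dropped events, and the email address are assumptions; an email account must also be configured on the Elasticsearch side, and a real watch would probably compare deltas rather than a raw counter):

PUT _xpack/watcher/watch/filebeat_dropped_events
{
  "trigger": { "schedule": { "interval": "5m" } },
  "input": {
    "search": {
      "request": {
        "indices": [ ".monitoring-beats-*" ],
        "body": {
          "query": {
            "bool": {
              "filter": [
                { "range": { "timestamp": { "gte": "now-5m" } } },
                { "range": { "beats_stats.metrics.libbeat.output.events.dropped": { "gt": 0 } } }
              ]
            }
          }
        }
      }
    }
  },
  "condition": { "compare": { "ctx.payload.hits.total": { "gt": 0 } } },
  "actions": {
    "notify_admin": {
      "email": {
        "to": "ops@example.com",
        "subject": "Filebeat is dropping events",
        "body": "At least one monitoring document reported dropped output events in the last 5 minutes."
      }
    }
  }
}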

Thank you for taking the time to respond to me.

I'm not keen on stacking layer upon layer to solve a problem.
I'm not using X-Pack, but even if I did, my guess is that the information in the metrics is the same as what I find in Filebeat's logs, and it says nothing about failed publish events.

Here is a snippet from the logs:

2018-09-25T23:34:41.192+0300    WARN    elasticsearch/client.go:502 Cannot index event publisher.Event{Content:beat.Event{Timestamp:time.Time{wall:0xbee2c4d80783261e, ext:60036277031, loc:(*time.Location)(0x52ca960)}, Meta:common.MapStr(nil), Fields:common.MapStr{"prospector":common.MapStr{"type":"log"}, "beat":common.MapStr{"name":"localhost", "hostname":"localhost", "version":"6.2.4"}, "index":"2018-08-31", "source":"/path/to/file.log", "offset":8787, "message":"test log for demo", "lines":[]interface {}{common.MapStr{"message":"message1", "line":0}, common.MapStr{"line":1, "message":"message2"}}}, Private:file.State{Id:"", Finished:false, Fileinfo:(*os.fileStat)(0xc4200b9ad0), Source:"/path/to/file.log", Offset:8787, Timestamp:time.Time{wall:0xbee2c4d807471ad8, ext:60032341994, loc:(*time.Location)(0x52ca960)}, TTL:-1, Type:"log", FileStateOS:file.StateOS{Inode:0x23d438, Device:0x100000a}}}, Flags:0x1} (status=400): {"type":"illegal_argument_exception","reason":"object mapping [lines] can't be changed from nested to non-nested"}
2018-09-25T23:35:10.105+0300    INFO    [monitoring]    log/log.go:124  Non-zero metrics in the last 30s    {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":30,"time":30},"total":{"ticks":64,"time":64,"value":64},"user":{"ticks":34,"time":34}},"info":{"ephemeral_id":"cf4151a8-8f03-4da8-ab2d-e042b14a6b01","uptime":{"ms":90012}},"memstats":{"gc_next":4194304,"memory_alloc":2427488,"memory_total":5861520,"rss":1630208}},"filebeat":{"events":{"added":7,"done":7},"harvester":{"open_files":1,"running":1,"started":1}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"batches":5,"dropped":5,"total":5},"read":{"bytes":2075},"write":{"bytes":11465}},"pipeline":{"clients":1,"events":{"active":0,"filtered":2,"published":5,"retry":1,"total":7},"queue":{"acked":5}}},"registrar":{"states":{"cleanup":1,"current":1,"update":7},"writes":6},"system":{"load":{"1":2.0464,"15":1.5986,"5":1.7344,"norm":{"1":0.2558,"15":0.1998,"5":0.2168}}}}}}

Do you think this could be considered for a feature request?
I would like Filebeat to stop until the problem is solved (in this case, adding the mappings; in other cases I might just remove the log line that is causing problems, but I need to know which line that is).

Best regards

PS: I tried adding the xpack.monitoring config parameters (after installing the plugin) and set it up to send the metrics to another Elasticsearch cluster (URL). Filebeat won't start, complaining that:

Exiting: 'xpack.monitoring.elasticsearch.hosts' and 'output.elasticsearch.hosts' are configured
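
For reference, the monitoring section I added was roughly this (the host is a placeholder):

xpack.monitoring:
  enabled: true
  elasticsearch:
    hosts: ["http://monitoring-host:9200"]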

I am not sure if we would support stopping completely. I am sure we are tracking the error rate; I will need to double-check whether we expose it in the X-Pack UI, but this metric could be something to monitor. A higher error rate would mean that we need to investigate.

How about allowing metrics to be sent somewhere other than Elasticsearch?

My guess is that it does a web request (curl) to send the metrics to Elasticsearch. See my previous PS: allow the xpack.monitoring output to be a custom URL (maybe a web service endpoint) where we can read the metrics and run our own logic (in my case, send email alerts) in case of failed publish events.

I see this info in the metrics:

"output":{"events":{"batches":5,"dropped":5,"total":5}

Does this mean that out of 5 batches, all 5 failed?

Thank you
