We are using functionbeat to forward CloudWatch logs and some custom events via SQS and we are trying to find the best way to make sure events are not lost.
We identified different types of errors that can be handled differently:
- failures during ingest pipeline processing - we can use the pipeline-level on_failure handler and forward failed events to a dedicated index
- functionbeat can’t connect to ES - after 3 retries the message can be forwarded to the dead letter queue
- failures at index time - e.g. missing permissions, mapping exception - not handled by the ingest pipeline on_failure handler and also not forwarded to the dead letter queue. Events are just dropped; the only evidence is in the functionbeat logs
The last case (index-time failures) is the tricky one. Why are these failures simply dropped instead of being forwarded to the dead letter queue? What is the recommended way of handling them?
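
For context, here is a minimal sketch of what we mean by the pipeline-level on_failure handling in the first case. It only builds the pipeline body as a Python dict and prints it as JSON. The `failed-` index prefix, the `error.message` field, the `build_pipeline` helper, and the example `json` processor are illustrative assumptions, not our actual configuration.

```python
import json


def build_pipeline(processors, failed_prefix="failed-"):
    """Build an ingest pipeline body with a pipeline-level on_failure handler.

    The helper name and the failed_prefix default are hypothetical.
    """
    return {
        "description": "Example pipeline that reroutes failed events",
        "processors": processors,
        # Runs only when a processor above fails and has no handler of its own.
        "on_failure": [
            # Keep the failure reason on the document for later inspection.
            {"set": {"field": "error.message",
                     "value": "{{ _ingest.on_failure_message }}"}},
            # Send the failed event to a dedicated index instead of dropping it.
            {"set": {"field": "_index",
                     "value": failed_prefix + "{{ _index }}"}},
        ],
    }


# Example processor (an assumption): parse the raw message as JSON.
pipeline = build_pipeline(
    [{"json": {"field": "message", "target_field": "parsed"}}]
)
print(json.dumps(pipeline, indent=2))
```

The printed body is what would be sent to the ingest pipeline API. As described above, this only covers errors raised inside the pipeline; it does not catch failures that happen afterwards at index time.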