We use centralized pipeline management (CPM) in Kibana to update and upload Logstash pipelines to our Logstash server. In general everything works fine, but when an output is unable to process an event due to a wrong configuration, Logstash keeps retrying the event. That by itself is not an issue; however, when we then update the pipeline (through CPM) with a patched version, the reload gets stuck on Converge PipelineAction::Reload because there are still in-flight messages waiting to be processed. Those messages can't be processed until the pipeline is patched, and the pipeline can't be patched until they are processed, so it seems to create a circular wait. Is there a solution for this?
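For context, the failing output looks roughly like this (the hosts and index are placeholders, not our real values); with a wrong endpoint or credentials every bulk request fails and Logstash keeps retrying the in-flight events:

```
output {
  elasticsearch {
    # placeholder values - a misconfigured host/credentials here means every
    # bulk request fails and the events are retried indefinitely
    hosts => ["https://example-es-host:9200"]
    index => "my-index-%{+YYYY.MM.dd}"
  }
}
```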
Not sure if it will fix it, but you could try setting pipeline.unsafe_shutdown, which allows data loss at shutdown.
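Something like this in logstash.yml on the Logstash node (it can also be passed as the --pipeline.unsafe_shutdown command-line flag):

```
# Allow Logstash to force a shutdown/reload even if there are
# in-flight events, at the cost of possibly losing those events.
pipeline.unsafe_shutdown: true
```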
@Badger I suppose that would make restarting Logstash easier, but in my opinion it goes against the purpose of centralized pipeline management. I'd rather not touch the external server; ideally all management goes through CPM. I would also prefer to avoid any data loss.
Something else that was odd about the situation: we also have a DLQ configured, but none of the failed messages ended up in it, not even after retrying for about an hour.
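For reference, the DLQ is enabled with roughly these settings in logstash.yml (the path is a placeholder for our actual location):

```
# Dead letter queue settings on the Logstash node
dead_letter_queue.enable: true
path.dead_letter_queue: /var/lib/logstash/dead_letter_queue
```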