How to avoid uploading duplicate docs/events to remote Elasticsearch after a connection issue

Hi all,
I'm using Logstash to upload events from a local Elasticsearch to a remote Elasticsearch.
I noticed that if the remote Elasticsearch is disconnected for some reason, the error looks like this:

"Attempted to send a bulk request to elasticsearch, but no there are no living connections in the connection pool. Perhaps Elasticsearch is unreachable or down? {:error_message=>"No Available connections"...

And after: "Restored connection to ES instance"

the same event was sent 5 times to the remote Elasticsearch. I should mention that in the output section I update the field "uploaded" to true in order to avoid multiple uploads, but in this case it looks like the failed sends were buffered several times and re-sent once the connection was re-established.

Any ideas?

Thanks in advance.

The Elastic Stack is set up to ensure at-least-once delivery, which means that duplicates can occur.

If you want to avoid this, you will want to define your own _id for each event, so that duplicates update the existing document instead of adding new ones.
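Since the events are being read from Elasticsearch in the first place, one option is to reuse the source document's own _id. A minimal sketch, assuming the elasticsearch input plugin with docinfo => true (which, in the plugin versions this thread refers to, copies the source _index, _type and _id into [@metadata]); the hosts and index name here are placeholders:

  input {
    elasticsearch {
      hosts   => ["localhost:9200"]   # local cluster (assumed address)
      index   => "my-index"           # hypothetical index name
      docinfo => true                 # copy _index/_type/_id into [@metadata]
    }
  }

  output {
    elasticsearch {
      hosts       => ["remote-es:9200"]         # hypothetical remote address
      index       => "%{[@metadata][_index]}"   # keep the original index name
      document_id => "%{[@metadata][_id]}"      # reuse the original _id
    }
  }

With document_id set, a retried bulk request overwrites the same document instead of creating a new one, so repeated sends collapse into a single document.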


Thanks @warkolm

Indeed, adding

document_id => "%{[@metadata][_id]}"

to the elasticsearch output section of logstash.conf solved the issue.
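As a side note for anyone who needs to verify that [@metadata][_id] is actually populated (metadata is not part of the event body, so it never shows up in the remote index), a rubydebug stdout output can print it during testing; a sketch:

  output {
    stdout { codec => rubydebug { metadata => true } }   # print events including @metadata
  }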

Thanks.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.