Setting a long custom _id value when inserting into Elasticsearch

I'm reading data with Filebeat, sending it to Logstash, and it ends up in Elasticsearch. One issue we have is that Filebeat sometimes re-reads files, leading to duplicate entries in Elasticsearch.

One idea was to create a custom _id value in Logstash based on the filename + offset. I would assume the filename is rarely greater than 32 characters, and the files we process are generally 500MB or less, so the offset adds up to another 9 digits. In most cases the _id value would therefore be 30-40 characters. (Perhaps we could hash this to make it smaller.)
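A minimal sketch of this approach using the Logstash fingerprint filter, which can hash the filename + offset into a fixed-length _id. Note the field names (`[log][file][path]`, `[log][offset]`) are assumptions based on newer Filebeat versions; older versions expose them as `source` and `offset`, so adjust to whatever your events actually contain:

```
filter {
  fingerprint {
    # Hash the combination of file path and byte offset into a
    # fixed-length ID, so re-reads of the same line produce the
    # same document and overwrite rather than duplicate.
    source => ["[log][file][path]", "[log][offset]"]
    concatenate_sources => true
    method => "SHA1"
    target => "[@metadata][doc_id]"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    document_id => "%{[@metadata][doc_id]}"
  }
}
```

Storing the fingerprint under `[@metadata]` keeps it out of the indexed document itself, and a SHA1 fingerprint is a fixed 40 hex characters regardless of path length, which sidesteps the variable-length _id concern.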

Does anyone know of any performance issues with this, or has anyone done something similar?
