Documents with duplicate _id in an index split by time?

mykael · September 11, 2020, 3:58am

Might just be me misunderstanding how ids and indexing work here.

I've got a stream of records that I write to an index, my_data.

Now, to keep it manageable, because there are lots of records, I'm telling Logstash to split the index into segments by time:

elasticsearch {
  hosts => ["localhost"]
  document_id => "%{id}"
  index => "my_data-%{+YYYY.MM}" 
}

I then define an index_pattern of my_data-* to drive Kibana.

I've found a case where I initially got the timestamp on a record wrong. When I re-indexed it with the correct timestamp, it did not replace the old version with the same ID (the timestamp isn't in the id). Instead, it created a second copy in the index segment that corresponds to the new timestamp.

Is there anything I have to do to get the id to be unique across all of the index segments? I really need it to be unique within the index pattern...

Christian_Dahlqvist · September 11, 2020, 5:05am

Document ids are unique within a single index only. You cannot make them unique across multiple indices linked by an index pattern.
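
To make the behaviour above concrete, here is a minimal toy model in Python. It is not Elasticsearch or Logstash code: `ToyCluster`, `index_name`, and the record id `rec-1` are hypothetical names used only for illustration. It assumes the index name is derived from the event timestamp, as in the `my_data-%{+YYYY.MM}` setting from the question, and treats each index as a separate key space.

```python
from datetime import datetime, timezone


def index_name(timestamp: datetime, prefix: str = "my_data") -> str:
    """Mimic index => "my_data-%{+YYYY.MM}": the name depends on the event timestamp."""
    return f"{prefix}-{timestamp.astimezone(timezone.utc):%Y.%m}"


class ToyCluster:
    """Toy model: every index is its own dict, so an id is unique per index only."""

    def __init__(self):
        self.indices = {}

    def index_doc(self, index: str, doc_id: str, source: dict) -> None:
        # Writing the same id to the same index overwrites the old document.
        self.indices.setdefault(index, {})[doc_id] = source

    def search(self, prefix: str):
        """Return (index, id) pairs from every index whose name starts with prefix,
        roughly what an index pattern such as my_data-* would cover."""
        return sorted(
            (name, doc_id)
            for name, docs in self.indices.items()
            if name.startswith(prefix)
            for doc_id in docs
        )


cluster = ToyCluster()
wrong = datetime(2020, 8, 15, tzinfo=timezone.utc)   # initial, incorrect timestamp
right = datetime(2020, 9, 10, tzinfo=timezone.utc)   # corrected timestamp

cluster.index_doc(index_name(wrong), "rec-1", {"timestamp": wrong.isoformat()})
cluster.index_doc(index_name(right), "rec-1", {"timestamp": right.isoformat()})

hits = cluster.search("my_data-")
print(hits)  # [('my_data-2020.08', 'rec-1'), ('my_data-2020.09', 'rec-1')]
```

Both copies show up because the corrected record was routed to a different index, where `rec-1` did not exist yet. In this model, a re-index only replaces the old version when the index name comes out the same for both writes. Otherwise the stale copy has to be removed from the old index separately.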
