Reason for creating a custom document_id?

_finack · February 7, 2020, 2:54pm

I've seen various Logstash filter plugins customizing the document_id field in pipeline filters (e.g. the HELK project). I'm trying to make sense of the reason for doing this. Doesn't a unique document_id get applied automatically?

I'm thinking of the document_id as akin to a unique key in an RDBMS table, just a way to identify an individual record. Perhaps I just haven't discovered what the document_id is useful for beyond allowing the system to refer to a particular document.

New Elastic Stack user here. I apologize for such a beginner question.

Badger · February 7, 2020, 3:46pm

If you ingest the same set of documents more than once (because they have been updated), you want to overwrite the target document, so it has to have the same document_id. If you do not set it, a new unique id will be generated, resulting in a second copy of the document.
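
To make the overwrite-versus-duplicate behaviour concrete, here is a minimal Python sketch. It is a toy in-memory model, not Elasticsearch or Logstash code: the "index" is a plain dict keyed by ID, and the helper names (`auto_id`, `custom_id`, `ingest`) and the choice of `host` as the identifying field are assumptions made purely for illustration.

```python
import hashlib
import uuid


def auto_id(_doc):
    # Stand-in for an automatically generated ID: different on every call.
    return uuid.uuid4().hex


def custom_id(doc):
    # Deterministic ID derived from a field that identifies the record.
    # Using "host" as the natural key is an assumption for this example.
    return hashlib.sha256(doc["host"].encode("utf-8")).hexdigest()


def ingest(index, docs, make_id):
    # Writing to an ID that already exists replaces the stored document.
    for doc in docs:
        index[make_id(doc)] = doc


first_run = [{"host": "web-1", "status": "up"}, {"host": "web-2", "status": "up"}]
second_run = [{"host": "web-1", "status": "down"}, {"host": "web-2", "status": "up"}]

auto_index, custom_index = {}, {}
for batch in (first_run, second_run):
    ingest(auto_index, batch, auto_id)
    ingest(custom_index, batch, custom_id)

print(len(auto_index))    # 4: every re-ingested document became a new copy
print(len(custom_index))  # 2: the second run overwrote the first
```

In a real pipeline the same idea is usually expressed by setting the `document_id` option of the Logstash elasticsearch output to a value computed from the event, rather than by hashing in application code.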

system · March 6, 2020, 3:46pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
