For deduplication purposes, we would like to provide our own ID when inserting/updating documents.
We do not need document versioning. The sender always does "UPSERTS" (i.e. it does not know whether a document has already been indexed).
The question has several parts:
- Is there a performance impact when providing an external ID (because ES will not know whether it is a create or an update)?
- Would providing a target version (always the "first version") of the document help (because ES would not have to work out a new version to store the document under)?
- Are there any recommendations on the form of the external ID (maximum size, sharding algorithm, ...) that affect performance or storage space?
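
To make the scenario concrete, here is a minimal sketch of the upsert pattern described above. It assumes the external ID is a SHA-1 hash of the document content; that choice, the helper names `make_doc_id` and `build_bulk_body`, and the index name `events` are all hypothetical. It only builds a request body for the `_bulk` endpoint using `index` actions (which create the document, or replace an existing one with the same `_id`); it does not send anything to a cluster.

```python
import hashlib
import json


def make_doc_id(doc: dict) -> str:
    """Derive a deterministic ID from the document content (hypothetical scheme)."""
    # Sorting keys makes the serialization independent of key order.
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def build_bulk_body(index_name: str, docs: list) -> str:
    """Build an NDJSON body for the _bulk endpoint using "index" actions."""
    lines = []
    for doc in docs:
        # The sender does not know whether the document already exists,
        # so it always sends the same action with its own _id.
        action = {"index": {"_index": index_name, "_id": make_doc_id(doc)}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk format requires a trailing newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    docs = [{"user": "alice", "event": "login"}, {"user": "alice", "event": "login"}]
    print(build_bulk_body("events", docs))
```

With this scheme, sending the same document twice yields the same `_id`, so the second request overwrites the first instead of creating a duplicate.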