How do I create the perfect _id on my own?


I must create my own _id field.
This id is basically a combination of 5 other fields of the document, for example:

       "a": 100,
       "b": "CUSTOM_TYPE",
       "c": 300,
       "d": "bla",
       "e": "foo",
       "value": "42"

So my key is basically "100_CUSTOM_TYPE_300_bla_foo", which is a very bad, non-sortable and non-compressible key, but it is my key. I thought about just taking an MD5 hash of it and using the result, but I am not sure whether that is the optimal solution.
I looked at how Elasticsearch implemented the id generation, which is based on flake ids, which are time-based, but this is not my case.
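To make the two options concrete, here is a minimal Python sketch of the concatenated key and its MD5 variant. The helper names `composite_id` and `hashed_id` and the `KEY_FIELDS` tuple are made up for illustration; only the field names and values come from the example above.

```python
import hashlib

# Hypothetical: the five fields that together identify a document.
KEY_FIELDS = ("a", "b", "c", "d", "e")


def composite_id(doc, sep="_"):
    """Join the key fields into a readable id.

    Caveat: if a field value can itself contain the separator,
    two different field combinations could produce the same string.
    """
    return sep.join(str(doc[field]) for field in KEY_FIELDS)


def hashed_id(doc):
    """Fixed-length alternative: the MD5 hex digest of the readable id."""
    return hashlib.md5(composite_id(doc).encode("utf-8")).hexdigest()


doc = {"a": 100, "b": "CUSTOM_TYPE", "c": 300, "d": "bla", "e": "foo", "value": "42"}
print(composite_id(doc))  # 100_CUSTOM_TYPE_300_bla_foo
print(hashed_id(doc))     # 32 hexadecimal characters
```

Both ids are deterministic, so the same five field values always map to the same id. The hashed form has a fixed length but is no longer human-readable.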

Any idea?

Can you clarify why?

Sure. I need to be able to index the same document again, which essentially means updating the existing one.
I load my data from S3 and index it into Elasticsearch, and those ids are already referenced by other entities in the system, so the document in S3 must already contain its id before it is indexed.
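As a rough sketch of that flow, the snippet below builds a request body for the Elasticsearch bulk API with an explicit `_id` per document, so indexing the same document again replaces the existing one instead of creating a new one. The function name `bulk_index_lines` and the index name `my-index` are assumptions for illustration, and actually sending the body to the cluster is left out.

```python
import json


def bulk_index_lines(docs, index_name, id_fields=("a", "b", "c", "d", "e")):
    """Build a newline-delimited bulk body with an explicit _id per document."""
    lines = []
    for doc in docs:
        # Same deterministic id every time the document is loaded.
        doc_id = "_".join(str(doc[field]) for field in id_fields)
        # Action line: an "index" action with an explicit _id overwrites
        # any existing document that has the same id.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


doc = {"a": 100, "b": "CUSTOM_TYPE", "c": 300, "d": "bla", "e": "foo", "value": "42"}
body = bulk_index_lines([doc], "my-index")  # "my-index" is a placeholder name
print(body)
```

Because the id is derived only from the document's own fields, it can be computed once and stored with the document in S3.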

OK then, so what's the problem with your existing id approach? You don't really want to sort on it, and compressibility would be less than ideal for the default auto-generated id as well.

I'd just keep what you have.

Great thanks!
