All,
Below are a couple of areas where I need more clarification. Any help is appreciated.
- We index data from multiple threads in parallel, and different threads may try to index data with the same id. This happens because we receive data for different contexts that are managed outside the system. The contents of these contexts overlap and arrive as streams.
Question: When I index data with the same id in parallel on a cluster, will there be any exceptions? I would expect none, since it is just a re-index.
NOTE:
No version is passed explicitly from outside.
Parallel threads always index the whole document and never update a specific attribute on its own, though two threads may index the same document with different values for an attribute. The latter case is very rare.
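To make the expectation behind this question concrete, here is a toy model of the behavior I am hoping for. It is an in-memory stand-in, not Elasticsearch: `ToyIndex` and `run_demo` are hypothetical names, and the model assumes that writes to the same id are serialized and that a write without an explicit version simply overwrites the document and bumps an internal version.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyIndex:
    """Hypothetical in-memory stand-in for an index (NOT Elasticsearch)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._docs = {}  # _id -> (version, source)

    def index(self, doc_id, source):
        # Assumption: no explicit version is passed, so the write
        # overwrites any existing document and increments the version.
        with self._lock:
            current_version = self._docs.get(doc_id, (0, None))[0]
            new_version = current_version + 1
            self._docs[doc_id] = (new_version, source)
            return new_version

    def get(self, doc_id):
        with self._lock:
            return self._docs[doc_id]


def run_demo(writers=8):
    """Index the same id from several threads and collect the versions."""
    idx = ToyIndex()
    doc_id = "736b8362d3e8c953d3107219b76a7059"
    with ThreadPoolExecutor(max_workers=writers) as pool:
        versions = list(
            pool.map(lambda n: idx.index(doc_id, {"context": n}), range(writers))
        )
    return idx, doc_id, versions


if __name__ == "__main__":
    idx, doc_id, versions = run_demo()
    print(sorted(versions), idx.get(doc_id))
```

In this model every writer succeeds and the last write wins. Whether a real cluster behaves the same way is exactly what I would like confirmed.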
- The _id that uniquely identifies a document can take any of the formats below. Sometimes it is a GUID, sometimes it comes from a different algorithm:
000015DCD378AEB62D6577008F74CE0D0D00000000000000
00002170CC0B770937CEDE89174373F03500000000000000
4420c7d30-56b765c7107-384142
46abefc10-46abefc117-72157665504465145
736b8362d3e8c953d3107219b76a7059
Question: Are there any restrictions on the _id field, or any performance-related settings that should be specified for the _id field at the mapping level?
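For reference, this is a quick way to check the byte size of the sample ids. The 512-byte limit used here is an assumption taken from the documentation of recent Elasticsearch versions and may not apply to the version we run. `MAX_ID_BYTES`, `id_size_bytes` and `oversized_ids` are hypothetical helper names.

```python
# Assumed limit: recent Elasticsearch docs state _id is limited to 512 bytes.
# Treat this as an assumption and verify it for the version in use.
MAX_ID_BYTES = 512

# Sample ids copied from the list above.
SAMPLE_IDS = [
    "000015DCD378AEB62D6577008F74CE0D0D00000000000000",
    "00002170CC0B770937CEDE89174373F03500000000000000",
    "4420c7d30-56b765c7107-384142",
    "46abefc10-46abefc117-72157665504465145",
    "736b8362d3e8c953d3107219b76a7059",
]


def id_size_bytes(doc_id: str) -> int:
    """Return the size of an id in bytes when encoded as UTF-8."""
    return len(doc_id.encode("utf-8"))


def oversized_ids(ids, limit=MAX_ID_BYTES):
    """Return the ids whose UTF-8 size exceeds the assumed limit."""
    return [doc_id for doc_id in ids if id_size_bytes(doc_id) > limit]


if __name__ == "__main__":
    for doc_id in SAMPLE_IDS:
        print(doc_id, id_size_bytes(doc_id))
    print("oversized:", oversized_ids(SAMPLE_IDS))
```

All of our ids are short ASCII strings, so size alone should not be a problem under that assumption. My question is more about character restrictions and indexing performance.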
Thanks.