In my team, we've been enthusiastically using the new Rollup API. It provides the functionality that we were waiting for!
However, once we started using it, we noticed that the autogenerated _id for our new rollup index is a random unsigned 32-bit integer. This differs from the autogenerated ID strategy described on this page.
We're worried that the new Rollup index that is generated would run into the Birthday Paradox problem, and thus possibly overwrite documents.
Could someone please explain the autogenerated index strategy of Rollup indexes?
Hello Domenico,
The _id is currently a 32-bit CRC.
It was not possible to use the standard ES GUID because these IDs must be deterministic, but this can be improved.
An issue has been opened to track this.
Remember the Rollup API is still experimental.
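To illustrate what "deterministic" means here, below is a minimal Python sketch. It is not the actual Rollup implementation: the key fields, the `|` separator, and the `deterministic_id` helper are hypothetical, chosen only to show that a CRC of the same input always yields the same 32-bit ID, while a random GUID changes on every call.

```python
import uuid
import zlib


def deterministic_id(key_parts):
    """Hypothetical helper: derive a 32-bit ID from a bucket's key fields.

    The choice of fields and the '|' separator are assumptions made for
    this illustration, not what Rollup actually hashes.
    """
    payload = "|".join(str(part) for part in key_parts).encode("utf-8")
    return zlib.crc32(payload)  # unsigned 32-bit integer in Python 3


# Hypothetical rollup bucket: (job id, time bucket, term value)
bucket = ("my_rollup_job", "2018-07-01T00:00:00Z", "host-a")

# Same input, same ID: reprocessing the bucket targets the same document.
first = deterministic_id(bucket)
second = deterministic_id(bucket)

# A random GUID differs on every call, so a retry would create a duplicate.
random_a = uuid.uuid4()
random_b = uuid.uuid4()
```

The trade-off is that a 32-bit output space is small, so two different buckets can end up with the same ID.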
Thank you @Luca_Belluccini for pushing to get this issue tracked openly.
In my team we have indices that, when rolled up, can exceed the 200k barrier. As such, Rollup is not usable for us in its current state until we know for sure that it won't generate any collisions.
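For a rough sense of the risk, here is a small Python sketch of the standard birthday-problem approximation. It assumes the 32-bit IDs are uniformly distributed and that all rolled-up documents share one ID space. Both are my assumptions, and the function names are only for illustration.

```python
import math

ID_SPACE = 2**32  # number of distinct 32-bit IDs


def collision_probability(n_docs, id_space=ID_SPACE):
    """Approximate chance that at least two of n_docs share an ID.

    Uses p ~= 1 - exp(-n(n-1) / (2N)), assuming uniformly distributed IDs.
    """
    exponent = n_docs * (n_docs - 1) / (2 * id_space)
    return -math.expm1(-exponent)  # numerically stable 1 - exp(-x)


def docs_for_probability(p, id_space=ID_SPACE):
    """Approximate document count at which the collision chance reaches p."""
    return math.sqrt(2 * id_space * math.log(1 / (1 - p)))


if __name__ == "__main__":
    for n in (10_000, 50_000, 200_000):
        print(f"{n:>7} docs: p(collision) ~ {collision_probability(n):.4f}")
    print(f"50% collision chance at ~{docs_for_probability(0.5):,.0f} docs")
```

Under these assumptions, the chance of at least one collision at 200k documents is above 99%, and it already reaches 50% in the high tens of thousands of documents.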