I'm confused about when global ordinals are rebuilt.
According to Elasticsearch: The Definitive Guide, they are rebuilt when a new aggregation query arrives after a refresh, delete, or merge. But global ordinals are used by doc values, and doc values are built during indexing, so how does this work?
Doc values are indeed created at index time, and they maintain segment-level ordinals. But the set of global ordinals across all the segments is, by default, not built until an aggregation needs it. The global ordinals are only known after finding all the unique values across all the segments' doc values and compiling the total set of ordinals, which is fairly expensive. That's why it is deferred until needed.
If new segments are created during indexing, the global ordinals often have to be rebuilt, because new terms invalidate the old ordinals.
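To make the idea concrete, here is a rough sketch in Python. It is only a toy model, not how Lucene or Elasticsearch actually implement it: the names `build_global_ordinals` and `LazyGlobalOrdinals` are made up for illustration, and each segment is assumed to be just a sorted list of its unique terms, with a term's segment ordinal being its position in that list.

```python
from typing import List, Optional, Tuple

GlobalOrdinals = Tuple[List[str], List[List[int]]]


def build_global_ordinals(segments: List[List[str]]) -> GlobalOrdinals:
    """Merge per-segment term lists into one global ordinal space.

    Returns the sorted global term list and, for each segment, a list that
    maps segment ordinal -> global ordinal.
    """
    # The expensive part: every unique term in every segment must be visited.
    global_terms = sorted({term for seg in segments for term in seg})
    term_to_global = {term: i for i, term in enumerate(global_terms)}
    mappings = [[term_to_global[term] for term in seg] for seg in segments]
    return global_terms, mappings


class LazyGlobalOrdinals:
    """Hypothetical holder that builds global ordinals only on demand."""

    def __init__(self) -> None:
        self.segments: List[List[str]] = []
        self._cache: Optional[GlobalOrdinals] = None
        self.builds = 0  # counts how often the expensive build ran

    def add_segment(self, terms: List[str]) -> None:
        # Index time: only the segment-level ordinals are written.
        self.segments.append(sorted(set(terms)))
        # A new segment may bring new terms, so the old mapping is dropped.
        self._cache = None

    def get(self) -> GlobalOrdinals:
        # Aggregation time: build if missing, otherwise reuse the cache.
        if self._cache is None:
            self._cache = build_global_ordinals(self.segments)
            self.builds += 1
        return self._cache


index = LazyGlobalOrdinals()
index.add_segment(["apple", "cherry"])
index.add_segment(["banana", "cherry"])
terms, mappings = index.get()   # first aggregation triggers the build
index.get()                     # second aggregation reuses the cached result
index.add_segment(["avocado"])  # new segment invalidates the cache
terms_after, mappings_after = index.get()  # rebuilt on the next aggregation
```

In this toy model, "cherry" has global ordinal 2 before the third segment arrives and 3 afterwards, which is why the old mapping cannot simply be kept around once a new term shows up.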