Why does Elasticsearch need to reindex the entire document when the mapping for one field changes?
It sounds to me that whenever there is a change in the mapping (no matter whether it affects one field or several), ES has to reindex the entire dataset, i.e. all documents and all fields in the index. If I change the mapping of one field, why not reindex just that one field? Given that ES uses an inverted index data structure, it looks to me like that should be straightforward to do. Am I totally wrong?
That happens because the inverted indices for all fields are stored together with the document sources in immutable segments. Once a segment is written, it is never modified in place, and this immutability is used in many places to our advantage. While the ability to change a field without re-indexing the entire document would have been beneficial, it would also have complicated a lot of things and slowed things down.
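To make the point about immutable segments more concrete, here is a minimal toy sketch in Python. It is not how Lucene actually lays out data on disk. The names `Segment`, `build_segment`, and `reindex` are hypothetical and only illustrate the idea: a segment bundles the stored sources with the per-field postings, nothing in it can be edited afterwards, so applying a new analyzer to one field means writing a new segment from the stored documents.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Callable, Dict, List, Mapping, Sequence, Tuple

Analyzer = Callable[[str], List[str]]


def whitespace(text: str) -> List[str]:
    """Split on whitespace, keep case (stands in for the 'old' mapping)."""
    return text.split()


def lowercase(text: str) -> List[str]:
    """Lowercase, then split (stands in for the 'new' mapping)."""
    return text.lower().split()


@dataclass(frozen=True)
class Segment:
    """Toy immutable segment: stored sources plus per-field inverted indices."""
    sources: Tuple[Mapping[str, str], ...]
    postings: Mapping[str, Mapping[str, Tuple[int, ...]]]


def build_segment(docs: Sequence[Mapping[str, str]],
                  analyzers: Mapping[str, Analyzer]) -> Segment:
    """Analyze every field of every document and freeze the result."""
    postings: Dict[str, Dict[str, List[int]]] = {}
    for doc_id, doc in enumerate(docs):
        for field, text in doc.items():
            for term in analyzers[field](text):
                ids = postings.setdefault(field, {}).setdefault(term, [])
                if doc_id not in ids:
                    ids.append(doc_id)
    # Wrap everything in read-only views so nothing can be changed in place.
    frozen = MappingProxyType({
        field: MappingProxyType({t: tuple(ids) for t, ids in terms.items()})
        for field, terms in postings.items()
    })
    return Segment(tuple(MappingProxyType(dict(d)) for d in docs), frozen)


def reindex(segment: Segment, analyzers: Mapping[str, Analyzer]) -> Segment:
    """A mapping change cannot patch the old segment; it writes a new one."""
    return build_segment(segment.sources, analyzers)


docs = [
    {"title": "Quick Fox", "body": "Lazy Dog"},
    {"title": "quick start", "body": "dog park"},
]
old = build_segment(docs, {"title": whitespace, "body": whitespace})
new = reindex(old, {"title": lowercase, "body": whitespace})

print(old.postings["title"]["Quick"])  # (0,)
print(new.postings["title"]["quick"])  # (0, 1)
```

In this toy model the `body` postings come out identical in both segments, yet they are still rewritten, because the new segment is built from whole documents rather than patched field by field. Trying to assign into `old.postings["title"]` raises a `TypeError`, which mirrors the "never modified in place" property described above.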