I did something dumb and deployed a service that populates a new field in our Elasticsearch index before applying the corresponding mapping to the index. As a result, about 40 documents were indexed with the default string analyzer instead of not_analyzed. I understand the reasoning for not allowing field mappings to be changed:
If a mapping already exists for a field, data from that field has probably been indexed. If you were to change the field mapping, the indexed data would be wrong and would not be properly searchable.
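For reference, this is roughly the mapping I meant to apply before deploying. Just a sketch using the Python client against a pre-5.x cluster; my_index, my_type, and my_new_field stand in for the real names:

```python
# The mapping that should have gone out first: the new field as a
# not_analyzed string (pre-5.x syntax; names are placeholders).
from elasticsearch import Elasticsearch

es = Elasticsearch()  # defaults to localhost:9200

es.indices.put_mapping(
    index="my_index",
    doc_type="my_type",
    body={
        "my_type": {
            "properties": {
                "my_new_field": {"type": "string", "index": "not_analyzed"}
            }
        }
    },
)
```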
What I'm wondering is whether I can just reindex the offending 40 documents, or even just delete the indexed data for that field, so that the corrupted data from that field no longer matters. This is a production index with almost 300k documents, so reindexing the entire thing and causing a temporary outage would be inconvenient.
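To make it concrete, this is roughly what I was picturing for handling just those 40 documents. Again only a sketch with the Python client, assuming a 2.x-era cluster and placeholder names, and it obviously only helps if the mapping could somehow be corrected first, which is the part I'm unsure about:

```python
# Find the ~40 docs that actually have the offending field and re-put their
# _source. Sketch only: assumes a 2.x-era cluster and placeholder names.
from elasticsearch import Elasticsearch

es = Elasticsearch()

offenders = es.search(
    index="my_index",
    body={"query": {"exists": {"field": "my_new_field"}}},
    size=100,  # comfortably more than the ~40 I expect
)["hits"]["hits"]

for hit in offenders:
    # Re-index each document in place; this only fixes anything if the field's
    # mapping could be corrected beforehand, which is the open question here.
    es.index(index="my_index", doc_type=hit["_type"], id=hit["_id"],
             body=hit["_source"])
```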
I never "fixed" them, they just never had the field to begin with. I would also be fine with deleting the field in the problem docs, but I'm not sure if that would solve my problem either.
Basically, what I was hoping was that if I deleted any references to that field, Elasticsearch would let me delete the mapping, since the indexed data for that field would be empty. But it sounds like that isn't possible.
I wouldn't mind if searching on that field didn't work for the 40 documents that were accidentally indexed, as long as docs going forward are indexed correctly.