I'm struggling with a mapping change on an index used to store messages dropped into a RabbitMQ dead-letter queue (DLQ).
Until now we have had a strict, restricted mapping (for the properties and properties.headers parts of the documents).
I wanted the headers part to be indexed dynamically so that our tools can filter the DLQ directly when querying (rather than fetching, unmarshaling and filtering).
But when reindexing old indices I'm hitting the rather famous "tried to parse as object, but found a concrete value" error, at a level I didn't expect.
! image|690x147
What is strange is that new documents seem to index just fine, and the dynamic fields are properly exposed (I'm not sure about the one causing the reindex failure).
Also, some documents seem to have been reindexed, but I don't know how to tell the difference.
What makes me think that dynamic indexing is working is that those dynamically built fields are available when querying:
! image|516x500
Does anyone have thoughts on this, and specifically on why reindexing would fail when indexing works?
Your issue is related to mapping conflicts. With dynamic: false, fields that are not explicitly mapped are ignored: they are not indexed or searchable, and they are not added to the index mapping (they are still kept in _source).
So when different documents carry the same unmapped field with different data types, the conflict is not an issue under dynamic: false, because the field is never indexed or added to the mapping.
When you set dynamic to true without an explicit mapping, Elasticsearch creates a mapping for each new field based on its value in the first document that contains it.
If a later document has a value for the same field that conflicts with that mapping, the document is rejected.
What happened is that in the first document indexed with this specific field, the field was an object, so Elasticsearch now only accepts it as an object, and it seems that you have documents where the field is not an object.
For example, if you have something like this on one document:
{
  "field_name": {
    "nested": "value"
  }
}
and
{
  "field_name": "value"
}
Elasticsearch will not accept both.
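To find which fields are affected before reindexing, you could scan the source documents for paths that appear both as an object and as a concrete value. The following is only a rough sketch in plain Python: the helper names (collect_shapes, find_conflicts) are made up for illustration, and it assumes you have already fetched the documents' _source as dictionaries by whatever means you prefer.

```python
from collections import defaultdict


def collect_shapes(doc, prefix="", seen=None):
    """Record, for every dotted path in doc, whether it holds an object or a concrete value."""
    if seen is None:
        seen = defaultdict(set)
    for key, value in doc.items():
        path = f"{prefix}{key}"
        # Arrays are checked element by element.
        items = value if isinstance(value, list) else [value]
        for item in items:
            if item is None:
                continue  # nulls are skipped in this sketch
            if isinstance(item, dict):
                seen[path].add("object")  # an empty {} still counts as an object
                collect_shapes(item, prefix=path + ".", seen=seen)
            else:
                seen[path].add("value")
    return seen


def find_conflicts(docs):
    """Return the sorted paths seen both as an object and as a concrete value."""
    seen = defaultdict(set)
    for doc in docs:
        collect_shapes(doc, seen=seen)
    return sorted(path for path, shapes in seen.items() if len(shapes) > 1)


docs = [
    {"field_name": {"nested": "value"}},
    {"field_name": "value"},
    {"other": 1},
]
print(find_conflicts(docs))  # ['field_name']
```

This only detects object versus concrete-value clashes, which is the case in the error above; it does not try to detect other type conflicts such as a number versus a date.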
In your case, since the field seems to be very dynamic, one solution would be to use the flattened field type, which is described in the Elasticsearch documentation.
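As a rough illustration, a mapping body along these lines would map the headers as a single flattened field. It assumes the headers live at properties.headers, as in the question, and that the conflicting header is a key underneath headers; the variable name is arbitrary. Note the repeated "properties": the outer ones are mapping keywords, the middle one is the document field that happens to be called properties.

```python
import json

# Sketch of an index mapping body; adapt the field path to the real documents.
flattened_headers_mapping = {
    "mappings": {
        "properties": {                # mapping keyword
            "properties": {            # the document field named "properties"
                "properties": {        # mapping keyword for its sub-fields
                    "headers": {"type": "flattened"}
                }
            }
        }
    }
}

# Print the JSON body that would be sent when creating the index or template.
print(json.dumps(flattened_headers_mapping, indent=2))
```

With flattened, the whole headers object is stored as one field and its leaf values are indexed as keywords, so a header that is an object in one document and a string in another no longer produces a mapping conflict.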
I searched the old index in detail and indeed found two types for the failing field:
first: {}
second: ""
I now understand why my remaining reindexing didn't work. Unfortunately, it also means that errors will arise when indexing new documents where this field is a string.
I don't have control over this field: it is a business application header, and they can put whatever they need in it (even if it doesn't seem to be used much).
I'm going to extend the template with only the fields I want to test in my tools (like x-death.*), so that I can keep strict indexing.
I also tried flattened, but it lacks typing for efficient filtering (e.g. date ranges).
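Roughly, the headers part of the template could look like the sketch below. The x-death sub-fields (count, reason, queue, exchange, time) and their types are my assumptions about how our consumer serializes the RabbitMQ header, not something checked against the real documents, and dynamic is set to false here on the assumption that the unmapped business header should be ignored rather than cause a rejection.

```python
import json

# Sketch of the mapping for the headers object only; field names are assumptions.
headers_mapping = {
    "dynamic": False,  # unmapped headers are kept in _source but not indexed
    "properties": {
        "x-death": {
            "properties": {
                "count": {"type": "long"},
                "reason": {"type": "keyword"},
                "queue": {"type": "keyword"},
                "exchange": {"type": "keyword"},
                # assumes the timestamp is serialized in a format the date type can parse
                "time": {"type": "date"},
            }
        }
    },
}

# Print the JSON fragment to paste under properties.headers in the template.
print(json.dumps(headers_mapping, indent=2))
```

Typed sub-fields like time keep date range filtering possible, which is the part flattened cannot offer.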