I have a collection of JSON files that altogether contain over 1,000 fields. After a first round of ingestion, I identified ~65 fields that would be interesting.
I am loading the JSON files into ES via curl -XPOST, if that matters.
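For what it's worth, each file goes in with something roughly like this (the index name, type name, and file path are just placeholders):

curl -XPOST 'localhost:9200/pubsjson/doc?pretty' \
  -H 'Content-Type: application/json' \
  --data-binary @publication.json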
I have tried creating the index with a mapping definition covering all 65 fields, starting like this:
PUT pubsjson/_settings
{
  "index.mapper.dynamic": false
}
Yet it still creates all the fields in the JSON file I submit.
What am I missing? Is it possible the old types are still in Elasticsearch's bowels from my first ingestion somehow? I am deleting the index each time.
I understand I could create a mapping with all 1,000+ fields and explicitly ignore the ones we don't want, but that doesn't scale well as these JSON files will evolve over time.
The index setting index.mapper.dynamic is about types, whereas you're trying to prevent new fields from being added. For that, you have to use the dynamic property on objects in the mapping:
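Something along these lines (the type and field names are only illustrative; I'm reusing your pubsjson index and an extra field called another_field):

PUT pubsjson
{
  "mappings": {
    "doc": {
      "dynamic": false,
      "properties": {
        "title": { "type": "text" }
      }
    }
  }
}

PUT pubsjson/doc/1
{
  "title": "some publication",
  "another_field": "this field is not in the mapping"
}

GET pubsjson/_mapping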
The last output shows that the mapping was not updated; the field another_field was silently ignored. You can set dynamic to strict to reject documents that contain fields that are not in the mapping. Note that you must set dynamic to false on any inner objects in your mapping as well, otherwise new fields could still appear on those inner objects.
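With strict, the same indexing request would be rejected instead of silently dropping the field, roughly:

PUT pubsjson
{
  "mappings": {
    "doc": {
      "dynamic": "strict",
      "properties": {
        "title": { "type": "text" }
      }
    }
  }
}

PUT pubsjson/doc/1
{
  "title": "some publication",
  "another_field": "now this request fails"
}

The second request should come back with a strict_dynamic_mapping_exception rather than indexing the document.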
Please note: with dynamic set to false, _source will still contain the unmapped fields; they will just be ignored for indexing.
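You can check that by fetching the document back:

GET pubsjson/doc/1

The _source in the response should still show another_field, but searching on that field won't match anything because nothing was indexed for it.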
Thank you - that was my confusion: source vs. index. So now I want to see if I can ignore data in the source. I now know what to look for in my search.