Confusion about disabling automatic type creation


(Cord Thomas) #1

I have a collection of JSON files that all together have over 1,000 fields. After a first round ingestion, I identified ~65 fields that would be interesting.

I am loading the JSON files into ES with curl -XPOST, if that matters.

I have tried to create the index with a mapping definition that starts like this, with all 65 fields:

PUT /pubsjson/_mapping/pub
{
"_all" : { "enabled" : false },
"dynamic": false,
"properties" : {
"navTitle" : { "type": "text" },
"par.product.long_abstract" : { "type": "text" },
...

When I try this, dynamic fields are still created.

I tried "dynamic": "strict" and others.

Following https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-mapping.html#_disabling_automatic_type_creation I tried recreating my index and applied this:

PUT pubsjson/_settings
{
"index.mapper.dynamic":false
}

Yet it still creates all the fields in the JSON file I submit.

What am I missing? Is it possible the old types are still in Elasticsearch's bowels from my first ingestion somehow? I am deleting the index each time.

I understand I could create a mapping with all 1,000+ fields and explicitly ignore those we don't want, but that doesn't scale well as these JSON files will evolve over time.


(Jason Tedor) #2

The index setting index.mapper.dynamic is about types, whereas you're trying to prevent new fields from being added. For that, you have to use the dynamic property for objects in the mapping:

$ curl -XPUT localhost:9200/i?pretty=true -d '
> {
>   "mappings": {
>     "my_type": {
>       "dynamic": false,
>       "properties": {
>         "my_field": {
>           "type": "long"
>         }
>       }
>     }
>   }
> }'
{
  "acknowledged" : true,
  "shards_acknowledged" : true
}

$ curl -XPOST localhost:9200/i/my_type/1?pretty=true -d '
> {
>   "my_field": 123,
>   "another_field": 456
> }'
{
  "_index" : "i",
  "_type" : "my_type",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "created" : true
}
$ curl -XGET localhost:9200/i/my_type/_mapping?pretty=true
{
  "i" : {
    "mappings" : {
      "my_type" : {
        "dynamic" : "false",
        "properties" : {
          "my_field" : {
            "type" : "long"
          }
        }
      }
    }
  }
}

The last output shows that the mapping was not updated: the field another_field was silently ignored. You can set dynamic to strict instead to reject documents that contain fields that are not in the mapping. Note that you must set dynamic to false for any inner objects in your mapping, otherwise new fields could appear on the inner objects.
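If your mapping has many inner objects, setting dynamic on each one by hand gets tedious. Here is a minimal sketch of one way to do it before sending the mapping. The helper name set_dynamic and the field names my_object and inner_field are made up for illustration; this is plain dictionary manipulation, not an Elasticsearch API. Depending on your version, inner objects may already inherit the setting from their parent, so treat the explicit setting as a conservative choice and check the docs for your release.

```python
import copy
import json


def set_dynamic(mapping, value=False):
    """Return a copy of `mapping` with "dynamic" set on it and on every
    inner object (any field that has its own "properties")."""
    result = copy.deepcopy(mapping)
    result["dynamic"] = value
    for name, field in result.get("properties", {}).items():
        if "properties" in field:
            result["properties"][name] = set_dynamic(field, value)
    return result


# Hypothetical type mapping with one inner object.
type_mapping = {
    "properties": {
        "my_field": {"type": "long"},
        "my_object": {
            "properties": {
                "inner_field": {"type": "text"},
            }
        },
    }
}

locked = set_dynamic(type_mapping, False)

# Body that could be passed to the index-creation request shown above.
print(json.dumps({"mappings": {"my_type": locked}}, indent=2))
```

Passing "strict" instead of False would mark every level as strict in the same way.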

Please note: _source will still contain the unmapped fields; they are simply not indexed.
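If you also want the unmapped fields kept out of _source, one option is to drop them on the client before indexing. The sketch below assumes you have the "properties" part of your mapping available as a dictionary; prune_to_mapping is a hypothetical helper, not something Elasticsearch provides. Elasticsearch also documents includes/excludes options on the _source mapping field, which may be worth checking for your version.

```python
def prune_to_mapping(doc, properties):
    """Keep only the fields of `doc` that appear in `properties`,
    descending into inner objects that declare their own "properties"."""
    pruned = {}
    for name, field in properties.items():
        if name not in doc:
            continue
        value = doc[name]
        if "properties" in field and isinstance(value, dict):
            pruned[name] = prune_to_mapping(value, field["properties"])
        else:
            pruned[name] = value
    return pruned


# Same field names as in the example above.
properties = {"my_field": {"type": "long"}}
doc = {"my_field": 123, "another_field": 456}

print(prune_to_mapping(doc, properties))  # {'my_field': 123}
```

The pruned dictionary is what you would serialize and POST, so another_field never reaches the index or _source.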


(Cord Thomas) #3

Thank you - that was my confusion - source vs index. So now I want to see if I can ignore data in the source. I now know what to look for in my search.

