Mappings validation

I have a question about the data types in Elasticsearch index mappings.

I create an index:

curl -X PUT "localhost:9200/test1?pretty" -H 'Content-Type: application/json' -d'{}'

{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "test1"
}

And then add a strict mapping, with one field called "population" of type "long":

curl -X PUT "localhost:9200/test1/_mapping/_doc?pretty" -H 'Content-Type: application/json' -d'
{
  "properties": {
    "population": {
      "type": "long"
    }
  },
  "dynamic":"strict"
}'

{
  "acknowledged" : true
}

The mappings are as intended:

curl "localhost:9200/test1/_mapping"

{"test1":{"mappings":{"_doc":{"dynamic":"strict","properties":{"population":{"type":"long"}}}}}}

I ingest a document with "population" set to 12 and it takes it fine:

curl -XPOST "localhost:9200/test1/_doc" -H 'Content-Type: application/json' -d '{"population":12}'

{"_index":"test1","_type":"_doc","_id":"BkaHym4BEgq13vgdHAx3","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}

The document comes back fine in search:

curl "localhost:9200/test1/_search"

{"took":133,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"test1","_type":"_doc","_id":"BkaHym4BEgq13vgdHAx3","_score":1.0,"_source":{"population":12}}]}}

But surprisingly, when I try to index a document with an invalid value for a long field, say 12.4, it does not reject the document:

curl -XPOST "localhost:9200/test1/_doc" -H 'Content-Type: application/json' -d '{"population":12.4}'

{"_index":"test1","_type":"_doc","_id":"B0aIym4BEgq13vgdDQyb","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}

And the bad document shows up in search too:

curl "localhost:9200/test1/_search"

{"took":7,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":2,"max_score":1.0,"hits":[{"_index":"test1","_type":"_doc","_id":"BkaHym4BEgq13vgdHAx3","_score":1.0,"_source":{"population":12}},{"_index":"test1","_type":"_doc","_id":"B0aIym4BEgq13vgdDQyb","_score":1.0,"_source":{"population":12.4}}]}}

I haven't set ignore_malformed to true (https://www.elastic.co/guide/en/elasticsearch/reference/6.8/ignore-malformed.html).

It does, however, throw the expected mapper_parsing_exception if I send in, say, a string:

curl -XPOST "localhost:9200/test1/_doc" -H 'Content-Type: application/json' -d '{"population":"large"}'

{"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"failed to parse field [population] of type [long]"}],"type":"mapper_parsing_exception","reason":"failed to parse field [population] of type [long]","caused_by":{"type":"illegal_argument_exception","reason":"For input string: \"large\""}},"status":400}

Is there a way to force the ignore_malformed behavior for all data types?

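Take a look at the coerce mapping option. Coercion is enabled by default, so a floating-point value sent to an integer field such as long is truncated for indexing instead of being rejected, while _source keeps the value exactly as you sent it. That is why 12.4 is accepted and still shows up as 12.4 in search results.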
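For illustration, here is a minimal sketch of disabling coercion on a single field at index creation time. The index name test2 is a hypothetical example, and the typed mapping ("_doc") assumes the 6.x request format used elsewhere in this thread:

```shell
# Create a new index (test2 is a hypothetical name) whose "population"
# field rejects values that would need coercion, such as 12.4 or "12".
# The "_doc" level matches the 6.x typed-mapping format used in this thread.
curl -X PUT "localhost:9200/test2?pretty" -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "_doc": {
      "dynamic": "strict",
      "properties": {
        "population": {
          "type": "long",
          "coerce": false
        }
      }
    }
  }
}'
```

With a mapping like this, indexing {"population":12.4} should fail with a mapper_parsing_exception, while whole numbers such as 12 are still accepted.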

Thank you. That worked.

Is there a way to set the default value of index.mapping.coerce to false for all indices? Setting it on a single index works, after closing the index first:

$ curl -X POST "localhost:9200/test1/_close"
{"acknowledged":true}

$ curl -X PUT "localhost:9200/test1/_settings" -H 'Content-Type: application/json' -d '{"index.mapping.coerce": false}'
{"acknowledged":true}

$ curl -X POST "localhost:9200/test1/_open"
{"acknowledged":true,"shards_acknowledged":true}

$ curl -XPOST "localhost:9200/test1/_doc" -H 'Content-Type: application/json' -d '{"population":12.4}'
{"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"failed to parse field [population] of type [long]"}],"type":"mapper_parsing_exception","reason":"failed to parse field [population] of type [long]","caused_by":{"type":"illegal_argument_exception","reason":"12.4 cannot be converted to Long without data loss"}},"status":400}

$ curl -XPOST "localhost:9200/test1/_doc" -H 'Content-Type: application/json' -d '{"population":13}'
{"_index":"test1","_type":"_doc","_id":"CkZBy24BEgq13vgdzAy9","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":2}%                                 !?


There is no direct option, but there is a small workaround.

You could use an index template and disable coercion at the index level in that template, so that every new index matching the template's pattern inherits the setting. See https://www.elastic.co/guide/en/elasticsearch/reference/7.5/coerce.html#coerce-setting
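A rough sketch of such a template, using the _template API. The template name no_coerce and the pattern test* are assumptions chosen for illustration; a pattern of * would match every index, including internal ones, so a narrower pattern is probably safer:

```shell
# Register an index template (the name "no_coerce" is hypothetical) that sets
# index.mapping.coerce to false for every new index whose name matches test*.
# Templates are applied only when an index is created, so existing indices
# such as test1 keep their current settings.
curl -X PUT "localhost:9200/_template/no_coerce?pretty" -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["test*"],
  "settings": {
    "index.mapping.coerce": false
  }
}'
```

Any index created afterwards with a matching name should then reject 12.4 for a long field without needing the close, update settings, and open steps shown above.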