Ignore_malformed fires mapper_parsing_exception for keyword type on 7.10.1 (it worked on 7.1.0)

This may be a breaking change in v7.4+ that apparently was not documented. Any thoughts on how to solve this issue when upgrading, for a log analysis use case?

The docs about the ignore_malformed mapping parameter have some differences when comparing versions 7.3 and 7.4. From 7.4 on it says:

The ignore_malformed setting is currently supported by the following mapping types: Numeric, Date, Date nanoseconds, Geo-point, Geo-shape, IP

– which excludes other types such as keyword.
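To make that list easier to check against a mapping, here is a minimal Python sketch. The concrete type names are my own expansion of the doc's categories ("Numeric" into long, integer, and so on), and `supports_ignore_malformed` is a hypothetical helper, not anything Elasticsearch ships.

```python
# Mapping types that the 7.4+ docs list as supporting ignore_malformed.
# Assumption: "Numeric" is expanded here into the individual numeric type names.
IGNORE_MALFORMED_TYPES = {
    "long", "integer", "short", "byte",
    "double", "float", "half_float", "scaled_float",
    "date", "date_nanos", "geo_point", "geo_shape", "ip",
}


def supports_ignore_malformed(field_type: str) -> bool:
    """Return True if the quoted doc list includes this mapping type."""
    return field_type in IGNORE_MALFORMED_TYPES


print(supports_ignore_malformed("keyword"))  # False
print(supports_ignore_malformed("date"))     # True
```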

Indexing the same document works on 7.1.0 but fails on 7.10.1 (and also on 7.8.1; both clusters were upgraded from 7.1.0), even though the mappings are nearly the same. The mappings.dynamic_templates config is:

{
  "template_stdField" : {
    "path_match" : "*",
    "mapping" : {
      "ignore_malformed" : true,
      "type" : "keyword"
    }
  }
}

Running

curl -X PUT "localhost:9200/my-index/notes/2" -H 'Content-Type: application/json' -d'
{
  "text": "hello world",
  "mykeyword": "hello world again"
}
'

on v7.1.0 returns:

{
  "_index": "my-index",
  "_type": "notes",
  "_id": "2",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 6,
  "_primary_term": 6
}

however, on v7.10.1, the same command returns:

{
  "error": {
    "root_cause": [{
      "type": "mapper_parsing_exception",
      "reason": "unknown parameter [ignore_malformed] on mapper [mykeyword] of type [keyword]"
    }],
    "type": "mapper_parsing_exception",
    "reason": "unknown parameter [ignore_malformed] on mapper [mykeyword] of type [keyword]"
  },
  "status": 400
}
Error trace in 7.10.1

Here is the output when adding ?error_trace to the previous command:

{
  "error": {
    "root_cause": [{
      "type": "mapper_parsing_exception",
      "reason": "unknown parameter [ignore_malformed] on mapper [mykeyword] of type [keyword]",
      "stack_trace": "MapperParsingException[unknown parameter [ignore_malformed] on mapper [mykeyword] of type [keyword]]\n\tat org.elasticsearch.index.mapper.ParametrizedFieldMapper$Builder.parse(ParametrizedFieldMapper.java:614)\n\tat org.elasticsearch.index.mapper.ParametrizedFieldMapper$TypeParser.parse(ParametrizedFieldMapper.java:662)\n\tat org.elasticsearch.index.mapper.ParametrizedFieldMapper$TypeParser.parse(ParametrizedFieldMapper.java:647)\n\tat org.elasticsearch.index.mapper.RootObjectMapper.findTemplateBuilder(RootObjectMapper.java:278)\n\tat org.elasticsearch.index.mapper.RootObjectMapper.findTemplateBuilder(RootObjectMapper.java:251)\n\tat org.elasticsearch.index.mapper.DocumentParser.createBuilderFromDynamicValue(DocumentParser.java:704)\n\tat org.elasticsearch.index.mapper.DocumentParser.parseDynamicValue(DocumentParser.java:765)\n\tat org.elasticsearch.index.mapper.DocumentParser.parseValue(DocumentParser.java:620)\n\tat org.elasticsearch.index.mapper.DocumentParser.innerParseObject(DocumentParser.java:424)\n\tat org.elasticsearch.index.mapper.DocumentParser.parseObjectOrNested(DocumentParser.java:395)\n\tat org.elasticsearch.index.mapper.DocumentParser.internalParseDocument(DocumentParser.java:112)\n\tat org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:71)\n\tat org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:227)\n\tat org.elasticsearch.index.shard.IndexShard.prepareIndex(IndexShard.java:803)\n\tat org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:780)\n\tat org.elasticsearch.index.shard.IndexShard.applyIndexOperationOnPrimary(IndexShard.java:752)\n\tat org.elasticsearch.action.bulk.TransportShardBulkAction.executeBulkItemRequest(TransportShardBulkAction.java:285)\n\tat org.elasticsearch.action.bulk.TransportShardBulkAction$2.doRun(TransportShardBulkAction.java:175)\n\tat org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)\n\tat 
org.elasticsearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:220)\n\tat org.elasticsearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:126)\n\tat org.elasticsearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:85)\n\tat org.elasticsearch.action.support.replication.TransportWriteAction$1.doRun(TransportWriteAction.java:179)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:737)\n\tat org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)\n\tat java.base/java.lang.Thread.run(Thread.java:832)\n"
    }],
    "type": "mapper_parsing_exception",
    "reason": "unknown parameter [ignore_malformed] on mapper [mykeyword] of type [keyword]",
    "stack_trace": "MapperParsingException[unknown parameter [ignore_malformed] on mapper [mykeyword] of type [keyword]]\n\tat org.elasticsearch.index.mapper.ParametrizedFieldMapper$Builder.parse(ParametrizedFieldMapper.java:614)\n\tat org.elasticsearch.index.mapper.ParametrizedFieldMapper$TypeParser.parse(ParametrizedFieldMapper.java:662)\n\tat org.elasticsearch.index.mapper.ParametrizedFieldMapper$TypeParser.parse(ParametrizedFieldMapper.java:647)\n\tat org.elasticsearch.index.mapper.RootObjectMapper.findTemplateBuilder(RootObjectMapper.java:278)\n\tat org.elasticsearch.index.mapper.RootObjectMapper.findTemplateBuilder(RootObjectMapper.java:251)\n\tat org.elasticsearch.index.mapper.DocumentParser.createBuilderFromDynamicValue(DocumentParser.java:704)\n\tat org.elasticsearch.index.mapper.DocumentParser.parseDynamicValue(DocumentParser.java:765)\n\tat org.elasticsearch.index.mapper.DocumentParser.parseValue(DocumentParser.java:620)\n\tat org.elasticsearch.index.mapper.DocumentParser.innerParseObject(DocumentParser.java:424)\n\tat org.elasticsearch.index.mapper.DocumentParser.parseObjectOrNested(DocumentParser.java:395)\n\tat org.elasticsearch.index.mapper.DocumentParser.internalParseDocument(DocumentParser.java:112)\n\tat org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:71)\n\tat org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:227)\n\tat org.elasticsearch.index.shard.IndexShard.prepareIndex(IndexShard.java:803)\n\tat org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:780)\n\tat org.elasticsearch.index.shard.IndexShard.applyIndexOperationOnPrimary(IndexShard.java:752)\n\tat org.elasticsearch.action.bulk.TransportShardBulkAction.executeBulkItemRequest(TransportShardBulkAction.java:285)\n\tat org.elasticsearch.action.bulk.TransportShardBulkAction$2.doRun(TransportShardBulkAction.java:175)\n\tat org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)\n\tat 
org.elasticsearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:220)\n\tat org.elasticsearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:126)\n\tat org.elasticsearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:85)\n\tat org.elasticsearch.action.support.replication.TransportWriteAction$1.doRun(TransportWriteAction.java:179)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:737)\n\tat org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)\n\tat java.base/java.lang.Thread.run(Thread.java:832)\n"
  },
  "status": 400
}

Index mapping & settings details
Running

curl "localhost:9200/my-index?pretty"

on v7.1.0 returns (simplified):

{
  "my-index" : {
    "mappings" : {
      "dynamic_templates" : [
        {
          "template_text" : {
            "path_match" : "*text",
            "mapping" : {
              "fielddata" : true,
              "type" : "text"
            }
          }
        },
        /* ... */
        {
          "template_object" : {
            "path_match" : "*",
            "match_mapping_type" : "object",
            "mapping" : {
              "doc_values" : true,
              "type" : "object"
            }
          }
        },
        {
          "template_stdField" : {
            "path_match" : "*",
            "mapping" : {
              "ignore_malformed" : true,
              "type" : "keyword"
            }
          }
        }
      ],
      "properties" : {
        "app" : {
          "type" : "keyword"
        },
        /* ... */
        "text" : {
          "type" : "text",
          "fielddata" : true
        }

      }
    },
    "settings" : {
      "index" : {
        "mapping" : {
          /* ... */
          "ignore_malformed" : "true"
        },
        /* ... */
        "version" : {
          "created" : "7010099"
        }
      }
    }
  }
}

A similar result is seen on v7.10.1, except for the version field:

"version" : {
  "created" : "7010099",
  "upgraded" : "7100199"
}
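For anyone reading those ids: a small Python sketch of how they appear to map to release numbers, assuming the id packs major, minor, and revision as two-digit groups followed by a two-digit build suffix (99 in both values above). `decode_version_id` is a hypothetical helper name.

```python
def decode_version_id(version_id: str) -> str:
    """Decode an index version id such as "7010099" into "major.minor.revision".

    Assumption: id = major * 1_000_000 + minor * 10_000 + revision * 100 + build.
    """
    n = int(version_id)
    major = n // 1_000_000
    minor = (n // 10_000) % 100
    revision = (n // 100) % 100
    return f"{major}.{minor}.{revision}"


print(decode_version_id("7010099"))  # 7.1.0
print(decode_version_id("7100199"))  # 7.10.1
```

So "created" points at 7.1.0 and "upgraded" at 7.10.1, which matches the upgrade path described here.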

Maybe this is related to "7.10 no longer accepts boost mapping on some properties" (#64982) and "Unused boost parameter should not throw mapping exception" (#64999).

Using the Put mapping API to update the invalid template should fix this issue. Note that we need to repeat all the previous dynamic templates in the proper order so they don't get removed.

curl -X PUT "localhost:9200/my-index/_mapping?pretty" -H 'Content-Type: application/json' -d'
{
    "dynamic_templates" : [
      /*  ... repeat previous templates ... */
      {
        "template_stdField" : {
          "path_match" : "*",
          "mapping" : {
            "type" : "keyword"
          }
        }
      }
    ]

}
'
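Since all templates have to be resent in order, it may be easier to generate the request body from the existing mapping than to edit it by hand. A rough Python sketch, assuming the templates were already fetched as a list of dicts; `strip_ignore_malformed` is a hypothetical helper, and the two sample templates are copied from the mapping shown earlier.

```python
import copy
import json


def strip_ignore_malformed(dynamic_templates, unsupported_types=("keyword",)):
    """Return a copy of the templates, in the same order, with ignore_malformed
    removed from every mapping whose type is in unsupported_types."""
    cleaned = copy.deepcopy(dynamic_templates)  # leave the input untouched
    for template in cleaned:
        for body in template.values():
            mapping = body.get("mapping", {})
            if mapping.get("type") in unsupported_types:
                mapping.pop("ignore_malformed", None)
    return cleaned


# Sample input: two of the templates from the index mapping shown earlier.
templates = [
    {"template_text": {"path_match": "*text",
                       "mapping": {"fielddata": True, "type": "text"}}},
    {"template_stdField": {"path_match": "*",
                           "mapping": {"ignore_malformed": True, "type": "keyword"}}},
]

# Body for PUT /my-index/_mapping; the order of the templates is preserved.
request_body = {"dynamic_templates": strip_ignore_malformed(templates)}
print(json.dumps(request_body, indent=2))
```

The printed JSON can then be sent with the curl command above. Only keyword is treated as unsupported by default; pass a different `unsupported_types` if other templates are affected.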

IMHO, in the upgrade scenario, instead of throwing an exception and rejecting the document, Elasticsearch should log a warning and continue the indexing operation.

The steps above are incomplete, which makes them hard to follow, but this sounds like a bug (at least it's a bug not to document this change).

Could you write out the full instructions that someone would need to run on an empty cluster to reproduce this? If so, please report it on GitHub.

Thanks for your comment David! I apologize for perhaps omitting essential details. On the reply:

I tried to put the minimum subset of my mappings that I believe are needed to reproduce the error. Which option are you looking for?

  1. the complete set of my mappings,
  2. some other index info that I didn't post (pls let me know which one is missing)
  3. a series of REST calls that
    (a) create an empty index,
    (b) set the dynamic_templates mapping with one ignore_malformed keyword match,
    (c) try to index a document to get the error

I tried following your instructions as best I could and I didn't see what you are describing. Here's a transcript of the exact API calls, and responses, run on a brand-new empty 7.1.0 cluster:

GET /?filter_path=version.number

# 200 OK
# {
#   "version": {
#     "number": "7.1.0"
#   }
# }

PUT /my_index
{
  "mappings": {
    "dynamic_templates": [
      {
        "template_stdField": {
          "mapping": {
            "ignore_malformed": true,
            "type": "keyword"
          },
          "path_match": "*"
        }
      }
    ]
  }
}

# 200 OK
# {
#   "shards_acknowledged": true,
#   "acknowledged": true,
#   "index": "my_index"
# }

GET /my_index

# 200 OK
# {
#   "my_index": {
#     "mappings": {
#       "dynamic_templates": [
#         {
#           "template_stdField": {
#             "mapping": {
#               "ignore_malformed": true,
#               "type": "keyword"
#             },
#             "path_match": "*"
#           }
#         }
#       ]
#     },
#     "settings": {
#       "index": {
#         "provided_name": "my_index",
#         "number_of_shards": "1",
#         "uuid": "ymMrojPITo-O6b_Huji1yg",
#         "number_of_replicas": "1",
#         "version": {
#           "created": "7010099"
#         },
#         "creation_date": "1608590601562"
#       }
#     },
#     "aliases": {}
#   }
# }

PUT /my_index/_doc/1
{
  "text": "hello world",
  "mykeyword": "hello world again"
}

# 201 Created
# {
#   "_type": "_doc",
#   "_primary_term": 1,
#   "_id": "1",
#   "_shards": {
#     "successful": 1,
#     "total": 2,
#     "failed": 0
#   },
#   "_index": "my_index",
#   "result": "created",
#   "_version": 1,
#   "_seq_no": 0
# }

Then I upgraded it to 7.10.1, here's a transcript of the API calls and responses after the upgrade:

GET /?filter_path=version.number

# 200 OK
# {
#   "version": {
#     "number": "7.10.1"
#   }
# }

GET /my_index

# 200 OK
# {
#   "my_index": {
#     "mappings": {
#       "dynamic_templates": [
#         {
#           "template_stdField": {
#             "mapping": {
#               "ignore_malformed": true,
#               "type": "keyword"
#             },
#             "path_match": "*"
#           }
#         }
#       ],
#       "properties": {
#         "text": {
#           "type": "keyword"
#         },
#         "mykeyword": {
#           "type": "keyword"
#         }
#       }
#     },
#     "settings": {
#       "index": {
#         "provided_name": "my_index",
#         "number_of_shards": "1",
#         "uuid": "ymMrojPITo-O6b_Huji1yg",
#         "number_of_replicas": "1",
#         "version": {
#           "created": "7010099",
#           "upgraded": "7100199"
#         },
#         "creation_date": "1608590601562"
#       }
#     },
#     "aliases": {}
#   }
# }

PUT /my_index/_doc/2
{
  "text": "hello world",
  "mykeyword": "hello world again"
}

# 201 Created
# {
#   "_type": "_doc",
#   "_primary_term": 2,
#   "_id": "2",
#   "_shards": {
#     "successful": 1,
#     "total": 2,
#     "failed": 0
#   },
#   "_index": "my_index",
#   "result": "created",
#   "_version": 1,
#   "_seq_no": 1
# }

I can't tell what I'm doing differently from you, but I don't get the mapper_parsing_exception that you report.

Oh, I see. I believe since you indexed document 1 with the mykeyword property in v7.1.0, the record was properly added AND the new property mykeyword was added to the index mapping. This then allowed you to use it later on the upgraded system.

Now I believe that if you index a new document on v7.10.1 with a different dynamically created property, e.g. mykeyword2, you should get the mapper_parsing_exception. E.g.:

//Elasticsearch v7.10.1
PUT /my_index/_doc/3
{
  "text": "hello world",
  "mykeyword2": "hello world again"
}

I suppose this issue may not necessarily be unique to upgrade scenarios.
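To illustrate the reasoning, here is a simplified Python model of which fields would still need a dynamic mapping. It only looks at top-level fields and is not how Elasticsearch implements this; `unmapped_fields` is a hypothetical helper, and the properties are the ones from the post-upgrade GET /my_index output above.

```python
def unmapped_fields(properties, document):
    """Top-level document fields with no entry in the index's mapping properties.

    Assumption: only these fields go through the dynamic templates, so only
    they can hit the invalid ignore_malformed keyword template.
    """
    return sorted(name for name in document if name not in properties)


# Properties reported by GET /my_index after the upgrade.
properties = {"text": {"type": "keyword"}, "mykeyword": {"type": "keyword"}}

doc2 = {"text": "hello world", "mykeyword": "hello world again"}
doc3 = {"text": "hello world", "mykeyword2": "hello world again"}

print(unmapped_fields(properties, doc2))  # []
print(unmapped_fields(properties, doc3))  # ['mykeyword2']
```

Document 2 has nothing left to map dynamically, so the template is never parsed; document 3 introduces mykeyword2, which is where I expect the exception.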

Aha, OK, I see it now. Yes, I'd call that a bug; please report it on GitHub!

Done: https://github.com/elastic/elasticsearch/issues/66765
