Dear Community!
I’m having a strange problem while trying to implement ignore_malformed to avoid documents being dropped when a field contains invalid data.
From my Logstash logs, I often see messages like this:
[...] :response=>{"create"=>{"status"=>400, "error"=>{"type"=>"document_parsing_exception", "reason"=>"[1:1531] failed to parse field [source.ip] of type [ip] in document with id 'cCxu2psBc1CNJ_tCN3uu'. Preview of field's value: '1964'", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"'1964' is not an IP string literal."}}}}}
In this case, the log contains an invalid value for a field that Elasticsearch tries to parse as type ip. Here the value is 1964, which is obviously not an IP address, so Elasticsearch rejects the document with the error "'1964' is not an IP string literal."
I then found the ignore_malformed parameter, which should allow me to ignore such invalid values. This is explained here:
https://www.elastic.co/observability-labs/blog/antidote-index-mapping-exceptions-ignore-malformed
So I modified my dynamic templates to reflect this change:
"dynamic_templates": [
  {
    "ip_addr": {
      "path_match": "*.ip",
      "mapping": {
        "ignore_malformed": true,
        "type": "ip"
      }
    }
  },
  {
    "ports": {
      "path_match": "*.port",
      "mapping": {
        "ignore_malformed": true,
        "type": "float"
      }
    }
  },
  {
    "bytes": {
      "path_match": "*.bytes",
      "mapping": {
        "ignore_malformed": true,
        "type": "float"
      }
    }
  },
  [...]
]
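As a rough illustration of which template is expected to pick up a given field path, here is a minimal Python sketch. It is a simplification, not Elasticsearch's actual implementation: it only looks at path_match, it uses fnmatchcase as an approximation of the "*" wildcard, and it assumes the field is created dynamically rather than declared explicitly in the index template. The function name first_matching_template is hypothetical.

```python
from fnmatch import fnmatchcase

# Template names and patterns copied from the dynamic_templates above.
TEMPLATES = [
    ("ip_addr", "*.ip"),
    ("ports", "*.port"),
    ("bytes", "*.bytes"),
]

def first_matching_template(field_path, templates=TEMPLATES):
    """Return the name of the first template whose path_match fits, else None."""
    for name, pattern in templates:
        # fnmatchcase only approximates the simple "*" wildcard of path_match.
        if fnmatchcase(field_path, pattern):
            return name
    return None

for path in ("host.ip", "source.port", "ip"):
    print(path, "->", first_matching_template(path))
```

With these patterns, host.ip resolves to ip_addr and source.port to ports, while a top-level field named just ip matches nothing because the pattern requires a dot before "ip".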
When I look at the index, I can see that the parameter is present in the dynamic templates section of the index.
However, if I inspect a field, for example host.ip, the field type is correctly set to ip, as expected from the *.ip dynamic template, but the parameter "ignore_malformed": true is not present.
Since the field type is ip, I know that the dynamic template is being applied; otherwise, the field would have been mapped as text. But why does it seem that "ignore_malformed": true is not being taken into account?
Here is an example from one of my indices for the field host:
"host": {
  "properties": {
    "domain": {
      "type": "keyword",
      "ignore_above": 1024
    },
    "hostname": {
      "type": "keyword",
      "ignore_above": 1024
    },
    "ip": {
      "type": "ip" <====== Type has been set to ip, but "ignore_malformed": true is missing
    },
Ideally, it should look like this:
"ip": {
  "type": "ip",
  "ignore_malformed": true
},
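To check this across a whole mapping rather than one field at a time, a small helper can walk the "properties" tree and list every ip field that lacks "ignore_malformed": true. The following is a minimal sketch, assuming the mapping has already been loaded into a Python dict. The function name is hypothetical, and the source.ip entry in the sample is made up purely to show a field that passes the check.

```python
def ip_fields_missing_ignore_malformed(properties, prefix=""):
    """Return dotted paths of ip fields that do not set ignore_malformed to true."""
    missing = []
    for name, definition in properties.items():
        path = f"{prefix}{name}"
        if definition.get("type") == "ip" and definition.get("ignore_malformed") is not True:
            missing.append(path)
        # Object fields nest their children under "properties".
        children = definition.get("properties")
        if children:
            missing.extend(ip_fields_missing_ignore_malformed(children, prefix=f"{path}."))
    return missing

# Sample based on the host mapping above; source.ip is a hypothetical addition.
sample_properties = {
    "host": {
        "properties": {
            "domain": {"type": "keyword", "ignore_above": 1024},
            "hostname": {"type": "keyword", "ignore_above": 1024},
            "ip": {"type": "ip"},
        }
    },
    "source": {
        "properties": {
            "ip": {"type": "ip", "ignore_malformed": True},
        }
    },
}

print(ip_fields_missing_ignore_malformed(sample_properties))
```

For the sample above, only host.ip is reported, which matches what I see in the index.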
I’m a bit confused about why this is not working as expected. Did I miss something, or is there a limitation or special behavior with ignore_malformed and dynamic templates?
I will try to upload the output of my index template from the following command:
GET _index_template/filebeat-8.18*
This should show the full definition of my index template.
Thank you all and Best Regards,
Yanick