Elasticsearch ingest pipeline - on failure field removal problem

Hello.
I am trying to check whether the host.ip field contains a valid IP address and, if it does not, remove the field.
I also want to stick to the ECS naming convention.

My ingest pipeline:

PUT _ingest/pipeline/ingestion_pipeline_geo
{
  "description": "geoenrichment",
  "processors": [
    {
      "dot_expander": {
        "field": "host.ip",
        "ignore_failure": true
      }
    },
    {
      "grok": {
        "field": "host.ip",
        "patterns": [
          "^%{IP:host.ip}$"
        ],
        "on_failure": [
          {
            "remove": {
              "field": "host.ip",
              "ignore_failure": true,
              "ignore_missing": true
            }
          }
        ]
      }
    }
  ]
}

My data:

PUT my_index/_doc/1?pipeline=ingestion_pipeline_geo
{
  "host.ip": "1.2.3.433",
  "host.name": "h0012",
  "host.id": "prod server"
}

My result:

GET my_index/_doc/1
{
  "_index" : "my_index",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 13,
  "_seq_no" : 12,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "host" : { },
    "host.name" : "h0012",
    "host.id" : "prod server"
  }
}

When the IP is valid, everything works fine:

PUT my_index/_doc/2?pipeline=ingestion_pipeline_geo
{
  "host.ip": "1.2.3.42",
  "host.name": "h0012",
  "host.id": "prod server"
}

GET my_index/_doc/2
{
  "_index" : "my_index",
  "_type" : "_doc",
  "_id" : "2",
  "_version" : 1,
  "_seq_no" : 13,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "host" : {
      "ip" : "1.2.3.42"
    },
    "host.name" : "h0012",
    "host.id" : "prod server"
  }
}

Why does removing host.ip leave an empty "host": {} object behind instead of removing the field cleanly?

Is there any workaround other than removing "host" in the on_failure section (which is not really accurate, since it removes the whole host object rather than just host.ip)? For example:

PUT _ingest/pipeline/ingestion_pipeline_geo
{
  "description": "geoenrichment",
  "processors": [
    {
      "dot_expander": {
        "field": "host.ip",
        "ignore_failure": true
      }
    },
    {
      "grok": {
        "field": "host.ip",
        "patterns": [
          "^%{IP:host.ip}$"
        ],
        "on_failure": [
          {
            "remove": {
              "field": "host",
              "ignore_failure": true,
              "ignore_missing": true
            }
          }
        ]
      }
    }
  ]
}

PUT my_index/_doc/2?pipeline=ingestion_pipeline_geo
{
  "host.ip": "1.2.3.42s",
  "host.name": "h0012",
  "host.id": "prod server"
}

GET my_index/_doc/2
{
  "_index" : "my_index",
  "_type" : "_doc",
  "_id" : "2",
  "_version" : 2,
  "_seq_no" : 14,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "host.name" : "h0012",
    "host.id" : "prod server"
  }
}
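
A narrower variant of the workaround above might be to remove host.ip first and then remove host only if it has been left empty. This is an untested sketch: it assumes the processor-level "if" condition (a Painless expression) is honored for processors inside on_failure on 7.2.0, and the condition itself is my own guess at how to detect the empty object.

PUT _ingest/pipeline/ingestion_pipeline_geo
{
  "description": "geoenrichment",
  "processors": [
    {
      "dot_expander": {
        "field": "host.ip",
        "ignore_failure": true
      }
    },
    {
      "grok": {
        "field": "host.ip",
        "patterns": [
          "^%{IP:host.ip}$"
        ],
        "on_failure": [
          {
            "remove": {
              "field": "host.ip",
              "ignore_failure": true,
              "ignore_missing": true
            }
          },
          {
            "remove": {
              "if": "ctx.host != null && ctx.host.isEmpty()",
              "field": "host",
              "ignore_failure": true
            }
          }
        ]
      }
    }
  ]
}

If this behaves as intended, a host object that still holds other nested fields would be left alone, and only an empty one would be dropped.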

This behavior was observed on Elasticsearch 7.2.0.
Thanks for any hints.
