Ingest pipeline errors and "on_failure"

Hello,

I am storing log files that contain an IP field, which can be either an IP address or "-".
I want to store the value in a field of type "ip" and assign null when the log line contains "-".

My mapping:
"ip": { "type": "ip" }

Using a grok expression, I store the value of the IP field in an ip_tmp field:
%{NOTSPACE:ip_tmp}
and then I rename it so that it gets parsed as an IP:

{ "rename": { "field": "ip_tmp", "target_field": "ip", "on_failure" : [ { "set": { "field": "ip", "value": "null" } } ] } },

But I still get an exception for the log line 2017-01-29T00:00:06 189 200 - GET / -:

2017/02/16 10:59:06.722097 client.go:432: WARN Can not index event (status=400) : {"type":"mapper_parsing_exception","reason":"failed to parse [ip]","caused_by":{"type":"illegal_argument_exception","reason":"'-' is not an IP string literal."}}

My understanding was that if rename fails to parse the value as an IP, then on_failure should fire and assign null.

Am I wrong?

Thanks.

Hi @John16,

the problem is that on_failure is only used if an error occurs inside the ingest processor itself. The rename processor succeeds here, because at that point "-" is just a string; the mapping error only happens at index time, after the document has (successfully) passed through the ingest pipeline.
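
You can see this with the Simulate API (a minimal reproduction, using only your rename processor and a document that mimics the "-" case): the rename step completes without an error, so its on_failure handler never triggers, and the result simply contains "ip": "-". The exception is only thrown later, when that value hits the ip mapping.

POST _ingest/pipeline/_simulate
{
   "pipeline": {
      "processors": [
         {
            "rename": {
               "field": "ip_tmp",
               "target_field": "ip",
               "on_failure": [
                  { "set": { "field": "ip", "value": "null" } }
               ]
            }
         }
      ]
   },
   "docs": [
      { "_source": { "ip_tmp": "-" } }
   ]
}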

I came up with the following solution:

DELETE /logs

PUT /logs
{
    "mappings": {
        "web": {
            "properties": {
                "ip": {
                    "type": "ip"
                }
            }
        }
    }
}

PUT _ingest/pipeline/ip-cleanup
{
   "description": "IP cleanup",
   "processors": [
      {
         "script": {
            "lang": "painless",
            "inline": "ctx.ip_tmp = (ctx.ip_tmp == '-') ? null : ctx.ip_tmp"
         }
      },
      {
         "rename": {
            "field": "ip_tmp",
            "target_field": "ip"
         }
      }
   ]
}

POST /logs/web?pipeline=ip-cleanup
{
    "ip_tmp": "8.8.8.8"
}

POST /logs/web?pipeline=ip-cleanup
{
    "ip_tmp": "-"
}

GET /logs/web/_search
{
    "query": {
        "match_all": {}
    }
}
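
If you want to check what the pipeline produces without actually indexing anything, you can also run the same two documents through the Simulate API (append ?verbose to see the output of each processor):

POST _ingest/pipeline/ip-cleanup/_simulate
{
   "docs": [
      { "_source": { "ip_tmp": "8.8.8.8" } },
      { "_source": { "ip_tmp": "-" } }
   ]
}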

Daniel

Well, it does not work actually.
I get an exception:

2017/02/18 15:13:35.580963 client.go:432: WARN Can not index event (status=400): {"type":"mapper_parsing_exception","reason":"failed to parse [ip]","caused_by":{"type":"illegal_argument_exception","reason":"'-' is not an IP string literal."}}

But if I change the type of the ip field to keyword, I see this in the imported document:
"ip": "-",

So the value is indeed "-", but for some reason the script does not work.

I also tried changing null in your script to ctx.duration, just to be sure that this part of the code is really executed, and the value of ctx.duration was indeed assigned to the ip field. So null seems to be the root of the problem: it does not work for some reason.

And another question: is it possible to define some action for such cases?
Right now, if an error happens at index time, Elasticsearch throws an exception and Filebeat keeps retrying the same log line forever, so new log lines never reach Elasticsearch.

I want to ignore such errors or, even better, send them to another index for later analysis.

Thanks.

Hi @John16,

did you try my complete example? I verified that it works fine on a default out-of-the-box configuration of Elasticsearch 5.2.0. Did you disable scripting? Did you maybe use the wrong ingest pipeline?

From what I read in the Filebeat docs this is not possible, but it could make sense to ask in the Filebeat forum.
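
On the Elasticsearch side there is one thing you could look into, though it is just a sketch and it only covers failures inside the ingest pipeline, not mapping errors at index time: a pipeline-level on_failure block can reroute a failed document to a different index by setting the _index metadata field. The failed-logs index name below is only a placeholder:

PUT _ingest/pipeline/ip-cleanup
{
   "description": "IP cleanup (reroutes ingest failures to a placeholder failed-logs index)",
   "processors": [
      {
         "script": {
            "lang": "painless",
            "inline": "ctx.ip_tmp = (ctx.ip_tmp == '-') ? null : ctx.ip_tmp"
         }
      },
      {
         "rename": {
            "field": "ip_tmp",
            "target_field": "ip"
         }
      }
   ],
   "on_failure": [
      {
         "set": {
            "field": "_index",
            "value": "failed-logs"
         }
      }
   ]
}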

Daniel

Well, yes. Sorry. I tried again and it does work. I am not sure now why it failed before.
