[SOLVED] The filter plugin mutate-convert doesn't work in 5.0

In my logstash config I have something like this:

        mutate {
            add_field => { "[%{name}][long]" => "%{valueLong}" }
            remove_field => ["valueLong"]
            convert => ["[%{name}][long]", "integer"]
        }

And the new field should be an integer. Everything else works as expected: the new field is added and the old field is deleted, but the new field is always text.

This configuration worked without problems before the new Logstash version 5.0. We want to update to the newer version, but we are stuck on this configuration problem. Does anyone have an idea how to solve this?

Thank you.

You can't assume that add_field, remove_field, and convert will execute in the order given. You have to split your mutate filter into three.
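
To make the ordering issue concrete, here is a small sketch with made-up field names (original / copy): within a single mutate, convert runs before add_field, so converting a field you add in the same block has no effect:

    mutate {
        add_field => { "copy" => "%{original}" }
        convert   => ["copy", "integer"]   # convert runs first, when "copy" doesn't exist yet
    }

Split into separate filters, each step sees the result of the previous one:

    mutate {
        add_field => { "copy" => "%{original}" }
    }
    mutate {
        convert => ["copy", "integer"]
    }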


It still doesn't work. I'll try to explain the problem again:
I have a log file with key-value entries. Something like:

timestamp="18.07.2016 15:48:09,614" id="3" name="duration" type="DURATION" valueType="long" valueLong=79

Using Logstash I write to two indices in Elasticsearch:

  • one that contains the raw data, something like:

    "type" => "DURATION",
    "@timestamp" => 2016-07-18T13:48:09.614Z,
    "valueLong" => 79,
    "valueType" => "long",
    "@version" => "1",
    "name" => "duration",
    "id" => "3"

  • and a second one, we call it the summary index, whose documents look like:

    "type" => "SUMMARY",
    "@timestamp" => 2016-07-18T13:48:09.614Z,
    "@version" => "1",
    "duration" => {
    "valueType" => "long",
    "id" => "3",
    "long" => 79 // this attribute is now inserted as String
    }

If another entry with the same ID is logged, the first index will contain another document, but the second will update the existing one. Normally when this happens the name is different, so another structure is created.

And exactly this functionality worked with Logstash < 5.0 using this code:

   mutate {
        add_field => { "[%{name}][long]" => "%{valueLong}" }
        remove_field => ["valueLong"]
        convert => ["[%{name}][long]", "integer"]
    }

I have now changed it, as you proposed, to ...

        mutate {
            add_field => { "[%{name}][long]" => "%{valueLong}" }
        }
        mutate {
            remove_field => ["valueLong"]
        }
        mutate {
            convert => ["[%{name}][long]", "integer"]
        }

... but no change at all. duration.long is still a text.

valueLong is already an integer field, so I don't know why you're doing the conversion in the first place. When you say

... but no change at all. duration.long is still a text.

are you saying that the field in Elasticsearch (and, by extension, Kibana) is a string field? In that case you have to reindex the current index or create a new index. The existing mapping won't change just because you're starting to submit documents with an integer [duration][long] field.

In my example I wanted to show you what I actually expect in Elasticsearch.

I delete all indices before every test.

I have now tested without the conversion, but there is no change. Even though the valueLong field is an integer, the duration.long field is a string in the second index.

What do the mappings look like? What index templates do you have?

The mappings look like this (I deleted some other attributes that are not important for the discussion). The mappings are dynamically generated.

{
  "logstash-summary" : {
    "mappings" : {
      "SUMMARY" : {
        "_all" : {
          "enabled" : true,
          "norms" : false
        },
        "dynamic_templates" : [
          {
            "message_field" : {
              "path_match" : "message",
              "match_mapping_type" : "string",
              "mapping" : {
                "norms" : false,
                "type" : "text"
              }
            }
          },
          {
            "string_fields" : {
              "match" : "*",
              "match_mapping_type" : "string",
              "mapping" : {
                "fields" : {
                  "keyword" : {
                    "type" : "keyword"
                  }
                },
                "norms" : false,
                "type" : "text"
              }
            }
          }
        ],
        "properties" : {
          "@timestamp" : {
            "type" : "date",
            "include_in_all" : false
          },
          "@version" : {
            "type" : "keyword",
            "include_in_all" : false
          },
          "duration" : {
            "properties" : {
              "id" : {
                "type" : "text",
                "norms" : false,
                "fields" : {
                  "keyword" : {
                    "type" : "keyword"
                  }
                }
              },
              "long" : {
                "type" : "text",
                "norms" : false,
                "fields" : {
                  "keyword" : {
                    "type" : "keyword"
                  }
                }
              },
              "valueType" : {
                "type" : "text",
                "norms" : false,
                "fields" : {
                  "keyword" : {
                    "type" : "keyword"
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "logstash-guardean" : {
    "mappings" : {
      "DURATION" : {
        "_all" : {
          "enabled" : true,
          "norms" : false
        },
        "dynamic_templates" : [
          {
            "message_field" : {
              "path_match" : "message",
              "match_mapping_type" : "string",
              "mapping" : {
                "norms" : false,
                "type" : "text"
              }
            }
          },
          {
            "string_fields" : {
              "match" : "*",
              "match_mapping_type" : "string",
              "mapping" : {
                "fields" : {
                  "keyword" : {
                    "type" : "keyword"
                  }
                },
                "norms" : false,
                "type" : "text"
              }
            }
          }
        ],
        "properties" : {
          "@timestamp" : {
            "type" : "date",
            "include_in_all" : false
          },
          "@version" : {
            "type" : "keyword",
            "include_in_all" : false
          },
          "id" : {
            "type" : "text",
            "norms" : false,
            "fields" : {
              "keyword" : {
                "type" : "keyword"
              }
            }
          },
          "name" : {
            "type" : "text",
            "norms" : false,
            "fields" : {
              "keyword" : {
                "type" : "keyword"
              }
            }
          },
          "type" : {
            "type" : "text",
            "norms" : false,
            "fields" : {
              "keyword" : {
                "type" : "keyword"
              }
            }
          },
          "valueLong" : {
            "type" : "long"
          },
          "valueType" : {
            "type" : "text",
            "norms" : false,
            "fields" : {
              "keyword" : {
                "type" : "keyword"
              }
            }
          }
        }
      }
    }
  }
}

I still think you have an index template that applies to your new indexes every time. As far as I know Elasticsearch won't default to any dynamic templates so the mappings you have with dynamic templates must come from somewhere.
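
For what it's worth, those message_field / string_fields dynamic templates look like the default index template that the elasticsearch output installs itself (it manages a template called "logstash" unless told otherwise), and that template maps every new string field as text plus a keyword sub-field. If you need different mappings for the summary index, you can point the output at your own template, roughly like this (a sketch; the file path and template name are made up):

    elasticsearch {
        index              => "logstash-summary-%{+YYYY.MM.dd}"
        hosts              => "localhost"
        manage_template    => true
        template           => "/etc/logstash/templates/summary.json"   # your own template file
        template_name      => "logstash-summary"
        template_overwrite => true
    }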

I do nothing in Elasticsearch; I suppose that is the job of Logstash. I just have a log file and Logstash with the configuration that I showed you before. What is strange is that I had no problems before 5.0, so I suppose something has changed, but I don't know where.

What is also strange is that I do a mutate convert earlier for valueLong and it works.

I'll copy my entire configuration again:

input {
    file {
        path => "C:/Analytics/analytics.log"
        start_position => "beginning"
    }
}

filter {
    kv { }

    mutate {
        convert => ["valueLong", "integer"]
        convert => ["valueDouble", "float"]
        convert => ["valueBoolean", "boolean"]
    }

    clone {
        clones => ["SUMMARY"]
    }

    if [type] == "SUMMARY" {
        if [valueLong] {
            mutate {
                add_field => { "[%{name}][long]" => "%{valueLong}" }
                remove_field => ["valueLong"]
                convert => ["[%{name}][long]", "integer"]
            }
        }
        if [valueDouble] {
            mutate {
                add_field => { "[%{name}][double]" => "%{valueDouble}" }
                remove_field => ["valueDouble"]
                convert => ["[%{name}][double]", "float"]
            }
        }
        if [valueBoolean] {
            mutate {
                add_field => { "[%{name}][boolean]" => "%{valueBoolean}" }
                remove_field => ["valueBoolean"]
                convert => ["[%{name}][boolean]", "boolean"]
            }
        }
        mutate {
            add_field => { "[%{name}][id]" => "%{id}" }
            add_field => { "[%{name}][valueType]" => "%{valueType}" }
            remove_field => [ "name", "id", "valueType" ]
        }
    }
}

output {
    stdout { codec => rubydebug }
    if [type] == "SUMMARY" {
        elasticsearch {
            document_id => "%{executionID}"
            action => "update"
            doc_as_upsert => true
            index => "logstash-summary-%{+YYYY.MM.dd}"
            hosts => "localhost"
        }
    } else {
        elasticsearch {
            index => "logstash-guardean-%{+YYYY.MM.dd}"
            hosts => "localhost"
        }
    }
}

No other ideas? :frowning:

I simplified the example to make it easy to understand. Everyone can test this really quickly:

Logstash Config:

input {
    file {
        path => "C:/Analytics/analytics.log"
        start_position => "beginning"
    }
}

filter {
    kv { }

    mutate {
        convert => ["test", "integer"]
    }

    clone {
        clones => ["TEST_CLONE"]
    }

    if [type] == "TEST_CLONE" {
        mutate {
            add_field => { "test_clone" => "%{test}" }
            remove_field => ["test"]
            convert => ["test_clone", "integer"]
        }
    }
}

output {
    stdout { codec => rubydebug }
    if [type] == "TEST_CLONE" {
        elasticsearch {
            index => "logstash-test-clone-%{+YYYY.MM.dd}"
            hosts => "localhost"
        }
    } else {
        elasticsearch {
            index => "logstash-test-%{+YYYY.MM.dd}"
            hosts => "localhost"
        }
    }
}

The log file contains just one entry:

test=111

and the result looks like:

{
     "path" => "C:/Analytics/analytics.log",
    "@timestamp" => 2016-11-11T11:52:01.950Z,
    "test" => 111,
    "@version" => "1",
    "host" => "SV-PC-074",
    "message" => "test=111\r"
}
{
    "path" => "C:/Analytics/analytics.log",
    "@timestamp" => 2016-11-11T11:52:01.950Z,
    "test_clone" => "111",
    "@version" => "1",
    "host" => "SV-PC-074",
    "message" => "test=111\r",
    "type" => "TEST_CLONE"
}

As you can see, the test field is a number, but test_clone is a string. I think there is a bug, but I'm open to other ideas.

As I said earlier in the thread, you need to split your mutate filter; add_field and remove_field apply after convert.

It works. Strange that when I tested this before on my complicated configuration it somehow didn't work. I'll try again. Thank you.

Now I have found the problem in my configuration, but not the solution. I'll copy here just the important part of the configuration (the rest is exactly the same):

  if [type] == "TEST_CLONE" {
        mutate {
            add_field => { "[%{name}][long]" => "%{test}" }
        }
        mutate {
            remove_field => ["test"]
        }
        mutate {
            convert => ["[%{name}][long]", "integer"]
        }
    }

The log file now contains the line:

name="test_clone" test=111

And now it doesn't work again :frowning: The problem is %{name}. With a hard-coded field name, it works.
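
For reference, the hard-coded variant that does work here (name is "test_clone" in the log line, so the field is [test_clone][long]) is something like:

    mutate {
        convert => ["[test_clone][long]", "integer"]
    }

Only the sprintf'd %{name} version in the convert is ignored.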

Yes, this is a limitation. For now you'll have to use a ruby filter to convert dynamically named fields.

Do you have an example?

I found it :slight_smile: If someone wants to use the workaround, it looks like this:

 if [type] == "TEST_CLONE" {
        ruby {
            code => 'event.set("[" + event.get("name") + "][long]", event.get("test"))'
        }
        mutate {
            remove_field => ["test"]
        }
        mutate {
            convert => ["[%{name}][long]", "integer"]
        }
    }
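
A small variation on this, in case it's useful: since the ruby filter can convert the value itself, the final mutate convert (which still can't handle %{name}) can be dropped. A sketch:

    if [type] == "TEST_CLONE" {
        ruby {
            # copy the value under the dynamic field name and make it an integer in one step
            code => 'event.set("[" + event.get("name") + "][long]", event.get("test").to_i)'
        }
        mutate {
            remove_field => ["test"]
        }
    }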

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.