[SOLVED] The filter plugin mutate-convert doesn't work in 5.0


(Catalin Tileaga) #1

In my logstash config I have something like this:

        mutate {
            add_field => { "[%{name}][long]" => "%{valueLong}" }
            remove_field => [valueLong]
            convert => ["[%{name}][long]", "integer"]
        }

The new field should be an integer. Everything else works as expected (the new field is added and the old field is deleted), but the new field is always text.

This configuration worked without problems before the new Logstash version 5.0. We want to update to the newer version, but we are stuck on this configuration problem. Does anyone have an idea how to solve this?

Thank you.


(Magnus Bäck) #2

You can't assume that add_field, remove_field, and convert will execute in the order given. You have to split your mutate filter in three.


(Catalin Tileaga) #3

It still doesn't work. I'll try to explain the problem again:
I have a log file with key-value entries. Something like:

timestamp="18.07.2016 15:48:09,614" id="3" name="duration" type="DURATION" valueType="long" valueLong=79

Using logstash I write in two indices in Elastic:

  • one that contains the raw data, something like:

    "type" => "DURATION",
    "@timestamp" => 2016-07-18T13:48:09.614Z,
    "valueLong" => 79,
    "valueType" => "long",
    "@version" => "1",
    "name" => "duration",
    "id" => "3"

  • and a second one, which we call the summary index, where the documents look like:

    "type" => "SUMMARY",
    "@timestamp" => 2016-07-18T13:48:09.614Z,
    "@version" => "1",
    "duration" => {
    "valueType" => "long",
    "id" => "3",
    "long" => 79 // this attribute is now inserted as String
    }

If another entry with the same ID is logged, the first index gets another document, but the second one updates the existing document. Normally the name is different when this happens, so another nested structure is created.

And exactly this functionality worked with Logstash < 5.0 using this code:

   mutate {
        add_field => { "[%{name}][long]" => "%{valueLong}" }
        remove_field => [valueLong]
        convert => ["[%{name}][long]", "integer"]
    }

I have now changed it, as you proposed, to ...

        mutate {
            add_field => { "[%{name}][long]" => "%{valueLong}" }
        }
        mutate {
            remove_field => [valueLong]
        }
        mutate {
            convert => ["[%{name}][long]", "integer"]
        }

... but no change at all. duration.long is still text.


(Magnus Bäck) #4

valueLong is already an integer field, so I don't know why you're doing the conversion in the first place. When you say

... but no change at all. duration.long is still text.

are you saying that the field in Elasticsearch (and, by extension, Kibana) is a string field? In that case you have to reindex the current index or create a new index. The existing mapping won't change just because you're starting to submit documents with an integer [duration][long] field.


(Catalin Tileaga) #5

In my example I wanted to show you what I actually expect in Elasticsearch.

I delete all indexes before every test.

I have now tested without the conversion, but there is no change. Even though the valueLong field is an integer, the duration.long field is a string in the second index.


(Magnus Bäck) #6

What do the mappings look like? What index templates do you have?


(Catalin Tileaga) #7

The mappings look like this (I deleted some other attributes that are not important for the discussion). The mappings are dynamically generated.

{
  "logstash-summary" : {
    "mappings" : {
      "SUMMARY" : {
        "_all" : {
          "enabled" : true,
          "norms" : false
        },
        "dynamic_templates" : [
          {
            "message_field" : {
              "path_match" : "message",
              "match_mapping_type" : "string",
              "mapping" : {
                "norms" : false,
                "type" : "text"
              }
            }
          },
          {
            "string_fields" : {
              "match" : "*",
              "match_mapping_type" : "string",
              "mapping" : {
                "fields" : {
                  "keyword" : {
                    "type" : "keyword"
                  }
                },
                "norms" : false,
                "type" : "text"
              }
            }
          }
        ],
        "properties" : {
          "@timestamp" : {
            "type" : "date",
            "include_in_all" : false
          },
          "@version" : {
            "type" : "keyword",
            "include_in_all" : false
          },
          "duration" : {
            "properties" : {
              "id" : {
                "type" : "text",
                "norms" : false,
                "fields" : {
                  "keyword" : {
                    "type" : "keyword"
                  }
                }
              },
              "long" : {
                "type" : "text",
                "norms" : false,
                "fields" : {
                  "keyword" : {
                    "type" : "keyword"
                  }
                }
              },
              "valueType" : {
                "type" : "text",
                "norms" : false,
                "fields" : {
                  "keyword" : {
                    "type" : "keyword"
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "logstash-guardean" : {
    "mappings" : {
      "DURATION" : {
        "_all" : {
          "enabled" : true,
          "norms" : false
        },
        "dynamic_templates" : [
          {
            "message_field" : {
              "path_match" : "message",
              "match_mapping_type" : "string",
              "mapping" : {
                "norms" : false,
                "type" : "text"
              }
            }
          },
          {
            "string_fields" : {
              "match" : "*",
              "match_mapping_type" : "string",
              "mapping" : {
                "fields" : {
                  "keyword" : {
                    "type" : "keyword"
                  }
                },
                "norms" : false,
                "type" : "text"
              }
            }
          }
        ],
        "properties" : {
          "@timestamp" : {
            "type" : "date",
            "include_in_all" : false
          },
          "@version" : {
            "type" : "keyword",
            "include_in_all" : false
          },
          "id" : {
            "type" : "text",
            "norms" : false,
            "fields" : {
              "keyword" : {
                "type" : "keyword"
              }
            }
          },
          "name" : {
            "type" : "text",
            "norms" : false,
            "fields" : {
              "keyword" : {
                "type" : "keyword"
              }
            }
          },
          "type" : {
            "type" : "text",
            "norms" : false,
            "fields" : {
              "keyword" : {
                "type" : "keyword"
              }
            }
          },
          "valueLong" : {
            "type" : "long"
          },
          "valueType" : {
            "type" : "text",
            "norms" : false,
            "fields" : {
              "keyword" : {
                "type" : "keyword"
              }
            }
          }
        }
      }
    }
  }
}


(Magnus Bäck) #8

I still think you have an index template that applies to your new indexes every time. As far as I know Elasticsearch won't default to any dynamic templates so the mappings you have with dynamic templates must come from somewhere.
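
For what it's worth, the elasticsearch output in Logstash 5.x normally installs a default index template (named "logstash") when it starts, and that template contains exactly the message_field and string_fields dynamic templates shown above, which map every new string field to text plus a keyword sub-field. If you want different mappings for the summary index, you can point the output at your own template; a rough, untested sketch (the file path and template name are only placeholders):

    elasticsearch {
        hosts => "localhost"
        index => "logstash-summary-%{+YYYY.MM.dd}"
        # manage_template, template, template_name and template_overwrite are
        # existing options of the elasticsearch output; the path is only an example
        manage_template => true
        template => "C:/Analytics/summary-template.json"
        template_name => "logstash-summary"
        template_overwrite => true
    }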


(Catalin Tileaga) #9

I do nothing in Elasticsearch; I suppose that is the job of Logstash. I just have a log file and Logstash with the configuration that I showed you before. What is strange is that I had no problems before 5.0, so I suppose something has changed, but I don't know where.

What is also strange is that I do a mutate-convert earlier for valueLong, and that works.

I'll copy my entire configuration again:

input {
    file {
        path => "C:/Analytics/analytics.log"
        start_position => "beginning"
    }
}

filter {
    kv { }

    mutate {
        convert => ["valueLong", "integer"]
        convert => ["valueDouble", "float"]
        convert => ["valueBoolean", "boolean"]
    }

    clone {
        clones => ["SUMMARY"]
    }

    if [type] == "SUMMARY" {
        if [valueLong] {
            mutate {
                add_field => { "[%{name}][long]" => "%{valueLong}" }
                remove_field => [valueLong]
                convert => ["[%{name}][long]", "integer"]
            }
        }
        if [valueDouble] {
            mutate {
                add_field => { "[%{name}][double]" => "%{valueDouble}" }
                remove_field => [valueDouble]
                convert => ["[%{name}][double]", "float"]
            }
        }
        if [valueBoolean] {
            mutate {
                add_field => { "[%{name}][boolean]" => "%{valueBoolean}" }
                remove_field => [valueBoolean]
                convert => ["[%{name}][boolean]", "boolean"]
            }
        }
        mutate {
            add_field => { "[%{name}][id]" => "%{id}" }
            add_field => { "[%{name}][valueType]" => "%{valueType}" }
            remove_field => [ "name", "id", "valueType" ]
        }
    }
}

output {
    stdout { codec => rubydebug }
    if [type] == "SUMMARY" {
        elasticsearch {
            document_id => "%{executionID}"
            action => "update"
            doc_as_upsert => true
            index => "logstash-summary-%{+YYYY.MM.dd}"
            hosts => localhost
        }
    } else {
        elasticsearch {
            index => "logstash-guardean-%{+YYYY.MM.dd}"
            hosts => localhost
        }
    }
}

(Catalin Tileaga) #10

No other ideas? :frowning:


(Catalin Tileaga) #11

I simplified the example to make it easier to understand. Everyone can test this really quickly:

Logstash Config:

input {
    file {
        path => "C:/Analytics/analytics.log"
        start_position => "beginning"
    }
}

filter {
    kv { }

    mutate {
        convert => ["test", "integer"]
    }

    clone {
        clones => "TEST_CLONE"
    }

    if [type] == "TEST_CLONE" {
        mutate {
            add_field => { "test_clone" => "%{test}" }
            remove_field => [test]
            convert => ["test_clone", "integer"]
        }
    }
}

output {
    stdout { codec => rubydebug }
    if [type] == "TEST_CLONE" {
        elasticsearch {
            index => "logstash-test-clone-%{+YYYY.MM.dd}"
            hosts => localhost
        }
    } else {
        elasticsearch {
            index => "logstash-test-%{+YYYY.MM.dd}"
            hosts => localhost
        }
    }
}

The log file contains just one entry:

test=111

and the result looks like:

{
     "path" => "C:/Analytics/analytics.log",
    "@timestamp" => 2016-11-11T11:52:01.950Z,
    "test" => 111,
    "@version" => "1",
    "host" => "SV-PC-074",
    "message" => "test=111\r"
}
{
    "path" => "C:/Analytics/analytics.log",
    "@timestamp" => 2016-11-11T11:52:01.950Z,
    "test_clone" => "111",
    "@version" => "1",
    "host" => "SV-PC-074",
    "message" => "test=111\r",
    "type" => "TEST_CLONE"
}

As you can see, the test field is a number, but test_clone is a string. I think there is a bug, but I'm open to other ideas.


(Magnus Bäck) #12

As I said earlier in the thread, you need to split your mutate filter; add_field and remove_field apply after convert.
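
Applied to the simplified example above, the split would look roughly like this (untested sketch):

    if [type] == "TEST_CLONE" {
        mutate {
            add_field => { "test_clone" => "%{test}" }
        }
        mutate {
            remove_field => ["test"]
        }
        mutate {
            convert => ["test_clone", "integer"]
        }
    }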


(Catalin Tileaga) #13

It works. Strange that I tested this before on my complicated configuration and somehow it didn't work. I'll try again. Thank you.


(Catalin Tileaga) #14

Now I have found the problem in my configuration, but not the solution. I'll copy here just the important part of the configuration (the rest is exactly the same):

  if [type] == "TEST_CLONE" {
        mutate {
            add_field => { "[%{name}][long]" => "%{test}" }
        }
        mutate {
            remove_field => [test]
        }
        mutate {
            convert => ["[%{name}][long]", "integer"]
        }
    }

The log file now contains the line:

name="test_clone" test=111

And now it doesn't work again :frowning: The problem is %{name}. With a hard-coded field name, it works.


(Magnus Bäck) #15

Yes, this is a limitation. For now you'll have to use a ruby filter to convert dynamically named fields.


(Catalin Tileaga) #16

Do you have an example?


(Catalin Tileaga) #17

I found it :slight_smile: If someone wants to use the workaround, it looks like this:

 if [type] == "TEST_CLONE" {
        ruby {
            code => 'event.set("[" + event.get("name") + "][long]", event.get("test"))'
        }
        mutate {
            remove_field => [test]
        }
        mutate {
            convert => ["[%{name}][long]", "integer"]
        }
    }
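
As far as I understand, this works because the earlier mutate has already converted test to an integer and event.set keeps that type, so the final convert on the %{name} field is probably redundant (given the limitation above it is presumably a no-op anyway). If the source field might still be a string, the conversion can also be done inside the ruby filter itself, something like this (untested):

    ruby {
        # copy the value and coerce it to an integer in one step
        code => 'event.set("[" + event.get("name") + "][long]", event.get("test").to_i)'
    }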

(system) #18

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.