Kv filter has no support for this type of data

Hi,

I've got the following setup: Netflow -> remote logstash -> file -> file transfer -> server logstash -> Elasticsearch.

On the server logstash I'm trying to parse the netflow file, but the log outputs kv filter has no support for this type of data {:type=>Hash, :value=>{"output_snmp"=>3, "forwarding_status"=>{"reason"=>0, "status"=>1}, "in_pkts"=>1, "ipv4_dst_addr"=>"10.0.10.10", "first_switched"=>"2017-11-17T20:07:07.999Z", "flowset_id"=>257, "l4_src_port"=>1030, "version"=>9, "application_id"=>"0:0", "flow_seq_num"=>12691, "ipv4_src_addr"=>"10.0.0.111", "in_bytes"=>92, "protocol"=>17, "flow_end_reason"=>2, "last_switched"=>"2017-11-18T13:51:07.999Z", "input_snmp"=>0, "out_pkts"=>1, "out_bytes"=>64, "l4_dst_port"=>8888}}. Stdout doesn't look totally correct either.

Netflow file

{"@version":"1","host":"172.16.10.111","netflow":{"output_snmp":3,"forwarding_status":{"reason":0,"status":1},"in_pkts":1,"ipv4_dst_addr":"10.0.10.10","first_switched":"2017-11-17T20:07:07.999Z","flowset_id":257,"l4_src_port":1030,"version":9,"application_id":"0:0","flow_seq_num":12691,"ipv4_src_addr":"10.0.0.111","in_bytes":92,"protocol":17,"flow_end_reason":2,"last_switched":"2017-11-18T13:51:07.999Z","input_snmp":0,"out_pkts":1,"out_bytes":64,"l4_dst_port":8888},"@timestamp":"2017-11-15T02:19:26.000Z","type":"netflow","tags":["Test"]}

server config file

input {
  file {
    path => "/home/test3.txt"
    sincedb_path => "/dev/null"
    start_position => "beginning"
    codec => json
  }
}

filter {
  kv {
    source => "netflow"
    value_split => ":"
    field_split => ","
  }
}

output {
  stdout {
    codec => rubydebug
  }
}

stdout

{
          "path" => "/home/test3.txt",
       "netflow" => {
              "output_snmp" => 3,
        "forwarding_status" => {
            "reason" => 0,
            "status" => 1
        },
                  "in_pkts" => 1,
            "ipv4_dst_addr" => "10.0.10.10",
           "first_switched" => "2017-11-17T20:07:07.999Z",
               "flowset_id" => 257,
              "l4_src_port" => 1030,
                  "version" => 9,
           "application_id" => "0:0",
             "flow_seq_num" => 12691,
            "ipv4_src_addr" => "10.0.0.111",
                 "in_bytes" => 92,
                 "protocol" => 17,
          "flow_end_reason" => 2,
            "last_switched" => "2017-11-18T13:51:07.999Z",
               "input_snmp" => 0,
                 "out_pkts" => 1,
                "out_bytes" => 64,
              "l4_dst_port" => 8888
    },
    "@timestamp" => 2017-11-15T02:19:26.000Z,
      "@version" => "1",
          "host" => "172.16.10.111",
          "type" => "netflow",
          "tags" => [
        [0] "Test"
    ]
}

How can I make logstash parse the file correctly?

What exactly were you expecting? Could you elaborate?

Also, I doubt KV will work directly on this type of input. You might have to combine grok with it!
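For illustration only, a hypothetical sketch (not the actual data in this thread): grok plus kv makes sense when the payload arrives as a delimited string, e.g. a raw line like netflow l4_src_port:1030,l4_dst_port:8888. Grok would isolate the key/value portion, then kv would split it:

filter {
  # hypothetical: assumes the event's message is a raw string,
  # not JSON that a codec has already parsed
  grok {
    match => { "message" => "%{WORD:type} %{GREEDYDATA:kvpairs}" }
  }
  kv {
    source => "kvpairs"    # the string captured by grok above
    value_split => ":"
    field_split => ","
  }
}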

For the logs to not produce errors 😜

I was under the impression that the KV filter would just divide the fields and not otherwise care about the data inside them. That's why I'm not sure why logstash is generating those errors.

Your data is already divided into fields: the json codec parsed each line into a structured event, and the kv filter only works on strings (hence the complaint about getting a Hash). You don't need a kv filter.
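In other words, since the json codec already deserializes each line into structured fields, a minimal sketch with no filter block at all (same path and codec as the config above) should produce the same structured event:

input {
  file {
    path => "/home/test3.txt"
    sincedb_path => "/dev/null"
    start_position => "beginning"
    codec => json    # parses each JSON line into event fields
  }
}

output {
  stdout {
    codec => rubydebug
  }
}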

Magnus,

Do you know why I can't delete any fields under netflow with the test data from the first post and this config? Deleting the logstash fields (host etc.) works fine, but for example the version field is not removed.

input {
  file {
    path => "/home/test/Desktop/test/netflow/test3.txt"
    sincedb_path => "/dev/null"
    start_position => "beginning"
    codec => json
  }
}


filter {
  mutate {
    remove_field => ["host", "version"]
  }
}

Your events don't have a version field at the top level of the event.

You do however have a version subfield of netflow and the correct way of addressing that field is [netflow][version].

https://www.elastic.co/guide/en/logstash/current/event-dependent-configuration.html#logstash-config-field-references
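For example, applied to the filter above (a sketch; field names taken from the sample event):

filter {
  mutate {
    # [netflow][version] addresses the version subfield of netflow
    remove_field => ["host", "[netflow][version]"]
  }
}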

You make it look so easy.

Now I understand nested fields and how to work with them.

Are there any up/downsides to working with nested fields? I did a quick search and it appears that nested fields are slightly faster when searching but harder to deal with if the data changes afterwards?

In my case the data will never change after putting it in Elastic, but I will require a lot of searches.

Is there an easy way to "unnest" fields?

Are there any up/downsides to working with nested fields? I did a quick search and it appears that nested fields are slightly faster when searching but harder to deal with if the data changes afterwards?

Where did you read this?

Is there an easy way to "unnest" fields?

You can use a mutate filter to move (rename) them to the top level.
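A sketch, using a couple of the netflow subfields from the sample event above (extend the hash with one entry per field you want to flatten):

filter {
  mutate {
    # move nested subfields to the top level of the event
    rename => {
      "[netflow][in_bytes]" => "in_bytes"
      "[netflow][protocol]" => "protocol"
    }
  }
}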

It sounds like you may be mixing up nested fields with the nested datatype, which is used with nested documents. There is no performance penalty to using nested fields.

@magnusbaeck: I think I misread, it was about nested fields and parent/child relations.

@Christian_Dahlqvist: Does it make any difference whether I use "normal" or nested fields? The data is easier for me to read without nested fields, and it saves a bit of data in transmission by not having an additional field (some data will be sent over very low-bandwidth lines, so every byte I can save counts), but ultimately search performance will be very important.
