Kv filter has no support for this type of data

Hi,

I've got the following setup: Netflow -> remote logstash -> file -> file transfer -> server logstash -> Elasticsearch.

On the server logstash I'm trying to parse the netflow file, but the log outputs kv filter has no support for this type of data {:type=>Hash, :value=>{"output_snmp"=>3, "forwarding_status"=>{"reason"=>0, "status"=>1}, "in_pkts"=>1, "ipv4_dst_addr"=>"10.0.10.10", "first_switched"=>"2017-11-17T20:07:07.999Z", "flowset_id"=>257, "l4_src_port"=>1030, "version"=>9, "application_id"=>"0:0", "flow_seq_num"=>12691, "ipv4_src_addr"=>"10.0.0.111", "in_bytes"=>92, "protocol"=>17, "flow_end_reason"=>2, "last_switched"=>"2017-11-18T13:51:07.999Z", "input_snmp"=>0, "out_pkts"=>1, "out_bytes"=>64, "l4_dst_port"=>8888}}. Stdout doesn't look totally correct either.

Netflow file

{"@version":"1","host":"172.16.10.111","netflow":{"output_snmp":3,"forwarding_status":{"reason":0,"status":1},"in_pkts":1,"ipv4_dst_addr":"10.0.10.10","first_switched":"2017-11-17T20:07:07.999Z","flowset_id":257,"l4_src_port":1030,"version":9,"application_id":"0:0","flow_seq_num":12691,"ipv4_src_addr":"10.0.0.111","in_bytes":92,"protocol":17,"flow_end_reason":2,"last_switched":"2017-11-18T13:51:07.999Z","input_snmp":0,"out_pkts":1,"out_bytes":64,"l4_dst_port":8888},"@timestamp":"2017-11-15T02:19:26.000Z","type":"netflow","tags":["Test"]}

server config file

input {
  file {
    path => "/home/test3.txt"
    sincedb_path => "/dev/null"
    start_position => "beginning"
    codec => json
  }
}

filter {
  kv {
    source => "netflow"
    value_split => ":"
    field_split => ","
  }
}

output {
  stdout {
    codec => rubydebug
  }
}

stdout

{
          "path" => "/home/test3.txt",
       "netflow" => {
              "output_snmp" => 3,
        "forwarding_status" => {
            "reason" => 0,
            "status" => 1
        },
                  "in_pkts" => 1,
            "ipv4_dst_addr" => "10.0.10.10",
           "first_switched" => "2017-11-17T20:07:07.999Z",
               "flowset_id" => 257,
              "l4_src_port" => 1030,
                  "version" => 9,
           "application_id" => "0:0",
             "flow_seq_num" => 12691,
            "ipv4_src_addr" => "10.0.0.111",
                 "in_bytes" => 92,
                 "protocol" => 17,
          "flow_end_reason" => 2,
            "last_switched" => "2017-11-18T13:51:07.999Z",
               "input_snmp" => 0,
                 "out_pkts" => 1,
                "out_bytes" => 64,
              "l4_dst_port" => 8888
    },
    "@timestamp" => 2017-11-15T02:19:26.000Z,
      "@version" => "1",
          "host" => "172.16.10.111",
          "type" => "netflow",
          "tags" => [
        [0] "Test"
    ]
}

How can I make logstash parse the file correctly?

What exactly were you expecting? Could you elaborate?

Also, I doubt KV will work directly on this type of input. You might have to combine grok with it!
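For illustration only, a hypothetical sketch (not the actual data in this thread): grok plus kv makes sense when the payload arrives as a delimited string, e.g. a raw line like netflow l4_src_port:1030,l4_dst_port:8888. Grok would isolate the key/value portion, then kv would split it:

filter {
  # hypothetical: assumes the event's message is a raw string,
  # not JSON that a codec has already parsed
  grok {
    match => { "message" => "%{WORD:type} %{GREEDYDATA:kvpairs}" }
  }
  kv {
    source => "kvpairs"    # the string captured by grok above
    value_split => ":"
    field_split => ","
  }
}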

For the logs to not produce errors 😜

I was under the impression that the KV filter would just divide the fields and not otherwise care about the data inside them. That's why I'm not sure why logstash is generating those errors.

Your data is already divided into fields: the json codec parsed each line into a structured event, and the kv filter only works on strings (hence the complaint about getting a Hash). You don't need a kv filter.
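In other words, since the json codec already deserializes each line into structured fields, a minimal sketch with no filter block at all (same path and codec as the config above) should produce the same structured event:

input {
  file {
    path => "/home/test3.txt"
    sincedb_path => "/dev/null"
    start_position => "beginning"
    codec => json    # parses each JSON line into event fields
  }
}

output {
  stdout {
    codec => rubydebug
  }
}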

Magnus,

Do you know why I can't delete any fields under netflow with the test data from the first post and this config? Deleting the logstash fields (host etc.) works fine, but for example the version field is not removed.

input {
  file {
    path => "/home/test/Desktop/test/netflow/test3.txt"
    sincedb_path => "/dev/null"
    start_position => "beginning"
    codec => json
  }
}


filter {
  mutate {
    remove_field => ["host", "version"]
  }
}

Your events don't have a version field at the top level of the event.

You do however have a version subfield of netflow and the correct way of addressing that field is [netflow][version].

https://www.elastic.co/guide/en/logstash/current/event-dependent-configuration.html#logstash-config-field-references
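For example, applied to the filter above (a sketch; field names taken from the sample event):

filter {
  mutate {
    # [netflow][version] addresses the version subfield of netflow
    remove_field => ["host", "[netflow][version]"]
  }
}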

You make it look so easy.

Now I understand nested fields and how to work with them.

Are there any up/downsides to working with nested fields? I did a quick search and it appears that nested fields are slightly faster when searching but harder to deal with if the data changes afterwards?

In my case the data will never change after putting it in Elastic, but I will require a lot of searches.

Is there an easy way to "unnest" fields?

Are there any up/downsides to working with nested fields? I did a quick search and it appears that nested fields are slightly faster when searching but harder to deal with if the data changes afterwards?

Where did you read this?

Is there an easy way to "unnest" fields?

You can use a mutate filter to move (rename) them to the top level.
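A sketch, using a couple of the netflow subfields from the sample event above (extend the hash with one entry per field you want to flatten):

filter {
  mutate {
    # move nested subfields to the top level of the event
    rename => {
      "[netflow][in_bytes]" => "in_bytes"
      "[netflow][protocol]" => "protocol"
    }
  }
}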

It sounds like you may be mixing up nested fields with the nested datatype, which is used with nested documents. There is no performance penalty to using nested fields.

@magnusbaeck: I think I misread, it was about nested fields and parent/child relations.

@Christian_Dahlqvist: Does it make any difference whether I use "normal" or nested fields? The data is easier for me to read without nested fields, and it saves a bit of data in transmission by not having an additional field (some data will be sent over very low-bandwidth lines, so every byte I can save counts), but ultimately search performance will be very important.
