How to modify output codec for logstash

I'm not expecting a step by step on this.

But here is the backstory and what I hope to achieve.

I have Logstash decoding NetFlow and writing it out with the json_lines codec.
This is working well, but the output is very verbose and too large to store.

How can I modify which fields are put out in this file? I'm hoping to pare it down to just a few fields.

Secondly, is it possible to convert it to plain text? When I use plain I get a timestamp and %message

I appreciate any direction on this.

You might find a prune filter helpful for removing top-level fields (you can either whitelist fields to keep or blacklist fields to remove). mutate can also remove fields, and in the worst case you can resort to ruby.
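As a rough sketch of the mutate route, assuming the netflow codec puts its fields under [netflow] (its default target) and that you only want to drop a few of them rather than whitelist. The field names below are only illustrative; substitute whatever your events actually contain.

    filter {
      mutate {
        # remove_field accepts nested field references, unlike prune,
        # which only looks at top-level field names
        remove_field => [
          "[netflow][flowset_id]",
          "[netflow][version]",
          "[netflow][input_snmp]"
        ]
      }
    }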

What do you mean by plain text? A plain codec by default will emit the timestamp, hostname, and contents of [message]. You can tell it to use a different format. If you really only do want a handful of fields you could supply the list of fields in the format option of the codec and not bother pruning the rest.

codec => plain { format => "foo is %{foo}. bar is %{bar}" }
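If the goal is a plain-text file, something along these lines may be closer to what you want. This is only a sketch: it assumes the fields live under [netflow], the path is a placeholder, and it uses the line codec instead of plain because line appends a newline after each event.

    output {
      file {
        # hypothetical path, pick your own
        path => "/tmp/netflow.txt"
        # one line per flow: timestamp, source, destination, byte count
        codec => line {
          format => "%{@timestamp} %{[netflow][ipv4_src_addr]} -> %{[netflow][ipv4_dst_addr]} %{[netflow][in_bytes]}"
        }
      }
    }

Any field referenced in the format string that is missing from an event is written out literally as %{...}, which is an easy way to spot a wrong field name.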

Hey,

Okay. So the prune filter definitely looks like what I want but I'm having a hard time getting what I'm looking for.

My flow logs look like this:

{"host":"1.1.1.1","@timestamp":"2019-09-27T20:09:48.000Z","@version":"1","netflow":{"ipv4_dst_addr":"2.2.2.2","src_as":16509,"ipv4_src_addr":"3.3.3.3","dst_as":0,"in_bytes":84,"first_switched":"2019-09-27T20:04:48.832Z","last_switched":"2019-09-27T20:04:48.832Z","input_snmp":95,"in_pkts":1,"flow_seq_num":206430370,"l4_dst_port":2048,"flowset_id":256,"version":9,"protocol":1,"l4_src_port":0,"ipv4_next_hop":"4.4.4.4"}}

And my config is this

input {
  udp {
    port  => 9995
    codec => netflow
  }
}
filter {
  mutate {
    copy => { "[netflow][ipv4_dst_addr][ipv4_src_addr][low_seq_num]" => "what_i_want" }
  }
  prune {
    whitelist_names => [ "what_i_want" ]
  }
}

output {
  s3 {
    access_key_id => "X"
    secret_access_key => "X"
    bucket => "1logflowtest"
    codec => "json_lines"
    id => "NetflowV9"
    #encoding => "gzip"
    size_file => "1024000000"
    time_file => "60"
  }
  file {
    path => "/var/log/logstash/test.log"
    codec => "json_lines"
  }
}

and the output is {} heh.

I'm looking to cherry-pick fields. I'd like to start with just source and destination and go from there.

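A field reference names exactly one field, so "[netflow][ipv4_dst_addr][ipv4_src_addr][low_seq_num]" points at a nested field that does not exist (and flow_seq_num is misspelled). Nothing gets copied, and prune then removes everything else, which is why you get {}. Copy each field separately. Your filter should look more like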

    mutate {
        copy => {
             "[netflow][ipv4_dst_addr]" => "[ipv4_dst_addr]"
             "[netflow][ipv4_src_addr]" => "[ipv4_src_addr]"
             "[netflow][flow_seq_num]" => "[flow_seq_num]"
        }
    }
    prune {
        whitelist_names => [ "flow_seq_num", "ipv4_dst_addr", "ipv4_src_addr", "@timestamp" ]
    }
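One way to check a filter like this without waiting for real flows is a throwaway test pipeline. The sketch below is an assumption about how you might do it, not part of your setup: it feeds one hand-written event (values borrowed from your sample) through the same filter and prints the result.

    input {
      generator {
        count => 1
        # a single fake event, decoded from JSON
        lines => [ '{"netflow":{"ipv4_dst_addr":"2.2.2.2","ipv4_src_addr":"3.3.3.3","flow_seq_num":206430370,"in_bytes":84}}' ]
        codec => json
      }
    }
    filter {
      mutate {
        copy => {
          "[netflow][ipv4_dst_addr]" => "[ipv4_dst_addr]"
          "[netflow][ipv4_src_addr]" => "[ipv4_src_addr]"
          "[netflow][flow_seq_num]" => "[flow_seq_num]"
        }
      }
      prune {
        whitelist_names => [ "flow_seq_num", "ipv4_dst_addr", "ipv4_src_addr", "@timestamp" ]
      }
    }
    output {
      stdout { codec => rubydebug }
    }

Run it with bin/logstash -f pointing at the file. If the filter is right, the printed event should contain only the whitelisted fields. Note that the whitelist_names entries are regular expressions matched against top-level field names, so anchor them (for example "^ipv4_dst_addr$") if a looser match would keep fields you do not want.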

Thank you, @Badger
I was able to build on what you posted to fine tune something that was relevant.

One last question, to avoid making a new thread: is there any way to have the full output sent off to another destination, or would that require a separate conf file?

Not sure I understand the question. You can have multiple outputs in a configuration. But you clearly know that, since you have two outputs in your existing configuration.

Sorry, let me clarify.

I would like one output to be unaffected by the prune.

You can do that using pipeline-to-pipeline communication with the forked path pattern.
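A minimal sketch of that layout in pipelines.yml, assuming one intake pipeline that fans out to two downstream pipelines. The pipeline ids, the virtual addresses ("netflow-full", "netflow-pruned") and the file paths are made up for illustration; swap in your real outputs.

    # config/pipelines.yml
    - pipeline.id: netflow-intake
      config.string: |
        input { udp { port => 9995 codec => netflow } }
        # every event is sent to both downstream pipelines
        output { pipeline { send_to => ["netflow-full", "netflow-pruned"] } }
    - pipeline.id: netflow-full
      config.string: |
        input { pipeline { address => "netflow-full" } }
        # no filter here, so this output sees the untouched event
        output { file { path => "/var/log/logstash/full.log" codec => "json_lines" } }
    - pipeline.id: netflow-pruned
      config.string: |
        input { pipeline { address => "netflow-pruned" } }
        filter {
          mutate {
            copy => {
              "[netflow][ipv4_dst_addr]" => "[ipv4_dst_addr]"
              "[netflow][ipv4_src_addr]" => "[ipv4_src_addr]"
            }
          }
          prune { whitelist_names => [ "^ipv4_dst_addr$", "^ipv4_src_addr$", "^@timestamp$" ] }
        }
        output { file { path => "/var/log/logstash/test.log" codec => "json_lines" } }

Each downstream pipeline gets its own copy of the event, so the prune in one branch cannot affect the other.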

If you are running on an old version you can do it by using a clone filter, then making the prune and output conditional upon the type set by the clone.
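Roughly, the clone approach could look like the sketch below. It assumes the clone filter sets [type] to the clone name, which is what it does when ECS compatibility is disabled; with ECS compatibility enabled the name is added to [tags] instead, so adjust the conditionals to match your version. The clone name "pruned" and the file paths are placeholders.

    filter {
      # emits a second copy of every event, marked with type "pruned"
      clone { clones => [ "pruned" ] }

      if [type] == "pruned" {
        mutate {
          copy => {
            "[netflow][ipv4_dst_addr]" => "[ipv4_dst_addr]"
            "[netflow][ipv4_src_addr]" => "[ipv4_src_addr]"
          }
        }
        prune {
          # keep "type" too, otherwise the output conditional below
          # can no longer tell the two copies apart
          whitelist_names => [ "^ipv4_dst_addr$", "^ipv4_src_addr$", "^@timestamp$", "^type$" ]
        }
      }
    }

    output {
      if [type] == "pruned" {
        file { path => "/var/log/logstash/test.log" codec => "json_lines" }
      } else {
        # the original event, untouched by prune
        file { path => "/var/log/logstash/full.log" codec => "json_lines" }
      }
    }

The trade-off is that the pruned copy carries an extra type field; if that matters, pipeline-to-pipeline avoids it.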