Looking for a way to prune fields with certain text in them but struggling with the prune filter and regex


#1

I'm pushing logstash node stats into Elasticsearch with the following pipeline setup:

input {
  exec {
    command => "curl -s http://localhost:9600/_node/stats"
    interval => 15
  }
}

filter {
  json {
    source => "message"
  }

  prune {
    whitelist_names => [ "events", "host", "command", "id", "jvm.uptime_in_millis", "process", "timestamp", "@timestamp", ".*pipelines.*.events.*" ]
  }

  mutate {
    add_field => { "type" => "logstash-stats" }
    rename => { "host" => "logstash_host" }
  }
}

However, I'm struggling with the prune part. Specifically, I'm trying to get rid of the 'plugins', 'reloads', and 'queue' nested fields underneath each pipeline object in the stats, leaving behind only the top-level 'events' section and the 'events' field from each pipeline object.

There is a top-level 'events' field that I'd like to keep, plus the 'events' field and its nested values for each pipeline under 'pipelines'. Here is a sample JSON output that I'm working with, annotated to show what I want to keep versus throw away.

I did some research and the prune plugin seemed ideal. I've tried various regex values in the whitelist_names setting, but with no success. I can easily keep the top-level events part, but every time I try to keep the pipelines.pipeline_name.events fields I also end up with the other fields, like pipelines.pipeline_name.plugins/filters/etc, that I do not want.

Any idea how I can craft this to achieve what I'm looking for?

Thanks!


#2

It's not pretty, but I've managed to work around this (albeit with a bit of hardcoding of field names which I was hoping to avoid), like this:

input {
  exec {
    command => "curl -s http://localhost:9600/_node/stats"
    interval => 15
  }
}

filter {
  json {
    source => "message"
  }

  mutate {
    copy => {
      "[pipelines][ingest_gg_app_logs][events]" => "ingest_gg_app_logs_events"
      "[pipelines][ingest_kubernetes_logs][events]" => "ingest_kubernetes_logs_events"
      "[pipelines][ingest_logstash_redis_logs][events]" => "ingest_logstash_redis_logs_events"
      "[pipelines][ingest_miscellaneous_logs][events]" => "ingest_miscellaneous_logs_events"
      "[pipelines][ingest_nginx_logs][events]" => "ingest_nginx_logs_events"
      "[pipelines][ingest_redis_clones][events]" => "ingest_redis_clones_events"
      "[pipelines][ingest_sqs_logs][events]" => "ingest_sqs_logs_events"
      "[pipelines][node_stats_logs][events]" => "node_stats_logs_events"
    }
  }

  prune {
    whitelist_names => [ "events", "host", "id", "jvm.uptime_in_millis", "token", "timestamp", "@timestamp", "ingest_gg_app_logs_events", "ingest_kubernetes_logs_events", "ingest_logstash_redis_logs_events", "ingest_miscellaneous_logs_events", "ingest_nginx_logs_events", "ingest_redis_clones_events", "ingest_sqs_logs_events", "node_stats_logs_events" ]
  }

  mutate {
    add_field => { "token" => "localdev123" }
    add_field => { "type" => "logstash-stats" }
    rename => { "host" => "logstash_host" }
  }

}

I found the copy workaround in this GitHub issue on the prune filter: https://github.com/logstash-plugins/logstash-filter-prune/issues/12#issuecomment-434454754
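
For anyone who wants to avoid hardcoding the pipeline names: the prune filter documentation states that it only operates on top-level fields, which would explain why no nested regex in whitelist_names ever matched. A ruby filter could probably do the copy step generically. The following is an untested sketch rather than something taken from the GitHub issue. It assumes the parsed stats contain a top-level pipelines hash whose values each carry an events hash, and the "<pipeline_name>_events" naming and the "_events$" whitelist pattern are my own choices that simply mirror the field names used above.

```
filter {
  json {
    source => "message"
  }

  # Sketch only: copy each pipeline's "events" hash to a top-level
  # "<pipeline_name>_events" field without listing the pipeline names.
  ruby {
    code => '
      pipelines = event.get("pipelines")
      if pipelines.is_a?(Hash)
        pipelines.each do |name, stats|
          # Skip entries that do not carry an "events" section.
          next unless stats.is_a?(Hash) && stats["events"]
          event.set("#{name}_events", stats["events"])
        end
      end
    '
  }

  # whitelist_names entries are regexes matched against top-level field
  # names, so a single "_events$" pattern covers every copied field.
  prune {
    whitelist_names => [ "^events$", "^host$", "^id$", "^timestamp$", "^@timestamp$", "_events$" ]
  }
}
```

The patterns are anchored because an unanchored entry such as "id" or "events" would match any field name that merely contains that text. The trailing mutate from the config above (add_field, rename) can follow the prune block unchanged.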

