Logstash prune blacklist_values

The JSON inputs are not always consistent. For example, one field is usually true/false, but the source occasionally tosses in a "not_supported" just for fun. After some digging, I thought that I could just do:

    prune {
        blacklist_values => [
            "expires", "not_supported"
        ]
    }

but no. The help page says that "blacklist_values" accepts a hash, and even helpfully links to the help pages for hashes, as well as shows an example... except that the example hash looks like:

match => {
  "field1" => "value1"
  "field2" => "value2"
  ...
}

and the example prune looks like:

      prune {
        blacklist_values => [ "uripath", "/index.php",
                              "method", "(HEAD|OPTIONS)",
                              "status", "^[^2]" ]
      }

In experimenting, it seems like the values in "blacklist_values" need to be in pairs (e.g. Logstash gives an error if the only thing in "blacklist_values" is "not_supported"), but no matter how I use this, I don't get the desired results.

The JSON that comes in might look something like:

{"item": "first thing", "is_working": true, "is_configured": true}
{"item": "second thing", "is_working": "not_supported", "is_configured": true}

And I want logstash to pump the data to Elasticsearch:

{"item": "first thing", "is_working": true, "is_configured": true}
{"item": "second thing", "is_configured": true}

It would be lovely if someone could point out what I am missing.

I would expect

prune { blacklist_values => { "is_working" => "not_supported" } }

to work.

No, when I do that, I get a config error:

[2022-01-03T15:31:58,367][ERROR][logstash.agent           ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of [ \\t\\r\\n], \"#\", \"{\", \"}\" at line 15, column 31 (byte 206) after filter {\n\n\n    if [message] == \"[]\" {\n      drop {}\n    }\n\n    json {\n        source => \"message\"\n        target => \"cc-data\"\n    }\n\n    prune {\n        blacklist_values => {\n            \"nil\" => \"expires\"", :backtrace=>["/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:32:in `compile_imperative'", "org/logstash/execution/AbstractPipelineExt.java:187:in `initialize'", "org/logstash/execution/JavaBasePipelineExt.java:72:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:47:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:52:in `execute'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:383:in `block in converge_state'"]}

The docs show that the "blacklist_values" block is a list (i.e. square brackets). If I make it a list of objects, I still get an error:

    prune {
        blacklist_values => [
            { "nil" => "not_supported" }
        ]
    }

[2022-01-03T15:47:43,142][ERROR][logstash.javapipeline    ][main] Pipeline error {:pipeline_id=>"main", :exception=>#<TypeError: no implicit conversion of Hash into String>, :backtrace=>["org/jruby/RubyRegexp.java:963:in `initialize'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-filter-prune-3.0.4/lib/logstash/filters/prune.rb:101:in `block in register'", "org/jruby/RubyHash.java:1415:in `each'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-filter-prune-3.0.4/lib/logstash/filters/prune.rb:100:in `register'", "org/logstash/config/ir/compiler/AbstractFilterDelegatorExt.java:75:in `register'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:232:in `block in register_plugins'", "org/jruby/RubyArray.java:1821:in `each'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:231:in `register_plugins'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:590:in `maybe_setup_out_plugins'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:244:in `start_workers'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:189:in `run'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:141:in `block in start'"], "pipeline.sources"=>["/etc/logstash/conf.d/filter-json-cc.conf", "/etc/logstash/conf.d/input-cctest.conf", "/etc/logstash/conf.d/output-elasticsearch.conf"], :thread=>"#<Thread:0x5c006d6a run>"}

The code expects a hash, not an array. Generally it makes no difference, because if the filter expects a hash then logstash will convert an array pairwise into a hash. There are many, many places in the documentation where a plugin option is documented as requiring a hash but the example uses an array.
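
To make the pairwise conversion concrete, here is a minimal Ruby sketch of the idea. It is not Logstash's actual code: `pairs_to_hash` is a made-up name, and the odd-length check is my assumption about why a lone "not_supported" gets rejected.

```ruby
# Illustrative only: fold a flat [key, value, key, value, ...] array into a hash.
# pairs_to_hash is a hypothetical helper, not part of Logstash.
def pairs_to_hash(flat)
  # Assumption: an odd number of elements cannot be paired up, so reject it.
  raise ArgumentError, "expected an even number of elements" if flat.length.odd?

  # each_slice(2) yields [key, value] pairs; to_h turns those pairs into a hash.
  flat.each_slice(2).to_h
end

# The array from the prune documentation example quoted earlier in the thread.
example = ["uripath", "/index.php", "method", "(HEAD|OPTIONS)", "status", "^[^2]"]
p pairs_to_hash(example)
```

Seen this way, `[ "expires", "not_supported" ]` is read as field "expires" with pattern "not_supported", which is why the array form starts up cleanly but does not touch is_working.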

The following configuration removes the is_working field when it contains "not_supported".

input { generator {
    count => 1
    lines => [
        '{"item": "first thing", "is_working": true, "is_configured": true}',
        '{"item": "second thing", "is_working": "not_supported", "is_configured": true}'
    ]
    codec => json
} }
filter {
    mutate { convert => { "is_working" => "string" } }
    prune { blacklist_values => { "is_working" => "not_supported" } }
    mutate { convert => { "is_working" => "boolean" } }
}
output { stdout { codec => rubydebug { metadata => false } } }

You cannot prune the value of a boolean field without converting it.

well... I must have fat-fingered something... a copy-pasta of your code block for "prune" seems to at least start up fine. I did not keep my last attempt, so... not sure what the difference is.

So... thank you so much for your patience!

This is still not pruning the data. This is my code block:

    prune {
        blacklist_values => {
            "[cc-data][c7n:credential-report][password_enabled]" => "not_supported"
            "[cc-data][c7n:credential-report][password_last_changed]" => "not_supported"
            "[cc-data][c7n:credential-report][password_next_rotation]" => "not_supported"
            "password_enabled" => "not_supported"
            "password_last_changed" => "not_supported"
            "password_next_rotation" => "not_supported"
        }
    }

My error:

[logstash.outputs.elasticsearch][main]
...
Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"poc-2022.01.03.17", :routing=>nil}, {
"cc-data"=>[
	{
		"c7n:credential-report"=>{
			"password_last_used"=>"2021-12-08T14:39:20+00:00", 
			"mfa_active"=>false, 
			"user"=>"<root_account>",
			
			"password_last_changed"=>"not_supported", 
			"arn"=>"arn:aws:iam::353563186465:root", 
			"password_next_rotation"=>"not_supported", 
			"user_creation_time"=>"2021-11-15T23:04:55+00:00", 
			
			"password_enabled"=>"not_supported"
		}, 
		"account_name"=>"", 
		"account_id"=>"353563186465"
	}
], 
...
	"error"=>{
		"type"=>"mapper_parsing_exception", 
		"reason"=>"failed to parse field [
			cc-data.c7n:credential-report.password_last_changed
		] 
		of type [date] in document with id 'Qir_IH4BA5b0hmW9wW1I'. Preview of field's value: 'not_supported'", 
		"caused_by"=>{
			"type"=>"illegal_argument_exception", 
			"reason"=>"failed to parse date field [not_supported] with format [strict_date_optional_time||epoch_millis]", 
			"caused_by"=>{
				"type"=>"date_time_parse_exception", 
				"reason"=>"Failed to parse with all enclosed parsers"
			}
		}
	}

As the note at the end of the description section says, prune can only be used on top-level fields. But you are in luck, because blacklist_values is the only one of the four prune operations that can easily be implemented another way (using mutate+remove_field).

        if [cc-data][c7n:credential-report][password_enabled] == "not_supported" { mutate { remove_field => [ "[cc-data][c7n:credential-report][password_enabled]" ] } }
        if [cc-data][c7n:credential-report][password_last_changed] == "not_supported" { mutate { remove_field => [ "[cc-data][c7n:credential-report][password_last_changed]" ] } }

etc.
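
If the list of conditionals gets long, another option might be a ruby filter that walks the nested structure and drops every key whose value is the marker string. The sketch below is plain Ruby showing only the walking logic; `strip_marker` is a hypothetical helper name, and nothing here has been run inside Logstash.

```ruby
# Hypothetical helper: return a copy of node with every hash entry whose
# value equals the marker removed, descending into nested hashes and arrays.
def strip_marker(node, marker = "not_supported")
  case node
  when Hash
    node.each_with_object({}) do |(key, value), out|
      next if value == marker          # drop the offending key entirely
      out[key] = strip_marker(value, marker)
    end
  when Array
    # Array elements are kept; only hash entries are removed.
    node.map { |item| strip_marker(item, marker) }
  else
    node                               # booleans, numbers, other strings pass through
  end
end

# Shape taken from the error output above: "cc-data" is an array of hashes.
cc_data = [
  {
    "c7n:credential-report" => {
      "mfa_active" => false,
      "password_last_changed" => "not_supported",
      "password_enabled" => "not_supported"
    },
    "account_id" => "353563186465"
  }
]
p strip_marker(cc_data)
```

Inside a ruby filter, the same function could presumably be applied with something like `event.set("cc-data", strip_marker(event.get("cc-data")))`. Because the walk handles arrays, it also covers the case in the error output where "cc-data" is a list rather than a single object.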

Pardon me while I go scream into the void.

Again, thank you so much for your help!
