Gap in basic understanding of message processing in Logstash

Hi Guys,

I did a quick training course on Logstash on Udemy, and I think I forgot one important thing and didn't take a note. So I could use a brief explanation of what I'm doing wrong.
Based on the message content, I want to add a field called fingerprint, whose value depends on whether the message starts with <BATCH. My config looks like this:

input {
	file {
		path => "/location/*.xml"
		start_position => "beginning"
	}
}
filter {
	if [message] =~ "<BATCH.*$" {
		mutate {
			add_field => {"fingerprint" => "hoho"}
		}
	} else {
		mutate {
			add_field => {"fingerprint2" => "hihi"}
		}
	}
}
output {
	stdout {
	}
}

Here's a sample .xml file:

<?xml version="1.0" encoding="UTF-8"?>
<BATCH TIMESTAMP="2019-02-28T12:27:07.798+01:00" SYNTAX_REV="1.3">DATA</BATCH>

And the outcome in stdout looks like this:

{
"@version" => "1",
"message" => "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r",
"@timestamp" => 2019-04-08T13:55:10.975Z,
"path" => "/location/1.xml",
"host" => "host_info"
}
{
"@version" => "1",
"message" => "<BATCH TIMESTAMP="2019-02-28T12:27:07.798+01:00" SYNTAX_REV="1.3">DATA</BATCH>",
"@timestamp" => 2019-04-08T13:55:11.975Z,
"path" => "/location/1.xml",
"host" => "host_info"
}

So no new field is added. How should I fix that? Should I first add something to the message field in the filter? The output already contains the message field, so I'm not sure whether this needs to be done as a first step. I appreciate the help!

Try using forward slashes instead of quotes around your regex:

/<BATCH.*$/

Also, I would avoid using the name fingerprint for the field, as there is a Logstash plugin called fingerprint.


Thanks for the tips. I changed the config to:
if [message] =~ "/<BATCH.*$/" {
but after starting the pipeline, I got an error:

[ERROR] 2019-04-09 10:59:51.114 [Converge PipelineAction::Create] agent - Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"SyntaxError", :message=>"(eval):80: syntax error, unexpected tGVAR\n if (((event.get("[message]") =~ //<BATCH.*$//))) # if [message] =~ "/<BATCH.*$/"\n", :backtrace=>["org/jruby/RubyKernel.java:994:in `eval'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:49:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:90:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:43:in `block in execute'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:94:in `block in exclusive'", "org/jruby/ext/thread/Mutex.java:148:in `synchronize'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:94:in `exclusive'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:39:in `execute'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:327:in `block in converge_state'"]}

Yes, remove the " entirely, so that only the forward slashes delimit the regex.

Worked like a charm. Thank you very much.

You're welcome!

One more question. If I'm trying to check whether a field contains a specific word, I should use =~ "BATCH" (with quotes), but if I want to match a regex like the one above, I should not use them?

=~ is used when you want to match against a regular expression.
== is used when you want to compare values such as integers or strings. Strings definitely need quotes around them.
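
To make the difference concrete, here is a rough sketch of both operators side by side. The host value is taken from the sample output above; the tag names known_host and has_batch are placeholders I made up for illustration, so adjust them to whatever you need.

```
filter {
	# == compares the whole field value with a quoted string
	if [host] == "host_info" {
		mutate { add_tag => ["known_host"] }
	}
	# =~ matches a regex written between forward slashes, no quotes
	if [message] =~ /BATCH/ {
		mutate { add_tag => ["has_batch"] }
	}
}
```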

There is no implicit anchoring in a regexp, so a trailing .*$ adds nothing: the pattern matches the same lines without it. If you care whether the message starts with <BATCH, you need to anchor the pattern yourself.

if [message] =~ /^<BATCH/ ...
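
Put together with the original config, the filter section might end up looking something like the sketch below. It assumes you want the same field in both branches, and batch_marker is just a made-up name chosen to avoid confusion with the fingerprint plugin; the values are the ones from the original post.

```
filter {
	# ^ anchors the match to the start of the message
	if [message] =~ /^<BATCH/ {
		mutate {
			add_field => { "batch_marker" => "hoho" }
		}
	} else {
		mutate {
			add_field => { "batch_marker" => "hihi" }
		}
	}
}
```

With the sample file above, the <?xml ...?> line should fall into the else branch and the <BATCH ...> line into the first one.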

Thanks @BKG & @Badger.
