I have placeholders in a log message. The message is sent together with a message extension whose lines should replace them. The placeholders are numbers in brackets, and each number refers to a line number in the message extension.
What I have:
message: "This is an [1] to show you the [2] I need for replacing these [3]."
message_extension: "example
filter configuration
placeholders"
What I want:
message: "This is an example to show you the filter configuration I need for replacing these placeholders."
There is a comma between each line, so if there are multiple extensions there is always a line with just a comma in between. Like this:
example
,
filter configuration
,
placeholders
Sorry for not clarifying this earlier; I only just noticed it myself. This structure is constant. It would not be a problem for me to remove these commas, though, so that the N from "[N]" can be matched with the line number in the field "message_extension".
This is the best I got without resorting to a Ruby filter.
It fails if the placeholders do not appear strictly in left-to-right order ([1], then [2], then [3]).
Ignore the dissect filter; I just used it to get the two fields when using the generator input.
input {
  generator {
    count => 1
    message => "This is an [1] to show you the [2] I need for replacing these [3].||example
,
filter configuration
,
placeholders" }
}
filter {
  # Only needed for this test: split the generated line into the two fields.
  dissect {
    mapping => {
      message => '%{[msg]}||%{[msgext]}'
    }
    remove_field => "[message]"
  }
  # Replace both kinds of separator with the marker string, then split on it.
  mutate {
    gsub => ["[msgext]", "\n,\n", "øåø", "[msg]", "\[\d\]", "øåø"]
    split => {
      "[msgext]" => "øåø"
      "[msg]" => "øåø"
    }
  }
  # Interleave text fragments and extensions; handles one to four extensions.
  if [msgext][3] {
    mutate {
      add_field => { "[message]" => "%{[msg][0]}%{[msgext][0]}%{[msg][1]}%{[msgext][1]}%{[msg][2]}%{[msgext][2]}%{[msg][3]}%{[msgext][3]}%{[msg][4]}"}
    }
  } else if [msgext][2] {
    mutate {
      add_field => { "[message]" => "%{[msg][0]}%{[msgext][0]}%{[msg][1]}%{[msgext][1]}%{[msg][2]}%{[msgext][2]}%{[msg][3]}"}
    }
  } else if [msgext][1] {
    mutate {
      add_field => { "[message]" => "%{[msg][0]}%{[msgext][0]}%{[msg][1]}%{[msgext][1]}%{[msg][2]}"}
    }
  } else if [msgext][0] {
    mutate {
      add_field => { "[message]" => "%{[msg][0]}%{[msgext][0]}%{[msg][1]}"}
    }
  }
}
output { stdout { codec => rubydebug } }
Gives:
{
           "msg" => [
        [0] "This is an ",
        [1] " to show you the ",
        [2] " I need for replacing these ",
        [3] "."
    ],
      "sequence" => 0,
    "@timestamp" => 2018-07-23T13:04:05.677Z,
        "msgext" => [
        [0] "example",
        [1] "filter configuration",
        [2] "placeholders"
    ],
      "@version" => "1",
          "host" => "Elastics-MacBook-Pro.local",
       "message" => "This is an example to show you the filter configuration I need for replacing these placeholders."
}
Thank you very much. Let me just make sure I understand the procedure. So with the mutate filter you are transforming both fields into arrays, right?
What does "øåø" actually mean? Just a mark where to split?
I have other indexes where the field message_extension is already an array. This would mean I would have to comment out the mutate filter, correct? However, the if condition never seems to be true in that case, because no field is added.
Yeah. It is a quirk: the split option of mutate does not take a regex argument, but gsub does. So one has to use gsub first to replace the (possibly varying) separator characters with a known, invariant string that is improbable in the log data, here øåø (you can have fun with other multibyte glyphs from UTF-8).
Then you can split on that invariant.
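The same two-step idea can be sketched in plain Ruby. This is only an illustration: Ruby's own `String#split` accepts a regex, so the detour is needed only inside the mutate filter. The `SENTINEL` constant name, the sample input, and the slightly looser separator pattern are assumptions made for the example.

```ruby
SENTINEL = "øåø" # any string that will not occur in the log data

# Sample input where the second separator line has a stray leading space.
raw = "example\n,\nfilter configuration\n ,\nplaceholders"

# Step 1: a regex replace collapses every separator variant into the sentinel.
normalized = raw.gsub(/\n\s*,\s*\n/, SENTINEL)

# Step 2: a plain, non-regex split on the now invariant sentinel.
fields = normalized.split(SENTINEL)

p fields
```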
A Ruby filter can do this in a few lines if you prefer, and I can help with that. It can also check whether message_extension is an array or not.
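A minimal sketch of the logic such a Ruby filter might use, written as a plain Ruby method so it can be tried outside Logstash. The method name `expand_placeholders` is hypothetical, and the sketch assumes the separator lines contain nothing but a comma. Unlike the mutate approach, it looks up each "[N]" by its number, so the order of the placeholders in the message does not matter, and placeholders without a matching line are left untouched.

```ruby
# Hypothetical helper: replaces each "[N]" in message with the N-th
# entry (1-based) of extension, which may be an Array or a String.
def expand_placeholders(message, extension)
  lines =
    if extension.is_a?(Array)
      extension
    else
      # Split on newlines and drop the comma-only separator lines.
      extension.to_s.split("\n").reject { |line| line.strip == "," }
    end

  message.gsub(/\[(\d+)\]/) do |placeholder|
    index = Regexp.last_match(1).to_i - 1
    # Keep the placeholder when N is 0 or beyond the available lines.
    index >= 0 && lines[index] ? lines[index] : placeholder
  end
end

msg = "This is an [1] to show you the [2] I need for replacing these [3]."
ext = "example\n,\nfilter configuration\n,\nplaceholders"
puts expand_placeholders(msg, ext)
puts expand_placeholders("Swapped: [2] before [1].", %w[example filter])
```

Inside a ruby filter the same logic would presumably read the two fields with `event.get` and write the result back with `event.set`; that wiring is not shown here and has not been tested against the data in this thread.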