Hi,
I'm reading some weird log files and managed to extract the information I want using grok and kv. At the end I want to replace the message field with a simpler one, built from fields captured by grok. When I use the file input for testing, everything works. When I switch the input to Beats (Filebeat), the replace does not work and the message contains the literal field references instead of their values.
A log entry example:
2019-05-15 23:41:50, ==================================================================================
2019-05-15 23:41:50, - Server sent a new telegram ---
2019-05-15 23:41:50, ---- Telegramtype TEST_TELEGRAM--- 599 ---
2019-05-15 23:41:50, ---- Returkod = 0 ---
2019-05-15 23:41:50, ---- Finished TEST_TELEGRAM ---
Logstash config:
# Filebeats -> Logstash -> Elasticsearch pipeline.
input {
file {
path => "c:/users/myuser/Downloads/ELK/7.2.0/logstash-7.2.0/bin/logstash-tests2.log"
sincedb_path => "nul"
start_position => "beginning"
codec => multiline {
pattern => "^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}, %{WORD}"
negate => true
what => "previous"
}
}
# beats {
# port => 5044
# }
}
filter {
# ECS: copy the original message to the [log][original] field
mutate {
copy => { "message" => "[log][original]" }
}
# Extract timestamp
grok {
match => { "message" => "(?m)%{TIMESTAMP_ISO8601:timestamp}%{GREEDYDATA:message}" }
overwrite => [ "message" ]
}
date {
match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss" ]
}
# Clean up
mutate {
gsub => [
"message", "^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}", "",
"message", "^, ---- ", "",
"message", "^, - ", "",
"message", " ---", "---",
"message", "\r\n", "",
"message", "^, ==================================================================================", "",
"message", " = ", "=",
"message", "HEADERDATA_UT: ", ""
]
}
# Trim message
mutate {
strip => [ "message" ]
}
# Get information to be used later in the message header (message + telegram type + number)
grok {
match => { "message" => "%{DATA:[@metadata][msg1]}%{SPACE}---Telegramtype %{WORD:[@metadata][telegram_type]}%{SPACE}(?<dummy>\-+)%{SPACE}%{NUMBER:[@metadata][telegram_num]}%{GREEDYDATA}"}
}
# Breakdown key-value pairs
kv {
field_split_pattern => "---"
target => "kv"
transform_key => "lowercase"
remove_char_key => "\ (\-)+"
}
# Replace the big message with a smaller one (message header). The return code will hopefully have been extracted as a key-value pair ([kv][returkod]) from the message.
mutate {
replace => { "message" => "%{[@metadata][msg1]} - Telegram type: %{[@metadata][telegram_type]}; Telegram number: %{[@metadata][telegram_num]}; Return Code: %{[kv][returkod]}" }
}
}
output {
elasticsearch {
hosts => ["http://localhost:9200"]
index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
}
stdout {
codec => rubydebug
}
}
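When I test with Filebeat, I comment out the file input and enable the beats block above. As I understand it, the multiline codec shouldn't be used with the beats input, so the line merging has to happen in Filebeat itself; my filebeat.yml is along these lines (a sketch — the path is the same test file, and if I read the docs right, `what => "previous"` in the codec corresponds to `multiline.match: after` in Filebeat):

```yaml
filebeat.inputs:
  - type: log
    paths:
      - c:/users/myuser/Downloads/ELK/7.2.0/logstash-7.2.0/bin/logstash-tests2.log
    # Merge lines that do NOT start with a timestamp into the previous event
    multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}, '
    multiline.negate: true
    multiline.match: after   # Filebeat's equivalent of what => "previous"

output.logstash:
  hosts: ["localhost:5044"]
```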
With the file input, the resulting message field is:
May 15, 2019 @ 23:41:50.000 Server sent a new telegram - Telegram type: TEST_TELEGRAM; Telegram number: 599; Return Code: 0
But when I switch to Filebeat as the input to Logstash, I see this in Kibana:
May 15, 2019 @ 23:41:50.000 %{[@metadata][msg1]} - Telegram type: %{[@metadata][telegram_type]}; Telegram number: %{[@metadata][telegram_num]}; Return Code: 0
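Since Logstash leaves a %{field} reference verbatim when the field doesn't exist, it looks like the three [@metadata] fields from the second grok are never set on the Beats path, while the kv field is. To check this I was thinking of turning on metadata in the debug output (a debugging tweak to the stdout block above — rubydebug hides [@metadata] unless told otherwise):

```
stdout {
  codec => rubydebug { metadata => true }   # also print [@metadata] fields
}
```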
This is driving me crazy. I've been at it for days and have no clue what could be happening. Any ideas are much appreciated!
Thanks!