I recently upgraded my Elastic Stack to 6.x. For the past two days I have noticed weird behaviour that results in a significant number of events being dropped.
I have a DNS filter that does a reverse lookup on all internal IPs, which makes it easier to search for specific machines. After the upgrade, Logstash runs for ~15 minutes before logstash.log gets flooded with DNS request timeouts. I added cache options, which improved the situation quite a bit, but after two hours the same problems started to arise. My question is: is anybody experiencing something similar? Is there something I am missing?
- After the upgrade:
- After setting a cache:
This is the filter in question:
filter {
  mutate {
    rename => { "dstport" => "dst_port" }
    rename => { "srcport" => "src_port" }
  }
  if [src_ip] {
    if [src_ip] =~ /10\..*/ {
      mutate {
        add_field => { "src_ip_resolve" => "%{src_ip}" }
      }
      dns {
        action => "replace"
        reverse => ["src_ip_resolve"]
        nameserver => ["10.0.1.21", "10.0.1.22"]
        hit_cache_size => 8000
        hit_cache_ttl => 900
        failed_cache_size => 1000
        failed_cache_ttl => 300
      }
    } else {
      geoip {
        source => "src_ip"
        target => "src_ip_geo"
        fields => ["city_name", "country_name"]
      }
    }
  }
  if [dst_ip] {
    if [dst_ip] =~ /10\..*/ {
      mutate {
        add_field => { "dst_ip_resolve" => "%{dst_ip}" }
      }
      dns {
        action => "replace"
        reverse => ["dst_ip_resolve"]
        nameserver => ["10.0.1.21", "10.0.1.22"]
        hit_cache_size => 8000
        hit_cache_ttl => 900
        failed_cache_size => 1000
        failed_cache_ttl => 300
      }
    } else {
      geoip {
        source => "dst_ip"
        target => "dst_ip_geo"
        fields => ["city_name", "country_name"]
      }
    }
  }