Logs not backfilled if Logstash indexer pipeline is down

Hi Everyone,

I can't figure out why logs are not backfilled after the Logstash indexer has been down, even though the logs are buffered in Kafka.

Regards
Arvind

A one-sentence question like this is impossible to answer. Have you configured Logstash to fetch logs from Kafka? And that stops working after Logstash has been down for a while?

Configs would help...

Sorry for the incomplete information.

My logging architecture is as follows:

filebeat -> shipper (beats input, kafka output) -> kafka <- indexer (kafka input + filter for access log parsing + elasticsearch output) <- kibana

Here are the configs:

For Shipper

input {
  beats {
    host => "X.X.X.X"
    port => 5044
    congestion_threshold => 30
  }
}

output {
  if [type] == "access_log" {
    kafka {
      retry_backoff_ms => 30
      linger_ms => 5
      topic_id => "access_log"
      bootstrap_servers => "X.X.X.X:9092,X.X.X.X:9092,X.X.X.X:9092"
      acks => "0"
    }
  }
}

For Indexer

input {
  kafka {
    zk_connect => "X.X.X.X:2181,X.X.X.X:2181,X.X.X.X:2181"
    group_id => "access_log"
    topic_id => "access_log"
    reset_beginning => true
    consumer_threads => 5
    consumer_restart_on_error => true
    consumer_restart_sleep_ms => 100
    decorate_events => true
  }
}

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
    patterns_dir => ["/opt/logstash/grok/apache_access"]
  }
  date {
    match => ["log_time" , "dd/MMM/yyyy:HH:mm:ss Z"]
    target => "@timestamp"
    locale => "en"
  }
  mutate {
    remove_field => [ "log_time" , "auth" , "ident" , "beat" , "input_type" , "source"]
  }
  mutate {
    gsub => [
      "referrer", "^\"", "",
      "referrer", "\"$", "",
      "agent", "^\"", "",
      "agent", "\"$", ""
    ]
  }
  useragent {
    prefix => "agent_"
    source => "agent"
    remove_field => [ "agent_major", "agent_patch", "agent_build", "agent_minor", "agent_os_major", "agent_os_minor" ]
  }
}

output {
  elasticsearch {
    index => "access_log-%{+YYYY.MM.dd}"
    hosts => ["X.X.X.X:9210"]
    template => "/opt/logstash/template/accesslog.json"
    template_overwrite => true
    template_name => "access_log-*"
    workers => 2
  }
}

Okay, and what's the problem?

The problem is that the indexer pipeline sometimes freezes, and when it does I have to restart the indexer. There are no logs for the period between the freeze and the restart, but the indexer should backfill those logs from Kafka.

Is there anything in the Logstash logs when this happens?

There was nothing in the Logstash logs. It seems Logstash is taking the last offset from ZooKeeper, so the older logs are not being fetched, but I'm not sure about this.

You want to remove reset_beginning => true. That makes it delete your
offsets every time Logstash restarts.
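
For reference, here is a rough sketch of what the indexer's kafka input might look like with that option dropped. Every other setting is copied unchanged from the indexer config posted earlier in the thread; nothing new is added, and whether the remaining values suit your setup is not something this sketch checks.

```
input {
  kafka {
    zk_connect => "X.X.X.X:2181,X.X.X.X:2181,X.X.X.X:2181"
    group_id => "access_log"
    topic_id => "access_log"
    # reset_beginning => true has been removed here, so the offsets
    # stored for this group_id are no longer deleted on restart
    consumer_threads => 5
    consumer_restart_on_error => true
    consumer_restart_sleep_ms => 100
    decorate_events => true
  }
}
```

With the offsets kept, the expectation is that a restarted indexer continues from where the consumer group last committed and works through whatever accumulated in Kafka in the meantime.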

Thanks

Joe_Lawson, removing reset_beginning did what I needed.

:smiley: