Hi,
I have Filebeat running on 10 hosts that send logs to a central Logstash.
I see the messages below on all of them. The timestamps do not coincide across hosts, which suggests the failures are not tied to a specific time window.
2017-12-20T05:30:57+05:30 ERR Failed to publish events caused by: read tcp 10.240.172.68:49774->10.219.27.74:5536: i/o timeout
2017-12-20T05:30:57+05:30 INFO Error publishing events (retrying): read tcp 10.240.172.68:49774->10.219.27.74:5536: i/o timeout
2017-12-20T05:31:19+05:30 INFO Non-zero metrics in the last 30s: libbeat.logstash.publish.read_errors=1 libbeat.logstash.publish.read_bytes=54 libbeat.logstash.published_but_not_acked_events=103 libbeat.logstash.call_count.PublishEvents=6 libbeat.publisher.published_events=941 publish.events=10836 libbeat.logstash.publish.write_bytes=49297 registrar.writes=6 libbeat.logstash.published_and_acked_events=1044 registrar.states.update=10836
and
2017-11-15T05:17:49-05:00 ERR Failed to publish events caused by: write tcp 10.219.26.81:54940->10.219.27.74:5536: write: connection reset by peer
2017-11-15T05:17:49-05:00 INFO Error publishing events (retrying): write tcp 10.219.26.81:54940->10.219.27.74:5536: write: connection reset by peer
2017-11-15T05:17:59-05:00 INFO Non-zero metrics in the last 30s: registrar.states.update=88064 libbeat.logstash.publish.write_bytes=195420 libbeat.publisher.published_events=4602 registrar.writes=43 libbeat.logstash.call_count.PublishEvents=44 libbeat.logstash.publish.read_bytes=270 publish.events=88064 libbeat.logstash.publish.write_errors=1 libbeat.logstash.published_but_not_acked_events=82 libbeat.logstash.published_and_acked_events=4602
2017-11-15T05:18:29-05:00 INFO Non-zero metrics in the last 30s: registar.states.current=-1 libbeat.logstash.published_and_acked_events=4680 registrar.states.update=75776 registrar.writes=37 libbeat.logstash.publish.write_bytes=196788 libbeat.publisher.published_events=4680 libbeat.logstash.publish.read_bytes=228 publish.events=75776 registrar.states.cleanup=1 libbeat.logstash.call_count.PublishEvents=37
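To compare these counters across hosts, I split each "Non-zero metrics" line into key/value pairs. This is only a minimal sketch: the method name parse_metrics is my own, and it assumes the counters always appear as space-separated name=integer pairs after the "Non-zero metrics in the last Ns:" marker, as in the lines above.

```ruby
# Parse a Filebeat "Non-zero metrics" log line into a Hash of
# counter name => Integer. Returns an empty Hash if the marker is absent.
def parse_metrics(line)
  rest = line.split(/Non-zero metrics in the last \d+s:/, 2)[1]
  return {} if rest.nil?

  rest.split.each_with_object({}) do |pair, acc|
    key, value = pair.split('=', 2)
    # Keep only well-formed integer values (negative values are allowed).
    acc[key] = value.to_i if value =~ /\A-?\d+\z/
  end
end

line = '2017-12-20T05:31:19+05:30 INFO Non-zero metrics in the last 30s: ' \
       'libbeat.logstash.publish.read_errors=1 ' \
       'libbeat.logstash.published_but_not_acked_events=103 ' \
       'libbeat.logstash.published_and_acked_events=1044'

metrics = parse_metrics(line)
puts metrics['libbeat.logstash.published_but_not_acked_events']
puts metrics['libbeat.logstash.published_and_acked_events']
```

It only extracts the numbers so they can be lined up per host; it does not say anything about what each counter means.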
Also, my Logstash server is not heavily loaded; I don't see it running out of CPU, memory, or disk.
Here is my Logstash filter:
filter {
  if ([type] == "named-externalqueries") {
    grok {
      match => [
        "message", "(?<parsedtime>%{MONTHDAY}-%{MONTH}-%{YEAR} %{TIME}) queries: info: client %{IPORHOST:clientIP}#%{NUMBER:clientPort:int}%{SPACE}\(%{DATA:queryName}\): query: %{DATA:queryName2} %{WORD:queryClass} %{WORD:queryType} (?<recursive>[+-])(?<queryFlags>[SETDC]*) \(%{IPORHOST:nameserver}\)",
        "message", "(?<parsedtime>%{MONTHDAY}-%{MONTH}-%{YEAR} %{TIME}) queries: info: client %{IPORHOST:clientIP}#%{NUMBER:clientPort:int}%{SPACE}\(%{DATA:queryName}\): view %{WORD:queryView}: query: %{DATA:queryName2} %{WORD:queryClass} %{WORD:queryType} (?<recursive>[+-])(?<queryFlags>[SETDC]*) \(%{IPORHOST:nameserver}\)"
      ]
    }
    ruby {
      code => "
        if !event.get('queryFlags').to_s.empty?
          if event.get('queryFlags').include? 'S'
            event.tag('queryFlags_signed')
          end
          if event.get('queryFlags').include? 'E'
            event.tag('queryFlags_edns0')
          end
          if event.get('queryFlags').include? 'T'
            event.tag('queryFlags_tcp')
          end
          if event.get('queryFlags').include? 'D'
            event.tag('queryFlags_dnssec')
          end
          if event.get('queryFlags').include? 'C'
            event.tag('queryFlags_dc')
          end
        end
        # to_s guards against events where grok did not set the field
        if event.get('queryType').to_s.include? 'PTR'
          # match returns nil when the name has no run of four numeric labels
          m = event.get('queryName').to_s.match(/.*?((?:[0-9]{1,3}\.){4}).*/)
          if m
            ip = m[1].chomp('.').split('.').reverse.join('.')
            event.set('queryIP', ip)
          end
        end
      "
    }
    if [queryType] =~ "PTR" {
      cidr {
        add_tag => [ "_internal_ptr_lookup" ]
        address => [ "%{queryIP}" ]
        network => [ "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16" ]
      }
    }
    if "_internal_ptr_lookup" in [tags] {
      drop {}
    }
    # At times, query name contains internal ip addresses as well (domains are already dropped in filebeat), drop these
    if [queryName] =~ /^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$/ {
      cidr {
        add_tag => [ "_internal_lookup" ]
        address => [ "%{queryName}" ]
        network => [ "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16" ]
      }
    }
    if "_internal_lookup" in [tags] {
      drop {}
    }
  }
}
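In case it helps anyone reproduce this, here is the PTR handling from the ruby filter above as a standalone sketch that runs outside Logstash. The method name ptr_to_ip is only for illustration and is not part of my config. It uses the same regular expression as the filter, and it returns nil when the name has no run of four numeric labels.

```ruby
# Extract the IPv4 address encoded in a reverse-lookup name such as
# "68.172.240.10.in-addr.arpa". Returns nil if no four-label numeric
# run is found (for example, an ordinary hostname).
def ptr_to_ip(query_name)
  m = query_name.to_s.match(/.*?((?:[0-9]{1,3}\.){4}).*/)
  return nil if m.nil?

  # m[1] looks like "68.172.240.10."; drop the trailing dot and
  # reverse the labels to get the forward-order address.
  m[1].chomp('.').split('.').reverse.join('.')
end

puts ptr_to_ip('68.172.240.10.in-addr.arpa')   # prints 10.240.172.68
puts ptr_to_ip('example.com').inspect          # prints nil
```

The result is what the cidr filter then receives as queryIP.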
I think I am losing some events from Filebeat. Do these messages confirm that?
Is there a document explaining what the following metrics mean?
libbeat.logstash.publish.read_errors
libbeat.logstash.published_but_not_acked_events
registar.states.current=-1
Please assist.
Regards,
Ahmad