I've been butting heads with setting up my ELK stack inside Docker. At this point I've traced the fault down to Logstash, and I currently believe it's either the pipeline or something simple that someone with more knowledge than I have would spot right away.
If Logstash really is receiving events, you'll see them in its logs once debug-level logging is enabled. Exactly how are you sending events to Logstash? Have you tried stepping into the Logstash container with docker exec and sending something from there?
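Something like this should do it, assuming the container is simply named logstash, the input listens on 1514, and nc is available in the image (all of which may differ in your setup):
docker exec -it logstash /bin/bash
# from inside the container, hand-craft a syslog line and push it at the input port over TCP
echo '<14>Oct  9 12:00:00 testhost testapp: test event from inside the container' | nc -w1 127.0.0.1 1514
# same thing over UDP
echo '<14>Oct  9 12:00:00 testhost testapp: test event from inside the container' | nc -u -w1 127.0.0.1 1514
If that event shows up in the Logstash debug log, the inputs are fine and the problem is somewhere downstream of them.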
I have, thanks for asking. That's where I'm running tcpdump to capture inputs and outputs, which revealed that while the Logstash container is receiving syslog packets from the networking equipment and RFC-compliant syslog servers, Logstash itself is never opening a connection out to Elasticsearch. Tailing the logs in both containers shows the same thing, with the Logstash log cycling through this sequence of messages:
Starting pipeline
Pipeline main started
Flushing buffer at interval (repeats continuously)
Pushing flush into pipeline (at random iterations, averaging around 8-12 of the previous "flushing buffer" message)
The Elasticsearch access log only shows activity from Kibana and Telegraf.
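For what it's worth, plain HTTP reachability from the Logstash container to the cluster can be checked with something like this (escluster:9200 as in the pipeline config below; assumes curl is present in the image):
# does the Logstash container resolve and reach the Elasticsearch HTTP port?
curl -sv http://escluster:9200/
curl -s http://escluster:9200/_cluster/health?pretty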
UPDATE:
Just for clarity, I'm providing the tcpdump results here in the thread to complement what was documented in the GitHub gist posted earlier:
logstash# tcpdump -Xni eth0 -A -vv "(port 9200 or 9300 or 80 or 443)"
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel
logstash# tcpdump -Xni eth0 -A -vv -w /dev/null "(port 1514 or 514)"
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
1507 packets captured
1507 packets received by filter
0 packets dropped by kernel
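Since the capture shows traffic arriving on 514/1514 but nothing heading out to 9200, it's also worth confirming what the Logstash process is actually bound to inside the container; a quick sketch, assuming ss is available in the image:
# listening sockets: the udp/tcp inputs should show up on 1514
ss -lntu | grep -E ':(514|1514)'
# any established connections toward Elasticsearch would show up here
ss -ntp | grep ':9200'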
current pipeline config:
#####
###
#
input {
    file {
        sincedb_path => "/dev/null"
        start_position => "beginning"
    }
    snmptrap {
        community => "public"
        host => "0.0.0.0"
        port => 162
        type => "snmptrap"
    }
    # syslog {
    #     port => 1514
    #     type => "syslog"
    # }
    udp {
        port => 1514
        type => "syslog"
    }
    tcp {
        port => 1514
        type => "syslog"
    }
}
filter {
    if "snmptrap" in [type] {
        grok {
            match => [ "message", "%{SYSLOGBASE} (?<header>[^\t]+)(?:\t%{GREEDYDATA:varbind})?" ]
        }
        date {
            match => [ "timestamp", "MMM dd HH:mm:ss", "MMM d HH:mm:ss", "ISO8601" ]
        }
        if [varbind] {
            kv {
                field_split => "\t"
                value_split => "="
                trim => " \""
                trimkey => " "
                source => [ "varbind" ]
                target => "vb"
            }
            ruby {
                code => "event.remove('vb').each { |k,v| event.set(k.gsub(/^.*::([^\.]+)\..*$/, '\1'), v) }"
                remove_field => "varbind"
            }
        }
    }
    if "syslog" in [type] {
        grok {
            match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
            add_field => [ "received_at", "%{@timestamp}" ]
            add_field => [ "received_from", "%{host}" ]
        }
        date {
            match => [ "timestamp", "MMM dd HH:mm:ss", "MMM d HH:mm:ss", "ISO8601" ]
        }
    }
    # Merge stacktraces and Jersey logs
    multiline {
        pattern => "^\s+|^Caused by:|^%{JAVACLASS:class}:|^%{NUMBER} < |^%{NUMBER} > [^GET|POST|PUT|DELETE|PATCH]"
        what => "previous"
    }
    # Parse logback messages
    grok {
        match => { "message" => "%{TIMESTAMP_ISO8601:ts}\s+\[%{GREEDYDATA:thread}\]\s+%{WORD:level}\s+%{JAVACLASS:class}" }
        add_field => { "subType" => "java" }
        remove_tag => ["_grokparsefailure"]
    }
    # InfluxDB
    grok {
        match => { "message" => "\[%{WORD:subsystem}\] (?<ts>[0-9]{4}/[0-9]{2}/[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2})" }
        add_field => { "subType" => "influxdb" }
        remove_tag => ["_grokparsefailure"]
    }
    date {
        match => [ "ts", "YYYY-MM-dd HH:mm:ss,SSS", "YYYY/MM/dd HH:mm:ss" ]
        # Use the log timestamp to get sub-second precision (useful for ordering)
        target => "@timestamp"
        # Remove the ts field as it confuses Elasticsearch (dynamic mapping misses some date formats)
        remove_field => [ "ts" ]
    }
}
output {
    if "snmptrap" in [type] {
        syslog {
            host => "localhost"
            port => "1514"
            facility => "security/authorization"
            severity => "informational"
        }
    }
    if "syslog" in [type] {
        elasticsearch {
            hosts => ["escluster:9200"]
            sniffing => true
            ssl => false
            ssl_certificate_verification => false
        }
    }
    elasticsearch {
        hosts => ["escluster:9200"]
        sniffing => true
        ssl => false
        ssl_certificate_verification => false
    }
    stdout {
        codec => rubydebug
    }
    file {
        path => "/var/log/logstash.log"
        codec => "plain"
    }
}
##
# vim:et:si:ts=4:sts=4:sw=4:
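For anyone debugging a similar pipeline: Logstash 5.x can syntax-check a config without starting the service. A minimal sketch, assuming the file is mounted at the official image's default pipeline path (adjust container name and paths to your setup):
# validate the pipeline syntax only, then exit
docker exec -it logstash /usr/share/logstash/bin/logstash --config.test_and_exit -f /usr/share/logstash/pipeline/logstash.conf
A throwaway pipeline that exercises only the Elasticsearch output (events typed on stdin, escluster:9200 taken from the config above) also makes it easy to see whether the output stage alone can reach the cluster:
input { stdin { } }
output {
    elasticsearch { hosts => ["escluster:9200"] }
    stdout { codec => rubydebug }
}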
After completely wiping all the containers on the engine node, we did a rebuild and made sure the images pulled down were version 5.5.4 for each part of the stack (Elasticsearch, Logstash, Kibana). This resolved the poor logging we were seeing and helped us work through the pipeline issues.
I'd strongly advise anyone else who reads this forum to do the same, to make sure they have the latest versions installed.
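For reference, pinning the images explicitly looks something like this (paths assume the official docker.elastic.co registry and the 5.5.4 tag mentioned above; substitute whatever release you standardize on):
docker pull docker.elastic.co/elasticsearch/elasticsearch:5.5.4
docker pull docker.elastic.co/logstash/logstash:5.5.4
docker pull docker.elastic.co/kibana/kibana:5.5.4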