I have two ELK stacks running, one in production and one in non-production. Each uses an identical config, and I have numerous beats reporting into each.
When attempting to connect a new filebeat to the production instance of Logstash, I am receiving the following errors:
2019-07-11T15:07:33.473-0500 ERROR pipeline/output.go:100 Failed to connect to backoff(async(tcp://logstash.example.com:5044)): dial tcp 10.21.23.200:5044: connectex: No connection could be made because the target machine actively refused it.
2019-07-11T15:07:41.984-0500 ERROR pipeline/output.go:100 Failed to connect to backoff(async(tcp://logstash.example.com:5044)): dial tcp 10.21.23.200:5044: connectex: No connection could be made because the target machine actively refused it.
However, when I connect to the non-production instance, it connects as expected.
Yeah, given that they're on separate subnets, I was thinking it might be a firewall rule, but I have other beats on the same subnets reporting in as expected.
Firewall rules are OK. I started to experience this with another beat as well, which helped me run this down more easily.
It appears that I wrote a bad filter rule, which is applied to the logs generated by that beat. The filter survived a logstash -t config check, but triggers a FATAL error at runtime when a matching event comes through. I was also wrong about the configs being identical: I hadn't added this filter to my non-prod instance yet.
The rule is as follows:
grok {
  match => {
    "message" => [
      "^%{DATESTAMP:[@metadata][_timestamp]}\s+%{TZ:[@metadata][_timezone]}>\s+%{LOGLEVEL:log.level}.*$"
    ]
  }
  tag_on_failure => []
}

date {
  # The match line below appears to trigger the issue.
  match => ["[@metadata][_timestamp] [@metadata][_timezone]",
            "MM/dd/yyyy HH:mm:ss.SSS ZZZ"]
  target => "@timestamp"
  tag_on_failure => []
}