The filter is working, i.e. ipv4_(src|dst)_hostname, which initially contain the IP addresses, are replaced by the appropriate hostnames where available. However, my redis list, which is fed by a local logstash instance and then consumed by the 4 separate logstash nodes+indexers, starts growing (llen normally reports 0) and never drains back to 0.
If I remove this filter, the list is quickly consumed back to 0 again.
I've tried the following to see whether I could mitigate this behaviour:
increased the logstash worker threads with -w 2 or -w 4 (default is 1)
installed dnsmasq with:
cache-size=6000
dns-forward-max=600
all-servers
With dnsmasq in place, host -v resolves to the hostname from 127.0.0.1#53 in 0 ms on average.
In case the issue was resolving external IPs, I tried doing the reverse lookup only for RFC1918 (plus loopback) addresses, i.e.
if [ipv4_src_addr] =~ /^(127\.0\.0\.1|10\.|172\.1[6-9]\.|172\.2[0-9]\.|172\.3[0-1]\.|192\.168\.)/ {
dns {
reverse => [ "ipv4_src_hostname" ]
action => "replace"
}
}
All to no avail.
Is there a known issue with the DNS filter doing reverse DNS lookups and having poor performance, or am I seriously doing something wrong?
[EDIT] I am using logstash 1.5 and elasticsearch 1.6. My architecture is:
The slowness is mentioned in the filter documentation. With netflow data, I'd recommend you avoid doing DNS lookups in Logstash: store the raw IP and do the lookups only at analysis time, on the filtered subset of data you are actually looking at.
Thanks Joshua. After more testing over the weekend, DNS lookup for local hosts is working fast enough for me to use that portion, so I will stick with that for now and exclude external hosts.
Perhaps a batch process that queries ES for non-resolved hostnames and updates them once/day might be the way to go for these external hostnames, which I will assume may not change that often.
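A rough sketch of what such a daily batch could look like, under stated assumptions: the unresolved documents have already been fetched from ES as plain dicts carrying an `_id`, the address fields are named `ipv4_src_addr` / `ipv4_dst_addr` (only the src one appears in my config above; dst is assumed), and a hostname field that still equals the IP counts as unresolved. `build_updates` and `reverse_lookup` are hypothetical names, and the ES query and bulk update are deliberately left out. The point is only that each unique IP is looked up once per run, including failed lookups.

```python
import socket
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Assumed field pairs: (address field, hostname field).
FIELDS: Tuple[Tuple[str, str], ...] = (
    ("ipv4_src_addr", "ipv4_src_hostname"),
    ("ipv4_dst_addr", "ipv4_dst_hostname"),
)


def reverse_lookup(ip: str) -> Optional[str]:
    """Return the PTR hostname for ip, or None if it does not resolve."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def build_updates(
    docs: Iterable[dict],
    resolve: Callable[[str], Optional[str]] = reverse_lookup,
) -> List[dict]:
    """Build partial-document updates for docs whose hostname is unresolved."""
    cache: Dict[str, Optional[str]] = {}  # also caches failed lookups
    updates: List[dict] = []
    for doc in docs:
        changes = {}
        for addr_field, host_field in FIELDS:
            ip = doc.get(addr_field)
            # Skip if there is no address or the hostname is already resolved.
            if ip is None or doc.get(host_field) not in (None, ip):
                continue
            if ip not in cache:
                cache[ip] = resolve(ip)
            if cache[ip]:
                changes[host_field] = cache[ip]
        if changes:
            updates.append({"_id": doc["_id"], "doc": changes})
    return updates
```

The returned list is shaped so that each entry could be turned into a partial update against ES, but how that is sent (bulk API, client library) is outside this sketch.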
You could always process these log lines before making them available to Logstash, though a scrubbing process which changes IPs to FQDNs (or, even better, updates a separate field) might be better, as you'd have the data more readily available at first.
Hmm, to pre-process, it'll have to be somewhere between input -> consumption in my pipeline, marked #[1-3] in the 3 places below:
netflow producers -> (logstash #1 -> redis #2) #3 -> logstash (x4) -> elasticsearch (x4)
Ideally it would be logstash doing that work, since it's already reading it either at #1 or #3, but the reverse DNS lookup is very slow, and I'm uncertain if that's a logstash problem, or simply just the lookup latency. As a point to note, when I changed the DNS lookup to be at #1 in the pipeline, the number of netflow packets captured was only ~20% of expected, so I can guesstimate the slowdown to be ~5x (i.e. about 80% of packets ended up being dropped at the UDP input phase).
Which likely means I need to batch the operation on the redis side; simplistically, with 2 lists, e.g.
logstash:redis:rawips
logstash:redis:hostnamesresolved
and a script to read the logstash:redis:rawips list for each unique raw IP in turn and reindex all associated items into the logstash:redis:hostnamesresolved list.
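Something along these lines is what I have in mind for the script; treat it as a sketch rather than a tested implementation. It assumes the events in the raw list are JSON strings (I believe that is what the logstash redis output writes by default, but check your codec), that the address fields are `ipv4_src_addr` / `ipv4_dst_addr`, and that `client` is any object with redis-style `lpop(name)` and `rpush(name, *values)` methods, such as a redis-py client. `drain` and `reverse_lookup` are made-up names. Rather than grouping items per unique IP first, it simply walks the list in order and keeps a per-run cache, so each unique IP is still only looked up once.

```python
import json
import socket
from typing import Callable, Dict, Optional

RAW_LIST = "logstash:redis:rawips"
RESOLVED_LIST = "logstash:redis:hostnamesresolved"


def reverse_lookup(ip: str) -> Optional[str]:
    """Return the PTR hostname for ip, or None if it does not resolve."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def drain(
    client,
    resolve: Callable[[str], Optional[str]] = reverse_lookup,
    max_events: int = 1000,
) -> int:
    """Move up to max_events events from RAW_LIST to RESOLVED_LIST.

    Each event gets ipv4_(src|dst)_hostname set to the resolved name,
    falling back to the raw IP when the lookup fails.
    """
    cache: Dict[str, Optional[str]] = {}  # one lookup per unique IP per run
    moved = 0
    while moved < max_events:
        raw = client.lpop(RAW_LIST)
        if raw is None:  # list is empty
            break
        event = json.loads(raw)
        for side in ("src", "dst"):
            ip = event.get(f"ipv4_{side}_addr")
            if ip is None:
                continue
            if ip not in cache:
                cache[ip] = resolve(ip)
            event[f"ipv4_{side}_hostname"] = cache[ip] or ip
        client.rpush(RESOLVED_LIST, json.dumps(event))
        moved += 1
    return moved
```

The 4 indexers would then read from logstash:redis:hostnamesresolved instead of the raw list. Note that an event popped from the raw list is lost if the script dies before the push, so a more careful version would need some form of acknowledgement.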