DNS filter performance issue?


(Yu-Phing) #1

I'm trying to use the DNS filter to do reverse DNS lookups for various IPs in netflow data, like so:

dns {
  reverse => [ "ipv4_src_hostname", "ipv4_dst_hostname" ]
  action => "replace"
}

The filter is working, i.e. ipv4_(src|dst)_hostname, which initially contain the IP addresses, are replaced by the appropriate hostnames where available. However, my redis list, which is fed by a local logstash instance and then consumed by 4 separate logstash nodes+indexers, starts growing (normally llen reports 0) and never drains back to 0.

If I remove this filter, the list is quickly consumed back to 0 again.

I've tried the following to see whether I could attenuate this behaviour:

  1. increase logstash worker threads -w 2 or 4 (default is 1)

  2. installed dnsmasq with:

    cache-size=6000
    dns-forward-max=600
    all-servers

Running host -v resolves the hostname from 127.0.0.1#53 in 0 ms on average.

  3. in case it was an issue resolving external IPs, I tried doing a reverse lookup only for RFC1918 addresses, i.e.

    if [ipv4_src_addr] =~ /(^127\.0\.0\.1)|(^10\.)|(^172\.1[6-9]\.)|(^172\.2[0-9]\.)|(^172\.3[0-1]\.)|(^192\.168\.)/ {
      dns {
        reverse => [ "ipv4_src_hostname" ]
        action => "replace"
      }
    }

All to no avail.
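
A range test like the one in item 3 can be cross-checked outside Logstash. The following is a minimal sketch in Python, not part of the original pipeline; `LOCAL_NETWORKS` and `is_local` are hypothetical names, and treating the whole of 127.0.0.0/8 as local is an assumption. It also shows why the dots in such a pattern need escaping: a bare `^10.` accepts 100.64.0.1 as well.

```python
import ipaddress
import re

# RFC 1918 private ranges plus loopback (assumption: all of 127.0.0.0/8 counts as local).
LOCAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
    ipaddress.ip_network("127.0.0.0/8"),
]


def is_local(ip: str) -> bool:
    """Return True if ip parses as an address inside one of LOCAL_NETWORKS."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Not a valid IP address at all.
        return False
    return any(addr in net for net in LOCAL_NETWORKS)


# An unescaped dot matches any character, so this prefix test is too loose.
loose_prefix = re.compile(r"^10.")
# With the dot escaped, only addresses starting with "10." match.
strict_prefix = re.compile(r"^10\.")
```

Comparing `is_local(ip)` against the regex result for a sample of addresses from the netflow data is a quick way to spot addresses the pattern misclassifies.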

Is there a known issue with the DNS filter doing reverse DNS lookups and having poor performance, or am I seriously doing something wrong?

[EDIT] I am using logstash 1.5 and elasticsearch 1.6. My architecture is:

netflow producers -> (logstash -> redis) -> logstash*4 -> elasticsearch*4

(Joshua Rich) #2

The slowness is mentioned in the filter documentation. With netflow data, I'd recommend you avoid doing DNS lookups in Logstash: store the raw IP, and only do the lookups at analysis time, on the filtered subset of data you are actually looking at.


(Yu-Phing) #3

Thanks Joshua. After more testing over the weekend, DNS lookup for local hosts is working fast enough for me to use that portion, so I will stick with that for now, and exclude external hosts.

Perhaps a batch process that queries ES for non-resolved hostnames and updates them once/day might be the way to go for these external hostnames, which I will assume may not change that often.
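
The resolving half of such a batch job could look roughly like the sketch below, assuming the unresolved IPs have already been fetched from ES by some other means. The names `reverse_lookup` and `resolve_batch` and the worker count are hypothetical. Because external lookups are mostly latency-bound, the sketch resolves each unique IP once and runs the lookups in a thread pool; the ES query and the update step are deliberately left out.

```python
import socket
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional


def reverse_lookup(ip: str) -> Optional[str]:
    """Return the PTR hostname for ip, or None if it does not resolve."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None


def resolve_batch(
    ips: Iterable[str],
    lookup: Callable[[str], Optional[str]] = reverse_lookup,
    workers: int = 16,  # hypothetical default; tune to what the resolver tolerates
) -> Dict[str, Optional[str]]:
    """Resolve each unique IP exactly once, running lookups concurrently."""
    unique = sorted(set(ips))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        names = list(pool.map(lookup, unique))
    return dict(zip(unique, names))
```

The returned mapping keeps `None` for addresses without a PTR record, so a daily run can tell "tried and failed" apart from "not tried yet".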


(Norberto Meijome) #4

You could always process these log lines before making them available to Logstash, though a scrubbing process which changes IPs to FQDNs (or even better, updates a separate field) might be better, as you'd have the data more readily available at first.


(Joshua Rich) #5

Yeah, pre or post-processing of the data to do the DNS lookups sounds like the best option here.


(Yu-Phing) #6

Hmm, to pre-process, it'll have to be somewhere between input and consumption in my pipeline, at one of the 3 places marked #1-#3 below:
netflow producers -> (logstash #1-> redis #2) #3-> logstash*4 -> elasticsearch*4

Ideally it would be logstash doing that work, since it's already reading it either at #1 or #3, but the reverse DNS lookup is very slow, and I'm uncertain if that's a logstash problem, or simply just the lookup latency. As a point to note, when I changed the DNS lookup to be at #1 in the pipeline, the number of netflow packets captured was only ~20% of expected, so I can guesstimate the slowdown to be ~5x (i.e. about 80% of packets ended up being dropped at the UDP input phase).

Which means I likely need to batch the operation on the redis side; simplistically, 2 lists, e.g.
logstash:redis:rawips
logstash:redis:hostnamesresolved

and a script to read the logstash:redis:rawips list for each unique raw IP in turn and reindex all associated items into the logstash:redis:hostnamesresolved list.
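
A rough sketch of such a script follows, with several assumptions stated up front. It streams events from one list to the other with a per-run cache, so each unique IP is looked up once, rather than literally grouping items by IP first. It assumes events are stored as JSON strings. `collections.deque` objects stand in for the two redis lists so the sketch runs without a server, and `drain_and_resolve` and its parameters are hypothetical names.

```python
import json
from collections import deque
from typing import Callable, Deque, Dict, Optional, Sequence


def drain_and_resolve(
    raw: Deque[str],
    resolved: Deque[str],
    lookup: Callable[[str], Optional[str]],
    fields: Sequence[str] = ("ipv4_src_hostname", "ipv4_dst_hostname"),
) -> int:
    """Move every event from raw to resolved, replacing IPs with hostnames.

    Mirrors the dns filter's action => "replace": a field is overwritten
    only when the lookup succeeds, otherwise the raw IP is kept.
    """
    cache: Dict[str, Optional[str]] = {}  # one lookup per unique IP per run
    moved = 0
    while raw:
        event = json.loads(raw.popleft())
        for field in fields:
            ip = event.get(field)
            if ip is None:
                continue
            if ip not in cache:
                cache[ip] = lookup(ip)
            if cache[ip] is not None:
                event[field] = cache[ip]
        resolved.append(json.dumps(event))
        moved += 1
    return moved
```

Against a real redis server, the `popleft` and `append` calls would presumably become `lpop` on logstash:redis:rawips and `rpush` on logstash:redis:hostnamesresolved through a redis client, and `lookup` could be a thin wrapper around `socket.gethostbyaddr`; neither is shown here.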

My head hurts; any volunteers?

