Insufficient throughput from Filebeat

Hi,

Are you using network shares, or are these files all local? Have you checked whether all files are actually being forwarded? Can you share your filebeat config file?

I'd like to run some tests first. Unfortunately you will need a 1.1 snapshot for these tests, as we have started to add counters in some critical places.

https://download.elastic.co/beats/filebeat/filebeat_1.1.0-SNAPSHOT_amd64.deb
https://download.elastic.co/beats/filebeat/filebeat-1.1.0-SNAPSHOT-x86_64.tar.gz
https://download.elastic.co/beats/filebeat/filebeat_1.1.0-SNAPSHOT_i386.deb
https://download.elastic.co/beats/filebeat/filebeat-1.1.0-SNAPSHOT-x86_64.rpm
https://download.elastic.co/beats/filebeat/filebeat-1.1.0-SNAPSHOT-darwin.tgz
https://download.elastic.co/beats/filebeat/filebeat-1.1.0-SNAPSHOT-i686.tar.gz
https://download.elastic.co/beats/filebeat/filebeat-1.1.0-SNAPSHOT-i686.rpm
https://download.elastic.co/beats/filebeat/filebeat-1.1.0-SNAPSHOT-windows.zip

Here is a python script to collect these counters and compute rates:

import argparse
import atexit
import curses
import time

import requests


def main():
    parser = argparse.ArgumentParser(
        description="Print per second stats from expvars")
    parser.add_argument("url",
                        help="The URL from where to read the values")
    args = parser.parse_args()

    # Set up curses and make sure the terminal is restored on exit
    # (otherwise a Ctrl-C leaves the shell in cbreak/noecho mode).
    stdscr = curses.initscr()
    curses.noecho()
    curses.cbreak()
    atexit.register(curses.endwin)

    last_vals = {}

    # running average for last 30 measurements
    N = 30
    avg_vals = {}
    now = time.time()

    while True:
        try:
            time.sleep(1.0)
            stdscr.erase()

            # Fetch the current counter snapshot from the expvar endpoint.
            r = requests.get(args.url)
            data = r.json()

            last = now
            now = time.time()
            dt = now - last

            for key, total in data.items():
                # expvar also exposes nested objects (e.g. memstats);
                # only flat numeric counters are of interest here.
                if isinstance(total, (int, float)):
                    if key in last_vals:
                        per_sec = (total - last_vals[key])/dt
                        if key not in avg_vals:
                            avg_vals[key] = []
                        # Keep a sliding window of the last N measurements.
                        avg_vals[key].append(per_sec)
                        if len(avg_vals[key]) > N:
                            avg_vals[key] = avg_vals[key][1:]
                        avg_sec = sum(avg_vals[key]) / len(avg_vals[key])
                    else:
                        per_sec = "na"
                        avg_sec = "na"
                    last_vals[key] = total
                    stdscr.addstr("{}: {}/s (avg: {}/s) (total: {})\n"
                                  .format(key, per_sec, avg_sec, total))
            stdscr.refresh()
        except requests.ConnectionError:
            stdscr.addstr("Waiting for connection...\n")
            stdscr.refresh()
            # Drop the history so rates don't spike once the beat is back.
            last_vals = {}
            avg_vals = {}

if __name__ == "__main__":
    main()

The counters are queried via HTTP. I assume you don't want an open port, so we make the beat bind to localhost only (in that case the python script must be run on the same machine).

run.sh:

#!/bin/sh
# Delete the old registry so filebeat reads the files from the beginning.
REGISTRY=.filebeat
rm -f $REGISTRY
# -e logs to stderr; the config file is passed as the first argument.
filebeat -e -httpprof 127.0.0.1:6060 -c $1

The -httpprof 127.0.0.1:6060 flag starts a small HTTP server that exposes the counters. The script expects the filebeat configuration file as its first argument.
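Once filebeat is up, you can check that the endpoint works by querying it directly; the counters come back as a single JSON document:

$ curl http://localhost:6060/debug/vars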

Start the python script like this (it can stay running the whole time):

$ python expvar_rates.py http://localhost:6060/debug/vars
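The script prints one line per numeric expvar every second, in this form (the counter name and numbers below are made up, just to show the format):

some_counter: 7500.0/s (avg: 7321.4/s) (total: 1250000)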

There are quite a few variables in there. I'd like to reduce the moving parts first to get an idea of the raw I/O. To do so, take your config file, comment out the logstash output plugin and enable the console output plugin instead. Also reconfigure the registry path (and adapt the shell script accordingly), so you don't overwrite your actual registry. A sketch follows below.
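For reference, a minimal test.yml could look roughly like this (the log path is a placeholder and the option names should be double-checked against your own filebeat 1.x config; registry_file deliberately points away from your real registry):

filebeat:
  prospectors:
    -
      paths:
        - /var/log/myapp/*.log
      input_type: log
  registry_file: .filebeat.test
output:
  console:
    pretty: false

With a registry path like this, REGISTRY in run.sh would become .filebeat.test as well.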

With console output enabled, it's a good idea to call run.sh like this:

$ ./run.sh test.yml > /dev/null

Consider doing the test (without redirecting to /dev/null) with only one prospector configured at a time, to rule out log files being missed due to a faulty config.

Doing this test with the NASA HTTP logs, I get about 75k lines/s with the default configuration.