200 logs/minute elastic performance


#1

Hello!

I just started with the ELK stack a few weeks ago and have spent time on the tutorial videos, hands-on exercises, etc. I'm running elasticsearch-2.3.4 and logstash-2.3.4 on a Windows 7 laptop with 8GB RAM and a 64-bit JDK 1.8.

I'm trying a new way of storing my application logs (text files in a custom format) and then analysing them with ELK - eventually all the application engineers will have access to the logs via the Elasticsearch or Kibana interfaces. To iterate faster, I'm initially experimenting with a 10MB Apache log file of about 100,000 lines.

When I run logstash with the file as input, the COMMONAPACHELOG grok filter plus the metrics filter, and only the metric written to stdout (nothing else in the output), I get a rate_1m of around 3500.
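For reference, a minimal sketch of the pipeline described above (the file path and the metric output format are my assumptions, not taken from the original config):

```conf
input {
  file {
    path => "C:/logs/access.log"    # hypothetical path to the 10MB Apache log
    start_position => "beginning"
    sincedb_path => "NUL"           # Windows equivalent of /dev/null, so the file is re-read on each run
  }
}
filter {
  grok {
    match => { "message" => "%{COMMONAPACHELOG}" }
  }
  metrics {
    meter => "events"               # produces [events][rate_1m] etc. on periodic metric events
    add_tag => "metric"
  }
}
output {
  if "metric" in [tags] {
    stdout {
      codec => line { format => "rate_1m: %{[events][rate_1m]}" }
    }
  }
}
```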

When I add elasticsearch (running on the same PC) to logstash's output, rate_1m drops to between 200 and 300. I tried configuring elasticsearch with mlockall and an ES_HEAP_SIZE of 2G and 3G (4G failed to allocate) - the rate stays between 200 and 300, with no significant change across heap sizes.
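The elasticsearch output in question would look roughly like this in Logstash 2.x (the index name is a placeholder; `flush_size` is the Logstash 2.x bulk batch size, default 500):

```conf
output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    index => "apache-logs"    # hypothetical index name
    flush_size => 5000        # larger bulk batches can help indexing throughput
  }
}
```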

My questions:

  1. In my opinion, 3500 is poor performance for logstash. A perl script on my machine processes 6,000,000 lines per minute with a similar match pattern and no output. Is this the performance I should expect, given all the extra functionality logstash offers?

  2. Assuming I'm OK with a rate of around 3500 for my application, 200 is just too big a drop when I add elasticsearch to the mix. The PC is definitely not loaded while this is running: CPU hovers between 25% and 75%, and total disk IO stays below 10MB/s, while the PC can easily do 30-40MB/s. So system resources are not the bottleneck here. What am I doing wrong, and how can I improve this rate?


(Ravi Shanker Reddy) #2

If you check the logs when Elasticsearch starts up, you'll see it comes up with 4026 file descriptors by default. Try allocating more file descriptors and test again.
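On Linux this limit is typically raised via the system limits configuration before starting Elasticsearch (this does not apply on Windows, as the next reply notes); a sketch, assuming the ES process runs as user `elasticsearch`:

```conf
# /etc/security/limits.conf (Linux only; Windows reports -1 / unlimited)
elasticsearch  soft  nofile  65536
elasticsearch  hard  nofile  65536
```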


#3

File descriptors don't look like the problem. /_nodes/stats/process shows -1, which I believe means unlimited on Windows, as per this post.

{
  "cluster_name": "elasticsearch",
  "nodes": {
    "FPJhPMnZQ8qy70QkczcUUA": {
      "timestamp": 1474685156386,
      "name": "James Dr. Power",
      "transport_address": "127.0.0.1:9300",
      "host": "127.0.0.1",
      "ip": [
        "127.0.0.1:9300",
        "NONE"
      ],
      "process": {
        "timestamp": 1474685156386,
        "open_file_descriptors": -1,
        "max_file_descriptors": -1,
        "cpu": {
          "percent": 11,
          "total_in_millis": 32136
        },
        "mem": {
          "total_virtual_in_bytes": 2369433600
        }
      }
    }
  }
}

#4

From here and the linked GitHub bug, I now understand that the rate I'm getting is around 200 logs/second, not 200 logs/minute as I wrote in the title.

I have also confirmed with various tests that the bottleneck is elasticsearch, not logstash. I'm now trying out this: https://www.elastic.co/guide/en/elasticsearch/guide/current/indexing-performance.html
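For anyone following along, the kind of settings that guide suggests for bulk indexing look roughly like this in ES 2.x (the specific values below are illustrative assumptions, not from the guide verbatim):

```yaml
# elasticsearch.yml (ES 2.x) - tuning for bulk indexing
index.refresh_interval: 30s             # default is 1s; refreshing less often speeds up indexing
indices.memory.index_buffer_size: 20%   # give the indexing buffer more heap (default 10%)
```

`refresh_interval` can also be changed dynamically per index via the index settings API, and set back to the default once the bulk load finishes.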
