How to improve Filebeat -> Elastic performance and reduce Elastic store size? Getting ~9433 logs per second


(Vramakrishnan) #1

I have the following setup on a Mac with an i7 (8-core machine).

Filebeat (reading local /tmp/perf/.. logs) --> Elastic single instance docker

I was only able to get ~9400 logs/sec indexed into Elastic, and the store.size for 200MB of log data is 1.1GB, roughly 4.5x overhead. The configs are listed at the end. I am sure my configs are not optimal, and I would like to hear from the community how they tune the configs for performance and efficient storage in Elastic (a trade-off).

  1. How can I improve the throughput for this test case in a dev environment? I have configured spool size, bulk_max_size, workers, etc. (This is not a production-like config, but I would like to understand the constraints and performance of this setup before I scale it up with a client/data/master node setup.)

  2. How can I reduce the Elastic store size of 1.1G, i.e. 4.5x overhead for 200M of log data?

Any pointers would really help.

Total logs exported: 200MB of log data (2000000 log lines in total)

[perf] $ cd /tmp/perf/ ; ls -ltr
total 460512
-rw-r--r-- 1 xyz wheel 117888896 Jul 28 16:02 nphal.log
-rw-r--r-- 1 xyz wheel 117888896 Jul 28 16:02 npagent.log

Time taken to index 2M log records (200MB in total size)

[perf] $ time watch curl http://127.0.0.1:9200/_cat/indices?v

real 3m32.463s
user 0m0.733s
sys 0m0.697s

Indexing rate for 2M entries: 9433 logs/second
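The rate quoted above can be rechecked from the figures in the post. The sketch below is only arithmetic on those numbers (line count, the two file sizes from `ls -ltr`, and the `real` time); since the run was timed by hand with `time watch`, the elapsed time probably includes some slack, so the result is an approximation.

```python
# Figures copied from the post; nothing here is measured independently.
TOTAL_LINES = 2_000_000
FILE_BYTES = 2 * 117_888_896      # npagent.log + nphal.log, from `ls -ltr`
ELAPSED_S = 3 * 60 + 32.463       # "real 3m32.463s"


def rate(total, seconds):
    """Average units per second over the whole run."""
    return total / seconds


lines_per_s = rate(TOTAL_LINES, ELAPSED_S)
mib_per_s = rate(FILE_BYTES, ELAPSED_S) / (1024 * 1024)
avg_line_bytes = FILE_BYTES / TOTAL_LINES

print(f"{lines_per_s:.0f} lines/s, {mib_per_s:.2f} MiB/s, "
      f"{avg_line_bytes:.0f} bytes per line on average")
```

The result lands close to the ~9400 logs/sec quoted in the post, and it shows that the raw data rate is only around 1 MiB/s, so per-event overhead is a more likely limit than disk or network bandwidth.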

2M logs indexed in Elastic

[perf] $ curl http://127.0.0.1:9200/_cat/indices?v | grep agent
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 1000 100 1000 0 0 69405 0 --:--:-- --:--:-- --:--:-- 71428
yellow open agent.logs.2017.07.28 twGpthLpRZKe5-pGO0YlLw 5 1 2000000 0 1.1gb 1.1gb
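For the storage question, a rough ratio can be computed from the `_cat/indices` row and the file sizes listed earlier. This sketch assumes that the `1.1gb` shown by Elasticsearch is in binary units (1gb = 1024^3 bytes) and that the value is rounded, so the ratio is approximate.

```python
# Inputs copied from the post; "1.1gb" is a rounded figure.
RAW_BYTES = 2 * 117_888_896       # both log files, from `ls -ltr`
STORE_BYTES = 1.1 * 1024 ** 3     # store.size, assumed to be binary gigabytes
DOCS = 2_000_000                  # docs.count

store_to_raw = STORE_BYTES / RAW_BYTES
bytes_per_doc = STORE_BYTES / DOCS

print(f"store/raw = {store_to_raw:.1f}x, about {bytes_per_doc:.0f} bytes per document")
```

Measured against the actual file sizes rather than the rounded 200MB, the index is about five times the size of the raw logs, or a little under 600 bytes per indexed document.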

Filebeat config

    filebeat.prospectors:
      - input_type: log
        paths:
          - /tmp/perf/npagent.log
        symlinks: true
        scan_frequency: 500ms
      - input_type: log
        paths:
          - /tmp/perf/nphal.log
        symlinks: true
        scan_frequency: 500ms

    filebeat.spool_size: 65536

    output.elasticsearch:
      worker: 8
      bulk_max_size: 4096
      hosts: ["127.0.0.1:9200"]
      index: "agent.logs.%{+YYYY.MM.dd}"

ELASTIC docker

    services:
      elastic:
        image: docker.elastic.co/elasticsearch/elasticsearch:5.4.1
        container_name: elastic
        environment:
          - ES_JAVA_OPTS=-Xms1g -Xmx1g -Xmn500m -XX:MaxMetaspaceSize=500m
        mem_limit: 1g
        ports:
          - 9200:9200

I also enabled memory and CPU profiling for Filebeat. Most of the time is spent in the JSON encoder and runtime.malloc.

Memory profiling shows the following:

$ go tool pprof -inuse_space filebeat /tmp/perf/mem.txt
Entering interactive mode (type "help" for commands)
(pprof) list
Total: 37.37MB

# runtime.MemStats
# Alloc = 39052504
# TotalAlloc = 25032945592
# Sys = 610914448
# Lookups = 100
# Mallocs = 320146182
# Frees = 320135121
# HeapAlloc = 39052504
# HeapSys = 566919168
# HeapIdle = 525926400
# HeapInuse = 40992768
# HeapReleased = 0
# HeapObjects = 11061
# Stack = 1212416 / 1212416
# MSpan = 85280 / 9371648
# MCache = 9600 / 16384
# BuckHashSys = 1475739
# NextGC = 75925642
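One way to read these counters is per event. The sketch below divides the cumulative allocation counters by the number of log lines. It assumes the profile was captured after all 2,000,000 events had been published, which the post does not state, so treat the per-event figures as rough.

```python
# Counters copied from the runtime.MemStats dump above.
TOTAL_ALLOC = 25_032_945_592   # cumulative bytes allocated (TotalAlloc)
MALLOCS = 320_146_182          # cumulative allocation count (Mallocs)
HEAP_INUSE = 40_992_768        # bytes in in-use heap spans (HeapInuse)
EVENTS = 2_000_000             # assumption: all events were already processed

alloc_bytes_per_event = TOTAL_ALLOC / EVENTS
mallocs_per_event = MALLOCS / EVENTS
heap_inuse_mib = HEAP_INUSE / (1024 * 1024)

print(f"{alloc_bytes_per_event:.0f} bytes and {mallocs_per_event:.0f} allocations "
      f"per event; live heap about {heap_inuse_mib:.0f} MiB")
```

Under that assumption, each log line of roughly 118 bytes triggers on the order of 12KB of short-lived allocations, which fits the observation that much of the CPU time goes to the JSON encoder and runtime.malloc while the live heap stays small.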


(Mark Walkom) #2

Have you customised the mappings in Elasticsearch?
Have you issued a force merge on the index?
Did you enable best compression on the index?


(Vramakrishnan) #3

Hi Mark

Thanks for the response.

I attempted to configure mappings, but I have to admit I am a newbie to Elastic, so any config pointers for the use case below would help.

The test log messages have the format shown below (log context plus a msg field holding a randomly generated string).
The key attributes used for querying are module and level, with the option of full-text search on the msg field.

{"caller":"log.go:124","level":"info","module":"Npagent","msg":"1 data-oTNLSNOyAclw7f3YRyuP0tznwpMKL_vbOsLFHMglKTNLNYV2Gf2tyFp88nhsRVI42WS3YoNFOwIjOPQTCboRqA==","pid":"809","ts":"2017-07-31T06:18:34.040851534Z"}
{"caller":"log.go:124","level":"info","module":"Npagent","msg":"2 data-1Z10GCByPE_teIVDAg9vDOj-0ZdZiAmZzb3Ra0_SrJLtSA-gzSSIJ6rYwAHQy7I1RwAJD98CPVwH9KP1tm3WXw==","pid":"809","ts":"2017-07-31T06:18:34.041119998Z"}

Can you please suggest the right mapping for this use case?

Regarding the other settings, I assume these are the ones you are suggesting:

  • index.codec: best_compression

  • curl -XPOST 'localhost:9200/my_index/_forcemerge?max_num_segments=5'. Is this good enough to test?

  • How about the index refresh interval? Should I set it to 10s or more?
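The three settings listed above could be combined roughly as follows. This is a sketch that only builds and prints the request bodies and paths; `my_index` is the placeholder name from the curl example, and `10s` is the refresh interval floated in the question, not a recommendation. Note that `index.codec` is a static setting, so it has to be supplied when the index is created (or while the index is closed), whereas `index.refresh_interval` can be changed on a live index.

```python
import json

INDEX = "my_index"  # placeholder index name taken from the curl example


def create_body(refresh_interval="10s"):
    """Index settings to send with the create-index request."""
    return {
        "settings": {
            "index": {
                "codec": "best_compression",           # static setting
                "refresh_interval": refresh_interval,  # dynamic setting
            }
        }
    }


def forcemerge_path(index, max_num_segments=1):
    """Path of the force-merge call to issue once indexing has finished."""
    return f"/{index}/_forcemerge?max_num_segments={max_num_segments}"


print(f"PUT /{INDEX}")
print(json.dumps(create_body(), indent=2))
print(f"POST {forcemerge_path(INDEX)}")
```

The printed body can be sent with `curl -XPUT` before Filebeat starts writing, and the force merge is issued after the load has completed.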


(Mark Walkom) #4

You should be using the appropriate data type for each field, e.g. integer for numbers, date for timestamps, etc., and use keyword as much as possible.

Make sure you merge to a single segment.

Also look at your heap, with 1GB it's not going to be super fast.
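Applied to the sample JSON lines from the earlier post, the advice on data types might translate into a mapping like the one sketched below. Several things here are assumptions: that the JSON fields are decoded into top-level fields before indexing (the posted Filebeat config does not show this), that the mapping type is `log` (Filebeat 5.x's default document type), and that the default date format accepts the nanosecond fraction in `ts`. The field choices follow the stated query pattern: exact matches on `module` and `level`, full-text search on `msg`.

```python
import json

# Assumption: events are indexed under the type "log"; change this if the
# events use a different document type.
DOC_TYPE = "log"


def suggested_mapping(doc_type=DOC_TYPE):
    """Mapping sketch for the sample fields caller, level, module, msg, pid, ts."""
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    "caller": {"type": "keyword"},
                    "level": {"type": "keyword"},
                    "module": {"type": "keyword"},
                    # "pid" arrives as a string ("809"); an integer field
                    # coerces numeric strings by default.
                    "pid": {"type": "integer"},
                    "ts": {"type": "date"},
                    # Analyzed text, to allow full-text search on msg.
                    "msg": {"type": "text"},
                }
            }
        }
    }


print(json.dumps(suggested_mapping(), indent=2))
```

Whether `msg` should stay analyzed is a trade-off: a `text` field supports full-text search but adds to the index size, which matters for the storage question in this thread.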


(Vramakrishnan) #5

Hi Mark

I ran the tests with the suggested config changes:

  • force merged to 1 segment
  • doubled the Java heap size to 2G; beyond that, performance started degrading
  • created an index mapping for the fields
  • configured the codec as best_compression. However, it does not show up in GET _settings when I also define the mapping; without the mapping, the codec setting does show up.
  • without the mapping, enabling best_compression saved only around 100MB (store size went from 1.1G to 1G for 200M of raw log data). What percentage savings should I expect from compression in my case?

Throughput improved from 9433 logs/sec to 13605 logs/sec, but I was hoping for much better throughput.

Looks like I am missing something here. Any pointers are appreciated.

Sample log lines (logfmt)

level=info msg="1 data-nN3xS7dpe-mFVdNdToQZQC1jbGmx0c1uxlRLrM_JgSeZNKptthy3EBm2JiMZHoBK6aU-nb46Cit8AlPx-vx4vQ=="
level=info msg="2 data-d_Ihmpm968s5P2Mw31xwqHHtx47BCrfowubA12EXehXBtMJ5wXrH5a9DGy6JhgqnrqNKYm4Y3UopcTdwP-WJEw=="
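On the question of expected compression savings: the `msg` payloads above look like URL-safe base64 of random bytes, which is close to incompressible. The sketch below generates similar lines and compresses them with zlib (DEFLATE, the algorithm that `best_compression` uses for stored fields in place of the default LZ4). It is a stand-in experiment on synthetic data, not a measurement of Elasticsearch itself, and the 64-byte payload size is an assumption based on the length of the sample strings.

```python
import base64
import os
import zlib


def sample_line(i):
    """Build a logfmt line shaped like the samples above (random payload)."""
    payload = base64.urlsafe_b64encode(os.urandom(64)).decode("ascii")
    return f'level=info msg="{i} data-{payload}"\n'


def deflate_ratio(n_lines=20000):
    """Return compressed size divided by raw size for n_lines synthetic lines."""
    raw = "".join(sample_line(i) for i in range(1, n_lines + 1)).encode("ascii")
    packed = zlib.compress(raw, 9)
    return len(packed) / len(raw)


ratio = deflate_ratio()
print(f"compressed/raw = {ratio:.2f}")
```

Each line carries 64 random bytes that no codec can shrink, so most of the line survives compression. Stored fields are also only one part of the index, which may help explain why the store size dropped by only about 100MB; logs with real, repetitive messages would likely compress much better.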

Created the index with the following settings:

    {
        "index": {
            "codec": "best_compression"
        },
        "mappings": {
            "logs": {
                "properties": {
                    "level": { "type": "string", "index": "not_analyzed" },
                    "msg": { "type": "string", "index": "not_analyzed" }
                }
            }
        }
    }

GET _settings output:

    "settings": {
        "index": {
            "creation_date": "1501574742061",
            "number_of_shards": "5",
            "number_of_replicas": "1",
            "uuid": "4QsYMsl8Q3-0MXZtXOQQkg",
            "version": {
                "created": "5050099"
            },
            "provided_name": "agent.logs"
        }
    }
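The missing codec in this output may come from how the create-index body is structured. As far as I understand the 5.x create-index API, top-level keys are treated as settings only when the body contains none of the `settings`, `mappings` or `aliases` sections; once `mappings` is present, a top-level `index` block is no longer read as settings. That would match the behaviour described above, but it is an interpretation and has not been verified against this cluster. The sketch below rewrites the posted body so that the codec sits under `settings`; `nest_settings` is a hypothetical helper written for this illustration.

```python
import json

# Body as posted above.
posted = {
    "index": {"codec": "best_compression"},
    "mappings": {
        "logs": {
            "properties": {
                "level": {"type": "string", "index": "not_analyzed"},
                "msg": {"type": "string", "index": "not_analyzed"},
            }
        }
    },
}

KNOWN_SECTIONS = {"settings", "mappings", "aliases"}


def nest_settings(body):
    """Return a copy of body with stray top-level keys moved under "settings"."""
    fixed = {k: v for k, v in body.items() if k in KNOWN_SECTIONS}
    stray = {k: v for k, v in body.items() if k not in KNOWN_SECTIONS}
    if stray:
        settings = dict(fixed.get("settings", {}))
        settings.update(stray)
        fixed["settings"] = settings
    return fixed


fixed = nest_settings(posted)
print(json.dumps(fixed, indent=2))
```

If the interpretation holds, creating the index with the rewritten body should make `"codec": "best_compression"` appear in GET _settings alongside the mapping.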

Ran a single-node Elastic docker container with this config:

    version: '2'
    services:
      elasticsearch1:
        image: docker.elastic.co/elasticsearch/elasticsearch:5.5.0
        container_name: elasticsearch1
        environment:
          - cluster.name=elastic-cluster
          - bootstrap.memory_lock=true
          - xpack.security.enabled=false
          - "ES_JAVA_OPTS=-Xms2g -Xmx2g"
        ulimits:
          memlock:
            soft: -1
            hard: -1
          nofile:
            soft: 65536
            hard: 65536
        mem_limit: 4g
        cap_add:
          - IPC_LOCK
        volumes:
          - esdata1:/usr/share/elasticsearch/data
        ports:
          - 9200:9200
        networks:
          - esnet

