Dear community,
I have a problem with slow writes to Elasticsearch from Logstash. I need to write 100K rows per second, but I get at most ~25K rows per second. Please help me find the bottleneck.
Servers
I have two servers, each with 40 cores, 120 GB RAM, RAID5 storage and a 10 Gbit LAN. Each server runs one Elasticsearch data node, two Elasticsearch master nodes and one Logstash instance.
10+ servers run Filebeat (one instance per server) and ship a large log to the Logstash instances (Filebeat→Logstash→Elasticsearch); the maximum throughput I get is ~25K rows per second. While the logs are being received, the disks on the ELK side are about 20% busy and the CPUs are about 80% idle. The JVMs use ~70% of their heap.
When I configured Logstash to write the log to a file instead (Filebeat→Logstash→file), I got 120K+ rows per second. So the bottleneck is writing to Elasticsearch.
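For reference, the file-output test replaced the elasticsearch output with something roughly like this (a sketch from memory; the exact path is illustrative):

output {
  if [type] == "serverlog" {
    file {
      # test only: write events to local disk instead of Elasticsearch
      path => "/tmp/serverlog-%{+YYYY.MM.dd}.log"
    }
  }
}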
Versions and settings
filebeat 5.2.2
filebeat:
  queue.mem:
    events: 50000000
    flush.min_events: 100000
    flush.timeout: 1s
  prospectors:
    - paths:
        - /home/trump/server.log
      input_type: log
      document_type: serverlog
  registry_file: registry

output:
  logstash:
    hosts: ["host1:5044", "host2:5044"]
    loadbalance: true
    ssl:
      certificate_authorities: ["/etc/filebeat/ca.crt"]
      certificate: "/etc/filebeat/client.crt"
      key: "/etc/filebeat/client.key"
logstash 6.2.1:
jvm:
-Xms5g
-Xmx8g
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-Djava.awt.headless=true
-Dfile.encoding=UTF-8
-XX:+HeapDumpOnOutOfMemoryError
logstash.conf:
input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate_authorities => ["/etc/logstash/certificates/ca.crt"]
    ssl_certificate => "/etc/logstash/certificates/server.crt"
    ssl_key => "/etc/logstash/certificates/server.key"
  }
}

filter {
  if [type] == "serverlog" {
    grok {
      patterns_dir => ["/etc/logstash/grok-patterns"]
      break_on_match => false
      match => [ "message", "\[%{NUMBER:pid}:%{BASE16NUM:thread}\]\[%{OURTIMESTAMP:timestamp} %{NOTSPACE}\] \[%{NOTSPACE}\] \(%{NOTSPACE}\): OpLog:read:%{NOTSPACE:user}\|%{GREEDYDATA:categories}" ]
    }
    date {
      match => [ "timestamp", "dd.MM.YYYY HH:mm:ss.SSSSSS" ]
      remove_field => [ "timestamp" ]
    }
  }
}

output {
  if [type] == "serverlog" {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "testlog-%{+YYYY.MM.dd}"
      document_type => "%{[@metadata][type]}"
    }
  }
}
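If it helps with diagnosis, the per-plugin timings from the Logstash monitoring API should show whether the time is spent in the grok filter or in the elasticsearch output (standard API, assuming the default monitoring port 9600):

curl -s 'http://localhost:9600/_node/stats/pipelines?pretty'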
elasticsearch 6.2.1:
jvm data node:
-Xms30g
-Xmx30g
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+AlwaysPreTouch
-Xss1m
-Djava.awt.headless=true
-Dfile.encoding=UTF-8
-Djna.nosys=true
-XX:-OmitStackTraceInFastThrow
-Dio.netty.noUnsafe=true
-Dio.netty.noKeySetOptimization=true
-Dio.netty.recycler.maxCapacityPerThread=0
-Dlog4j.shutdownHookEnabled=false
-Dlog4j2.disable.jmx=true
-Djava.io.tmpdir=${ES_TMPDIR}
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/var/lib/elasticsearch
8:-XX:+PrintGCDetails
8:-XX:+PrintGCDateStamps
8:-XX:+PrintTenuringDistribution
8:-XX:+PrintGCApplicationStoppedTime
8:-Xloggc:/var/log/elasticsearch/gc.log
8:-XX:+UseGCLogFileRotation
8:-XX:NumberOfGCLogFiles=32
8:-XX:GCLogFileSize=64m
9-:-Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m
9-:-Djava.locale.providers=COMPAT
elasticsearch.yml data node:
path.data: /srv/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true

# Network configuration
network.bind_host: 0.0.0.0
network.publish_host: 192.168.0.101
http.port: 9200
transport.tcp.port: 9300

# Cluster configuration
cluster.name: logserver
node.name: data1
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ["192.168.0.101:9300", "192.168.0.101:9301", "192.168.0.101:9302", "192.168.0.102:9300", "192.168.0.102:9301", "192.168.0.102:9302"]
node.master: false
node.data: true
node.ingest: false
search.remote.connect: false
jvm master node:
-Xms2g
-Xmx2g
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+AlwaysPreTouch
-server
-Xss1m
-Djava.awt.headless=true
-Dfile.encoding=UTF-8
-Djna.nosys=true
-Djdk.io.permissionsUseCanonicalPath=true
-Dio.netty.noUnsafe=true
-Dio.netty.noKeySetOptimization=true
-Dio.netty.recycler.maxCapacityPerThread=0
-Dlog4j.shutdownHookEnabled=false
-Dlog4j2.disable.jmx=true
-Dlog4j.skipJansi=true
-XX:+HeapDumpOnOutOfMemoryError
elasticsearch.yml master node:
path.logs: /var/log/elasticsearch
#bootstrap.memory_lock: true

# Network configuration
network.bind_host: 0.0.0.0
network.publish_host: 192.168.0.101
http.port: 9201
transport.tcp.port: 9301

# Cluster configuration
cluster.name: logserver
node.name: master1
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ["192.168.0.101:9300", "192.168.0.101:9301", "192.168.0.101:9302", "192.168.0.102:9300", "192.168.0.102:9301", "192.168.0.102:9302"]
node.master: true
node.data: false
node.ingest: false

# Search pool
thread_pool.search.size: 20
thread_pool.search.queue_size: 100

# Bulk pool
thread_pool.bulk.size: 33
thread_pool.bulk.queue_size: 300

# Index pool
thread_pool.index.size: 20
thread_pool.index.queue_size: 100

# Indices settings
indices.memory.index_buffer_size: 30%
indices.memory.min_index_buffer_size: 96mb
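Since the bulk thread pool is tuned above, one thing worth checking is whether the data nodes are rejecting bulk requests; the standard _cat thread_pool API shows that, for example (assuming the second data node listens on 192.168.0.102:9200; a non-zero "rejected" count would mean Elasticsearch is pushing back on the bulks):

curl -s 'http://192.168.0.101:9200/_cat/thread_pool/bulk?v&h=node_name,name,active,queue,rejected,completed'
curl -s 'http://192.168.0.102:9200/_cat/thread_pool/bulk?v&h=node_name,name,active,queue,rejected,completed'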
Index settings:
"codec": "best_compression", "refresh_interval": "1m", "number_of_shards": "5",
How can I achieve better performance? Thank you.