Degradation performance bulk insert

swood · June 13, 2015, 10:33am

Hello.

I use Elasticsearch to collect application logs, and the log volume is huge. I have 17 physical servers with 8 nodes on each server. Each server has 4 disks.
My config looks like this:

discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.timeout: 5s
discovery.zen.ping.unicast.hosts: ["192.168.3.23:9300"]
gateway.expected_nodes: 2
gateway.recover_after_nodes: 3
gateway.recover_after_time: 5m
gateway.type: local
index.indexing.slowlog.threshold.index.debug: 2s
index.indexing.slowlog.threshold.index.info: 5s
index.indexing.slowlog.threshold.index.trace: 500ms
index.indexing.slowlog.threshold.index.warn: 10s
index.number_of_replicas: 2
index.number_of_shards: 4
index.search.slowlog.threshold.fetch.debug: 500ms
index.search.slowlog.threshold.fetch.info: 800ms
index.search.slowlog.threshold.fetch.trace: 200ms
index.search.slowlog.threshold.fetch.warn: 1s
index.search.slowlog.threshold.query.debug: 2s
index.search.slowlog.threshold.query.info: 5s
index.search.slowlog.threshold.query.trace: 500ms
index.search.slowlog.threshold.query.warn: 10s
monitor.jvm.gc.young.debug: 400ms
monitor.jvm.gc.young.info: 700ms
monitor.jvm.gc.young.warn: 1000ms
network.bind_host: 192.168.3.7
network.publish_host: 192.168.3.7
node.data: true
node.master: true
node.name: "server_1"
path.data: /var/www/elastic,/var/www/elastic,/var/www/elastic,/var/www/elastic
path.logs: /var/log/elasticsearch
#
transport.tcp.port: 9300
http.port: 9200
cluster.routing.allocation.disk.watermark.low: 1gb
cluster.routing.allocation.disk.watermark.high: 500mb
cluster.routing.allocation.node_concurrent_recoveries: 4
cluster.routing.allocation.node_initial_primaries_recoveries: 8
indices.recovery.concurrent_streams: 8
indices.recovery.max_bytes_per_sec: 100mb
threadpool.bulk.queue_size: 50000
index.query.default_field: host
refresh_interval: 1m
index.translog.interval: 30
index.translog.flush_threshold_ops: 50000
index.translog.flush_threshold_size: 512m
indices.memory.index_buffer_size: 30%
index.store.type: mmapfs

Each node uses its own path on all disks.
To insert data I use Logstash. To ship data from my servers to Logstash I use logstash_forwarder with spool-size=512.

When the cluster has only a few nodes, everything works fine. But as the cluster grows, Logstash starts waiting on ES and logs messages like this:

"Failed to flush outgoing items", :outgoing_count=>4872, :exception=>java.lang.OutOfMemoryError: Java heap space, :backtrace=>[], :level=>:warn}"

My attempts to increase the memory available to Logstash have not helped.
Maybe my Elasticsearch configuration is wrong?

warkolm · June 14, 2015, 2:40am

If you are getting this error then this is a LS problem, not an ES one.

swood · June 29, 2015, 7:57pm

Hello.

Yes, you're right. This is a complex problem.
For now I've changed index.refresh_interval to 15m and increased the number of physical servers to 50.
But I have about 600 clients sending data to LS and ES. About 5 minutes after the LS process starts, it stops. It is waiting on ES, but I don't know why.
What could I be doing wrong?

warkolm · June 29, 2015, 9:49pm

A few things. Having 4 path.data entries that are the same location won't do anything. Increasing the threadpools like that will likely cause more problems than it is worth. Setting index.translog.interval to that means 30 milliseconds, not seconds.
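As a rough sketch, the three config points above could be addressed along these lines. The mount points are hypothetical placeholders (the original post does not say where the four disks are mounted), and the values are illustrative rather than tuned recommendations:

```yaml
# Sketch only: /mnt/disk1 .. /mnt/disk4 are assumed mount points, substitute your own.
# One path.data entry per physical disk, each pointing at a different location.
path.data: /mnt/disk1/elastic,/mnt/disk2/elastic,/mnt/disk3/elastic,/mnt/disk4/elastic

# Give the interval an explicit unit; a bare 30 is read as 30 milliseconds.
index.translog.interval: 30s

# Leave the bulk queue at its default instead of raising it to 50000.
# threadpool.bulk.queue_size: 50000
```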

You're also likely to be running into IO contention with that many nodes on that many disks, which won't help.

How much heap have you assigned to LS, to ES?
How much data do you have in the cluster?
