Horizontal scaling of indexing

tinle · October 11, 2015, 12:09am

Glad to hear someone else is seeing the same problems we are. We're using slightly different HW with similar results.

See my old post here:

Once we got past the testing harness setup, we were able to reproduce the slow indexing performance internally.

Our HW is:

Virident PCIe SSD card (configured for performance), 1.8TB
64GB RAM
2x 12-core Xeon (HT on, so the equivalent of 48 cores)
5 physical bare-metal nodes x 2 sets, for faster testing of various parameter combinations
ES v1.7.2
JDK 8u60 (also tested with JDK7u51, JDK8u40)
Tested various maxheap from 16G to 31G.
mlockall on
max fd is 64K
refresh interval is -1
index.store.throttle.type: none
index.store.throttle.max_bytes_per_sec: 700mb
index.translog.flush_threshold_size: 1gb
indices.memory.index_buffer_size: 512mb
5 shards so we get 1 per node
no replica
various doc size from 1k to 16K
same data set on a RAM disk, so we always read the same data via the Logstash file input
tested with 1 LS instance, 5 instances, 20 instances, etc.
Various bulk indexing sizes (100, 500, 1000, 5000, 10000, etc.).
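
For anyone who wants to reproduce the setup, here is a rough sketch of the index-level part of the list above as a create-index request body. The function name `index_settings` is made up for illustration, and which settings belong at index level is my assumption: as far as I know, indices.memory.index_buffer_size and mlockall are node-level and would go in elasticsearch.yml instead.

```python
import json


def index_settings(shards=5, replicas=0):
    """Build a create-index body with the index-level settings from the list.

    Hypothetical helper: values are copied from the list above, and
    node-level settings (index buffer size, mlockall) are left out on purpose.
    """
    return {
        "settings": {
            "index.number_of_shards": shards,      # 5 shards, 1 per node
            "index.number_of_replicas": replicas,  # no replica
            "index.refresh_interval": "-1",        # refresh disabled
            "index.store.throttle.type": "none",
            "index.store.throttle.max_bytes_per_sec": "700mb",
            "index.translog.flush_threshold_size": "1gb",
        }
    }


if __name__ == "__main__":
    # Print the JSON that would be sent as the body of the create-index call.
    print(json.dumps(index_settings(), indent=2))
```

The printed JSON is what you would send when creating the test index; nothing here talks to a cluster.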

Our conclusion is that I/O, CPU and memory are not the problem. We always hit a limit in how fast ES can index.

How are you ingesting data? Logstash, or your own client doing bulk inserts? You can try increasing the number of instances feeding ES. We noticed a slight increase in indexing speed, but the gain falls off beyond 10 concurrent LS instances feeding the 5 ES nodes.
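
If you are writing your own client, a minimal sketch of the "more concurrent feeders" idea could look like this. It is not what we ran (we used Logstash); the names `chunked`, `bulk_body` and `index_concurrently`, the index name "logs" and the type "event" are all placeholders, and `send` is a callable you would supply that POSTs one body to the _bulk endpoint.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def chunked(docs, size):
    """Yield successive lists of at most `size` documents."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def bulk_body(batch, index, doc_type):
    """Serialize one batch as newline-delimited JSON: action line, then source line."""
    action = json.dumps({"index": {"_index": index, "_type": doc_type}})
    lines = []
    for doc in batch:
        lines.append(action)
        lines.append(json.dumps(doc))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


def index_concurrently(docs, send, index="logs", doc_type="event",
                       batch_size=1000, workers=5):
    """Send bulk bodies through `send` from `workers` threads.

    `send` is a caller-supplied (hypothetical) function taking one bulk body.
    Note that pool.map submits every batch up front, so all serialized
    bodies are held in memory. Returns the number of batches sent.
    """
    bodies = (bulk_body(b, index, doc_type) for b in chunked(docs, batch_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(1 for _ in pool.map(send, bodies))
```

Varying `batch_size` and `workers` is the client-side equivalent of the bulk sizes and LS instance counts we tested.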
