Slow Indexing rate


(Ramky) #1

I am trying to index JSON events with 16 string fields each, and the maximum rate I am able to reach is 5k EPS. The JSON events are fed to Elasticsearch by traversing a list of strings loaded from a file. The Elasticsearch cluster consists of 3 nodes, each with 32 GB RAM and an 8-core CPU.

We are performing bulk indexing with the following settings:

index.number_of_shards=5
index.number_of_replicas=1
index.translog.flush_threshold_ops=50000
index.refresh_interval=30
indices.memory.index_buffer_size=50%
indices.fielddata.cache.size=20%
indices.fielddata.cache.expire=1h
indices.cache.filter.size=20%
indices.cache.filter.expire=1h
bulkprocessor.BulkActions=25000
bulkprocessor.BulkSize=15
bulkprocessor.FlushInterval=15
bulkprocessor.ConcurrentRequests=10

Please help me increase the indexing speed.

Regards
Rama Krishna P


(Adrien Grand) #2

When the bulk test is running, is Elasticsearch maxing out CPU and/or I/O? If not then maybe you just need to send data from more threads?
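As a rough illustration of what "sending data from more threads" could look like, here is a minimal Python sketch that splits an in-memory list of events into batches and submits them from a thread pool. `send_bulk` is a hypothetical stand-in for whatever client call actually ships a batch to Elasticsearch (here it only counts events), and the batch size and worker count are arbitrary assumptions, not recommendations.

```python
from concurrent.futures import ThreadPoolExecutor


def send_bulk(batch):
    """Hypothetical placeholder for a real bulk request; returns the batch size."""
    return len(batch)


def chunks(events, size):
    """Yield consecutive slices of `events` with at most `size` items each."""
    for start in range(0, len(events), size):
        yield events[start:start + size]


def index_in_parallel(events, batch_size=5000, workers=4):
    """Send batches from `workers` threads and return the number of events sent."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(send_bulk, chunks(events, batch_size)))


# Synthetic events standing in for the list loaded from the file.
events = ['{"field": "value %d"}' % i for i in range(20000)]
print(index_in_parallel(events))
```

If throughput does not rise as `workers` grows, the limit is probably on the client side or in the network rather than in the cluster.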


(Ramky) #3

Thanks for the reply.
Indexing 1 million events with a single thread, the EPS (events per second) is around 7k, but with 4 threads it drops to around 1.7k.

Memory consumption is 28% (nearly 9 GB) and the CPU is 96% idle; maximum CPU usage is 10%.
Of the 1 million events, the first 700k are indexed at around 40k EPS, but the overall processing rate is only around 7k EPS.
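Taking these figures at face value, a quick calculation shows how slow the last 300k events must be for the overall rate to fall to 7k EPS. This is only arithmetic on the rounded numbers quoted above, so the result is approximate.

```python
TOTAL_EVENTS = 1_000_000
OVERALL_EPS = 7_000      # reported overall rate
FAST_EVENTS = 700_000    # events indexed during the fast phase
FAST_EPS = 40_000        # reported rate during the fast phase

total_seconds = TOTAL_EVENTS / OVERALL_EPS
fast_seconds = FAST_EVENTS / FAST_EPS
tail_seconds = total_seconds - fast_seconds
tail_eps = (TOTAL_EVENTS - FAST_EVENTS) / tail_seconds

print(f"whole run: {total_seconds:.1f} s, fast phase: {fast_seconds:.1f} s")
print(f"last 300k events: {tail_seconds:.1f} s, about {tail_eps:.0f} EPS")
```

In other words, the fast phase takes under 20 seconds, and the remaining 300k events account for roughly two minutes at about 2.4k EPS, so the slowdown at the tail dominates the overall number.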

Below are the I/O statistics taken from one machine.
Before:
avg-cpu: %user %nice %system %iowait %steal %idle
0.54 0.00 0.13 1.70 0.00 97.63

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 11.45 458.74 199.38 722502 314016
dm-0 25.07 434.87 105.50 684914 166160
dm-1 0.20 1.64 0.00 2576 0
dm-2 9.32 18.01 72.04 28370 113456

During indexing:
avg-cpu: %user %nice %system %iowait %steal %idle
1.27 0.00 0.14 3.81 0.00 94.78

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 10.81 358.03 1294.19 722886 2613096
dm-0 19.67 339.36 83.17 685194 167928
dm-1 0.16 1.28 0.00 2576 0
dm-2 149.50 14.09 1193.99 28458 2410768


(Adrien Grand) #4

So your system looks essentially idle. There must be an issue with the benchmark somehow. Maybe the bottleneck is reading data from your file?
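To put a number on "essentially idle", the write rates from the "During indexing" sample in the previous post can be converted to MiB/s. This sketch assumes the output comes from iostat, where a block is 512 bytes; if the tool or block size differs, the figures change accordingly.

```python
BLOCK_BYTES = 512  # assumption: iostat-style 512-byte blocks

# Blk_wrtn/s values copied from the "During indexing" sample
writes_per_sec = {"sda": 1294.19, "dm-2": 1193.99}


def blocks_to_mib(blocks_per_sec, block_bytes=BLOCK_BYTES):
    """Convert a blocks-per-second rate to MiB per second."""
    return blocks_per_sec * block_bytes / (1024 * 1024)


for device, blocks in writes_per_sec.items():
    print(f"{device}: {blocks_to_mib(blocks):.2f} MiB/s written")
```

Both devices come out well under 1 MiB/s of writes, which is consistent with the low %iowait in the same sample.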

Also, for the record, setting a time expiry on the fielddata and filter caches is almost always wrong: fielddata entries are very expensive to regenerate, and filter entries can't go outdated because they are cached per segment.


(Ramky) #5

Thanks for the reply.

The file is loaded only once and kept in memory, and the stats are collected after the file is loaded. So I don't think reading data from the file is the bottleneck.

Even after disabling the expiry settings on the fielddata and filter caches, there is no change in the indexing rate.


(Adrien Grand) #6

Then I'm very confused why you can't max out either I/O or CPU on your server. There must be something wrong somewhere... Are you using bulk or separate indexing requests to load data? Do you have good network connectivity between your server and the machine that runs the benchmark?

Right, I didn't expect it to change the indexing rate; this was more of a side note.


(Ramky) #7

I created one transport client object for the 3-node cluster and send JSON events from a java.util.List, loaded from a file, to the cluster using the bulk index API.

The following parameters are used for the bulk index API:
bulkprocessor.BulkActions=10000
bulkprocessor.BulkSize=10
bulkprocessor.FlushInterval=20
bulkprocessor.ConcurrentRequests=10

I tried multiple approaches but was unable to increase CPU or memory utilization, or the indexing rate.

Moreover, all machines are on a 1 Gbps local LAN.
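As a sanity check on the network, this small sketch estimates how large an average event would have to be for the reported rates to fill a 1 Gbps link. The actual event size is not stated in this thread, and the calculation ignores protocol overhead and replica traffic between nodes, so treat it as an upper bound only.

```python
LINK_BITS_PER_SEC = 1_000_000_000          # 1 Gbps LAN, as stated
LINK_BYTES_PER_SEC = LINK_BITS_PER_SEC / 8  # 125,000,000 bytes/s


def saturating_event_bytes(eps):
    """Average event size (bytes) at which `eps` events/s would fill the link."""
    return LINK_BYTES_PER_SEC / eps


# Rates reported earlier in this thread
for eps in (5_000, 7_000):
    size = saturating_event_bytes(eps)
    print(f"{eps} EPS fills the link at about {size:.0f} bytes per event")
```

Unless the 16-field events are tens of kilobytes each, raw link bandwidth is unlikely to be the limit at these rates.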


(Adrien Grand) #8

Sorry I don't have more ideas about what is wrong, but there certainly is something...


(Ramky) #9

Thanks for your help in trying to address the issue. After some trial and error, I was able to get 50k EPS.


(Chang Oh Heo) #10

Hi, I have a similar problem.
Could you let me know how you solved it?


(Chen Jian) #11

I have a similar issue: the ES data node CPU is idle and the index rate is extremely slow. Any suggestions?


(Christian Dahlqvist) #12

Please open a new thread for your question and provide more details around your setup and achieved performance.

