ES 2.1.1 poor indexing performance compared to 1.7.3

zamblauskas · January 6, 2016, 10:46am

I'm trying to understand why ES 2.x has such poor indexing performance compared to the older 1.x.
Vanilla ES install, simple index request:

HDD machine: 1.7.3 ~20ms, 2.1.1 ~100ms.
SSD machine: 1.7.3 ~20ms, 2.1.1 ~25ms.

It seems to be related to some disk operation.
Are there any configuration options we should consider to keep the performance similar to what it was with ES 1.x?
We are running a single node and using it for real-time application log monitoring (ELK stack), so we are doing mostly index operations.
Any other suggestions would be appreciated.

tar -xvf elasticsearch-1.7.3.tar.gz
cd elasticsearch-1.7.3/
./bin/elasticsearch

time curl -XPOST 'http://localhost:9200/a/a' -d '{}'
{"_index":"a","_type":"a","_id":"AVIWY3QLS4y0FaHRm6Ax","_version":1,"created":true}
real 0m1.377s
user 0m0.000s
sys 0m0.012s

time curl -XPOST 'http://localhost:9200/a/a' -d '{}'
{"_index":"a","_type":"a","_id":"AVIWY3tRS4y0FaHRm6Ay","_version":1,"created":true}
real 0m0.019s
user 0m0.007s
sys 0m0.004s

time curl -XPOST 'http://localhost:9200/a/a' -d '{}'
{"_index":"a","_type":"a","_id":"AVIWY38tS4y0FaHRm6Az","_version":1,"created":true}
real 0m0.027s
user 0m0.004s
sys 0m0.007s

time curl -XPOST 'http://localhost:9200/a/a' -d '{}'
{"_index":"a","_type":"a","_id":"AVIWY4KGS4y0FaHRm6A0","_version":1,"created":true}
real 0m0.021s
user 0m0.006s
sys 0m0.005s

tar -xvf elasticsearch-2.1.1.tar.gz
cd elasticsearch-2.1.1/
./bin/elasticsearch

time curl -XPOST 'http://localhost:9200/a/a' -d '{}'
{"_index":"a","_type":"a","_id":"AVIWZBOvQObEkTWKsmWJ","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"created":true}
real 0m1.226s
user 0m0.012s
sys 0m0.000s

time curl -XPOST 'http://localhost:9200/a/a' -d '{}'
{"_index":"a","_type":"a","_id":"AVIWZBtZQObEkTWKsmWK","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"created":true}
real 0m0.090s
user 0m0.006s
sys 0m0.006s

time curl -XPOST 'http://localhost:9200/a/a' -d '{}'
{"_index":"a","_type":"a","_id":"AVIWZB7uQObEkTWKsmWL","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"created":true}
real 0m0.098s
user 0m0.009s
sys 0m0.000s

time curl -XPOST 'http://localhost:9200/a/a' -d '{}'
{"_index":"a","_type":"a","_id":"AVIWZCK2QObEkTWKsmWM","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"created":true}
real 0m0.095s
user 0m0.011s
sys 0m0.000s

For our production setup the performance degradation is even more dramatic - 1.x ~2000 index requests / sec vs 2.x ~30 index requests / sec on the HDD machine, but first we would like help understanding the poor performance of the vanilla setup.

jprante · January 6, 2016, 12:46pm

Elasticsearch 2.x has changed the behavior of translog write sync. By default it now fsyncs after every operation.

See https://github.com/elastic/elasticsearch/issues/14399#issuecomment-152744759

The documentation still says that it's 5 seconds, but that is no longer the case.

The problem with that decision is that it puts a high burden of extra I/O on spindle disks, where the journaling file system may not be able to keep up with the extra writes. You can either re-inspect your file system mount options to tune the journal write operations, or reset the translog sync interval to the old setting of 5 seconds to remedy the situation.

zamblauskas · January 6, 2016, 2:01pm

Setting index.translog.durability: async did solve our performance issue.
Thank you for your help.
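
For anyone landing here later, a minimal sketch of how that setting might be applied to an existing index through the index settings API. The function names `translog_settings_body` and `set_translog_durability` are hypothetical helpers made up for this example, and the index name `a` and `http://localhost:9200` are simply the values used in the test above. Whether the setting can be changed on a live index in your version is worth checking against the translog documentation.

```shell
#!/bin/sh
# Hypothetical helper (not part of Elasticsearch): prints a settings body
# that selects the translog durability mode, "request" or "async".
translog_settings_body() {
  printf '{"index":{"translog":{"durability":"%s"}}}\n' "$1"
}

# Hypothetical helper: sends the settings body to one index.
# $1 = index name, $2 = durability mode,
# $3 = base URL (assumed default: the local node used in this thread).
set_translog_durability() {
  curl -s -XPUT "${3:-http://localhost:9200}/$1/_settings" \
    -d "$(translog_settings_body "$2")"
}
```

Calling `set_translog_durability a async` would switch index `a` to async durability, and `set_translog_durability a request` would switch it back to syncing on every request.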

bleskes · January 6, 2016, 8:03pm

Double checking - are you using the bulk API to index? If so, we only sync after every bulk request (if needed, i.e., if not already synced by another bulk). If not, moving to the bulk API will help greatly with indexing speed.
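
To illustrate the difference, here is a rough sketch of sending several documents in one bulk request instead of one request each, so the translog sync is paid once per batch. `bulk_body` and `send_bulk` are hypothetical helper names, the empty `{}` documents and the `a/a` index and type mirror the test earlier in the thread, and the local URL is assumed from the same test.

```shell
#!/bin/sh
# Hypothetical helper: prints a bulk body that indexes $1 empty documents.
# Each document takes two lines: an action line, then the document source.
bulk_body() {
  i=0
  while [ "$i" -lt "$1" ]; do
    printf '{"index":{}}\n{}\n'
    i=$((i + 1))
  done
}

# Hypothetical helper: posts the body to the bulk endpoint of index "a",
# type "a". --data-binary keeps the newlines that the bulk format needs.
send_bulk() {
  bulk_body "$1" | curl -s -XPOST 'http://localhost:9200/a/a/_bulk' --data-binary @-
}
```

Running `time send_bulk 100` against the same node would give a number to compare with 100 separate `curl -XPOST` calls.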

zamblauskas · January 7, 2016, 9:56am

No, we are not using the bulk API. We are satisfied with the performance (with durability: async), and using a single request per index operation helps keep the code a little bit simpler.
But thanks for the tip, once our usage increases we'll look into switching to bulk operations.

dadoonet · January 7, 2016, 12:54pm

If you are looking for efficiency, I'd really recommend using bulk. It's really much faster.

Bonus: you probably won't need to change durability settings, which is risky IMO as you may lose data if unlucky...
