ES 2.1.1 poor indexing performance compared to 1.7.3

(Zamblauskas) #1

Trying to understand the reason why 2.x ES has such poor indexing performance compared to the older 1.x.
Vanilla ES install, simple index request:

  • HDD machine: 1.7.3 ~20ms, 2.1.1 ~100ms.
  • SSD machine: 1.7.3 ~20ms, 2.1.1 ~25ms.

It seems to be related to some disk operation.
Are there any configuration options we should consider to keep the performance similar to what it was with 1.x ES?
We are running a single node, using it for real-time application log monitoring (ELK stack), so we are doing mostly index operations.
Any other suggestions would be appreciated.

tar -xvf elasticsearch-1.7.3.tar.gz
cd elasticsearch-1.7.3/
./bin/elasticsearch

time curl -XPOST 'http://localhost:9200/a/a' -d '{}'
{"_index":"a","_type":"a","_id":"AVIWY3QLS4y0FaHRm6Ax","_version":1,"created":true}
real 0m1.377s
user 0m0.000s
sys 0m0.012s

time curl -XPOST 'http://localhost:9200/a/a' -d '{}'
{"_index":"a","_type":"a","_id":"AVIWY3tRS4y0FaHRm6Ay","_version":1,"created":true}
real 0m0.019s
user 0m0.007s
sys 0m0.004s

time curl -XPOST 'http://localhost:9200/a/a' -d '{}'
{"_index":"a","_type":"a","_id":"AVIWY38tS4y0FaHRm6Az","_version":1,"created":true}
real 0m0.027s
user 0m0.004s
sys 0m0.007s

time curl -XPOST 'http://localhost:9200/a/a' -d '{}'
{"_index":"a","_type":"a","_id":"AVIWY4KGS4y0FaHRm6A0","_version":1,"created":true}
real 0m0.021s
user 0m0.006s
sys 0m0.005s

tar -xvf elasticsearch-2.1.1.tar.gz
cd elasticsearch-2.1.1/
./bin/elasticsearch

time curl -XPOST 'http://localhost:9200/a/a' -d '{}'
{"_index":"a","_type":"a","_id":"AVIWZBOvQObEkTWKsmWJ","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"created":true}
real 0m1.226s
user 0m0.012s
sys 0m0.000s

time curl -XPOST 'http://localhost:9200/a/a' -d '{}'
{"_index":"a","_type":"a","_id":"AVIWZBtZQObEkTWKsmWK","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"created":true}
real 0m0.090s
user 0m0.006s
sys 0m0.006s

time curl -XPOST 'http://localhost:9200/a/a' -d '{}'
{"_index":"a","_type":"a","_id":"AVIWZB7uQObEkTWKsmWL","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"created":true}
real 0m0.098s
user 0m0.009s
sys 0m0.000s

time curl -XPOST 'http://localhost:9200/a/a' -d '{}'
{"_index":"a","_type":"a","_id":"AVIWZCK2QObEkTWKsmWM","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"created":true}
real 0m0.095s
user 0m0.011s
sys 0m0.000s

For our production setup the performance degradation is even more dramatic: 1.x handles ~2000 index requests / sec vs. ~30 index requests / sec for 2.x on the HDD machine. But first we would like help understanding the poor performance of the vanilla setup.

(Jörg Prante) #2

Elasticsearch 2.x has changed the behavior of the translog write sync. It now fsyncs after every operation. The documentation still says that the sync happens every 5 seconds, but that is no longer the case.

The problem with that decision is that it puts a heavy burden of extra I/O on spinning disks, where the journaling file system may not be able to keep up with the extra writes. You can either review your file system mount options to tune the journal write operations, or reset the translog sync interval to the old setting of 5 seconds to remedy the situation.

(Zamblauskas) #3

index.translog.durability: async did solve our performance issue.
Thank you for your help.
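
For anyone landing here later, a minimal sketch of how this setting might be applied per index through the index settings API. The helper name set_translog_async is made up for illustration, and the URL and index name "a" are just the ones from the test above; the same key can also go into elasticsearch.yml as a node-wide default.

```shell
# Hypothetical helper (not part of Elasticsearch): switches one index to
# asynchronous translog fsync via the update index settings API.
# $1 = base URL of the node, $2 = index name.
set_translog_async() {
  base_url=$1
  index=$2
  # With "async" the translog is fsynced in the background on an interval
  # instead of before each request is acknowledged.
  curl -s -XPUT "$base_url/$index/_settings" \
    -d '{"index.translog.durability": "async"}'
}
```

Usage, assuming a node on localhost: set_translog_async http://localhost:9200 a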

(Boaz Leskes) #4

Double checking: are you using the bulk API to index? If so, we only sync once after every bulk request (if needed, i.e., if not already synced by another bulk). If not, moving to the bulk API will help greatly with indexing speed.
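
To make the bulk suggestion concrete, here is a rough sketch of batching several documents into one _bulk request, so that the translog sync cost is paid once per batch rather than once per document. The function names bulk_body and send_bulk are invented for this example, and it assumes one JSON document per input line.

```shell
# Hypothetical helper: reads one JSON document per line on stdin and prints
# a bulk request body (an action line followed by the document line).
# $1 = index name, $2 = type name.
bulk_body() {
  index=$1
  type=$2
  while IFS= read -r doc; do
    printf '{"index":{"_index":"%s","_type":"%s"}}\n' "$index" "$type"
    printf '%s\n' "$doc"
  done
}

# Hypothetical helper: sends the documents on stdin as a single bulk request.
# $1 = base URL, $2 = index name, $3 = type name.
send_bulk() {
  # --data-binary keeps the newlines that the bulk format depends on.
  bulk_body "$2" "$3" | curl -s -XPOST "$1/_bulk" --data-binary @-
}
```

Usage, assuming a node on localhost: printf '{}\n{}\n' | send_bulk http://localhost:9200 a a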

(Zamblauskas) #5

No, we are not using the bulk API. We are satisfied with the performance (with durability: async), and using a single request per index operation helps keep the code a little bit simpler.
But thanks for the tip, once our usage increases we'll look into switching to bulk operations.

(David Pilato) #6

If you are looking for efficiency, I'd really recommend using bulk. It's much, much faster.

Bonus: you probably won't need to change the durability settings, which is risky IMO as you may lose data if unlucky...
