Migration from ES 1.5.4 to ES 6.3: HDD performance issue


(Denis Shilov) #1

Hi.

We run our db on ES 1.5.4 and decided to move it to 6.3.

A few words about the environment:
We use the same SATA HDDs everywhere; for each ES version we run a cluster of 2 servers with 32 GB RAM each.
We tried different RAID settings, and every field in ES 6.3 has its own explicit mapping (long/short/keyword, nothing else). We followed all the best-practice advice, but the problem still exists.
We also tried a single server running 1 node of ES 6.3.

Finally, about our issue:
We started writing to both databases in parallel and got high disk load on ES 6.3.
While on 1.5.4 we see ~800 write ops per 10 minutes, on 6.3 we get ~40,000 write ops per 10 minutes on the same amount of data.
It's really confusing.

For ES 1.5.4 we use Elastica lib 3.1; for ES 6.3 we use Elastica lib 6.0.1.

Settings 1.5.4:

 "settings" : {
   "index" : {
     "number_of_replicas" : "1",
     "number_of_shards" : "5",
     "refresh_interval" : "5s"
   }
 }

Settings 6.3:

"settings" : {
  "index" : {
    "refresh_interval" : "5s",
    "number_of_shards" : "5",
    "provided_name" : "male_v2",
    "merge" : {
      "scheduler" : {
        "max_thread_count" : "1"
      }
    },
    "number_of_replicas" : "1"
  }
}

(Christian Dahlqvist) #2

Are you using bulk requests? If so, what bulk size are you using? Are you just indexing new data or are you also updating existing documents? If you are updating, how frequently are you updating individual documents?
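For reference, a bulk request batches many index/update operations into a single HTTP call to the `_bulk` endpoint, which dramatically reduces per-request overhead (including translog syncs). Here is a minimal sketch of how such a request body could be built; the index name `male_v2` is taken from the settings above, and the documents and ids are purely illustrative:

```python
import json

def build_bulk_body(index, docs):
    """Build an NDJSON body for the Elasticsearch _bulk API.

    Each document becomes two lines: an action line and a source line.
    ES 6.x still requires a _type on the action line; "_doc" is the
    conventional choice.
    """
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": "_doc", "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The _bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"

# Illustrative documents; POST this body to http://host:9200/_bulk
# with Content-Type: application/x-ndjson.
body = build_bulk_body("male_v2", [("1", {"age": 42}), ("2", {"age": 7})])
print(body, end="")
```

A batch of a few hundred to a few thousand documents per request is a common starting point, tuned from there by measuring.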


(Denis Shilov) #4

Hi, Christian

We don't use bulk requests with either version, by design of our system; we need changes to be visible as soon as possible.

We both index new documents and update existing ones. On ES 1.5.4 we almost always update docs: about 44 million updates per day plus a few thousand inserts. On 6.3 the share of updates is about 70% and growing, because we are still filling the db.

Both databases process the same number of docs on writes. On 6.3 there are no read ops; on 1.5.4, about 3000 per second.

Best regards.


(Christian Dahlqvist) #5

One change between Elasticsearch 1.x and 6.x that will have an impact here is that the translog is now synced to disk after every request by default, rather than periodically. This matters especially since you are not using bulk requests. The change was made to improve resilience and prevent data loss, but it leads to increased I/O.
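If per-request durability is not strictly required, the old periodic behaviour can be restored with the translog durability settings; a sketch of the relevant index settings fragment (the 5s interval shown is the default, and accepting it means up to 5 seconds of acknowledged writes can be lost on a crash):

 "settings" : {
   "index" : {
     "translog" : {
       "durability" : "async",
       "sync_interval" : "5s"
     }
   }
 }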

If you update a document that has not yet been made visible in a segment through a refresh, a refresh will be performed first. This can result in a lot of small segments and disk I/O if you update the same document frequently.
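One common mitigation while backfilling, if near-real-time reads are not needed, is to lengthen the refresh interval (or disable refresh entirely with -1) and restore it once the load is done; an illustrative fragment, with the 30s value chosen arbitrarily:

 "settings" : {
   "index" : {
     "refresh_interval" : "30s"
   }
 }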


(Denis Shilov) #6

That helps, thanks a lot.


(system) #7

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.