Best Translog configuration for ES 2.2.0

rusty · March 3, 2016, 10:19am

We have to index and search ~15-17B small documents a day. Cluster consist of 24 data nodes (12 servers with 2 ES instances each = 24 data nodes). Each shard is around 30GB on disk (48 shards) and around 55GB (24 shards). Smaller shard number worse in terms or rebalance time and awful for recovery time.

As we have a problem with bulk queue (link) there was assumption is that index writer is single threaded for a shard and shard number should be closer to number of concurent bulk indexing threads. Another assumption was what we hit into some lucene limit in document count per shard so indexing latency increases.

As for refresh interval and other index settings:

"settings": {
"index": {
"codec": "best_compression",
"refresh_interval": "30s",
"bloom": { "load": "false" },
"number_of_shards": "48",
"number_of_replicas": "1",
"store": {
"throttle": { "type": "none" }
},
"merge": {
"policy": {
"reclaim_deletes_weight": "0.0"
}
},
"mapper": {
"dynamic": "false"
},
"ttl": {
"disable_purge": "true"
}
}
}

We use 2.1.2 (this guide a little outdated) and try different approaches and settings but always open to all advices.

As for 20K we are still try to fugure out best setting, it was set at 1.7.x and worked well. Now we will try no change it and see if there be any change.

Topic		Replies	Views
Async index.translog.durability Elasticsearch	6	4339	August 18, 2017
Index translog grows past the configured limit Elasticsearch	4	493	July 5, 2017
Translog Settings Elasticsearch	2	905	July 5, 2017
Confuse about 'index.translog.durability' Elasticsearch	7	1539	July 22, 2020
Slow performance compare to v1.7 when using 5.x Elasticsearch	3	619	October 20, 2017

Best Translog configuration for ES 2.2.0

Related topics