You have a high volume logging use case and have followed these recommendations:
What other knobs can you turn to improve indexing performance?
- Set index.refresh_interval to 30s to 60s if near real time search is not a requirement. By default, this is set to 1s.
- Increase indices.memory.index_buffer_size. It defaults to 10% of the JVM heap allocated to the node, and that indexing buffer is shared across all active shards on the node.
- Disable _field_names if you do not use the exists query.
You can find more details here.
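To get a feel for the numbers behind indices.memory.index_buffer_size, here is a rough back-of-the-envelope sketch in Python. The heap size and active shard count are made-up values, the helper names are hypothetical, and the even split across shards is a simplifying assumption for illustration only; Elasticsearch does not necessarily divide the buffer evenly.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def index_buffer_bytes(heap_bytes: int, buffer_percent: float = 10.0) -> int:
    """Size of the node-wide indexing buffer for a given heap size."""
    return int(heap_bytes * buffer_percent / 100)


def average_share_bytes(buffer_bytes: int, active_shards: int) -> int:
    """Rough average buffer per active shard (assumes an even split)."""
    if active_shards <= 0:
        raise ValueError("active_shards must be positive")
    return buffer_bytes // active_shards


if __name__ == "__main__":
    heap = 8 * GIB   # hypothetical node heap
    shards = 20      # hypothetical number of actively indexing shards
    for percent in (10.0, 20.0):
        buf = index_buffer_bytes(heap, percent)
        share = average_share_bytes(buf, shards)
        print(f"{percent:.0f}% of heap: buffer {buf / MIB:.0f} MiB, "
              f"~{share / MIB:.0f} MiB per active shard")
```

The more shards that are actively indexing on a node, the thinner the same buffer is spread, which is why raising the percentage mostly helps on nodes with many active shards.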
If your use case can tolerate an increased risk of data loss in the event of hardware failure, these options will push write throughput even further.
- Increase index.translog.flush_threshold_size. Once the translog reaches the specified size, a flush takes place. It defaults to 512mb.
- Set index.translog.durability to async. This setting is risky because all acknowledged writes since the last commit will be discarded if a hardware failure occurs. Depending on the use case, this may be worth considering.
All except indices.memory.index_buffer_size are index-level settings and can be changed dynamically. You can also add them to an index template to make them the defaults for all indices matching the template's index_patterns. For example:
PUT _template/logs
{
  "order": 0,
  "index_patterns": "logs-*",
  "settings": {
    "refresh_interval": "30s",
    "number_of_shards": "3",
    "translog": {
      "flush_threshold_size": "1gb",
      "durability": "async"
    },
    "unassigned": {
      "node_left": {
        "delayed_timeout": "5m"
      }
    },
    "query": {
      "default_field": "message"
    },
    "number_of_replicas": "1"
  },
  "mappings": {
    "doc": {
      "_field_names": {
        "enabled": false
      }
    }
  }
}
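Since the index-level settings are dynamic, they can also be applied to an index that already exists through the update index settings API (PUT <index>/_settings) rather than through a template. As a minimal sketch, the Python below only builds and prints such a request body; the index name logs-2018.06.01 and the helper build_settings_update are hypothetical, and sending the request is left to whichever client you use.

```python
import json


def build_settings_update(refresh_interval: str = "30s",
                          tolerate_data_loss: bool = False) -> dict:
    """Build a body for PUT <index>/_settings from the settings discussed above."""
    index_settings = {"refresh_interval": refresh_interval}
    if tolerate_data_loss:
        # Riskier options: use only if losing recent acknowledged writes is acceptable.
        index_settings["translog"] = {
            "flush_threshold_size": "1gb",
            "durability": "async",
        }
    return {"index": index_settings}


if __name__ == "__main__":
    index_name = "logs-2018.06.01"  # hypothetical index name
    body = build_settings_update(tolerate_data_loss=True)
    print(f"PUT {index_name}/_settings")
    print(json.dumps(body, indent=2))
```

Keeping the risky translog options behind an explicit flag mirrors the split in the text: the first group of settings trades search freshness for throughput, while the second trades durability.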
And what does the future hold for better indexing performance? In the upcoming Elasticsearch version 7, we are working towards an intelligent refresh: we will skip the refresh on a shard that has not been searched for a period of time (30s by default), and perform the refresh at the next scheduled interval if a search request arrives for the shard.
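The idea can be pictured with a toy model. The Python below is not Elasticsearch code: the function and parameter names are invented for illustration, and it encodes only the rule described above, namely that a scheduled refresh is skipped when the shard has not been searched within the idle period. Treating a never-searched shard as idle, and treating exactly 30s as idle, are assumptions of this sketch.

```python
from typing import Optional


def should_refresh(now: float, last_search_at: Optional[float],
                   idle_after: float = 30.0) -> bool:
    """Return True if a scheduled refresh should run on this shard.

    now and last_search_at are timestamps in seconds; last_search_at is
    None if the shard has never been searched.
    """
    if last_search_at is None:
        return False  # assumption: a never-searched shard counts as idle
    return (now - last_search_at) < idle_after


if __name__ == "__main__":
    # Shard searched 5s ago: refresh as usual.
    print(should_refresh(now=100.0, last_search_at=95.0))   # True
    # Shard last searched 60s ago: skip this scheduled refresh.
    print(should_refresh(now=100.0, last_search_at=40.0))   # False
```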