Hi,
I haven't seen any recent threads about this and was wondering if there is any way to speed up recovery on a node that was restarted without a flush being done first. I have the settings below, but it is still taking a long time.
I have 16 nodes running on 4 servers (8 CPUs and 16GB RAM per node), with 11 x 1.8TB SAS drives and 128GB RAM on each server. This is a Graylog backend that processes about 50,000 messages/second. Recovery takes hours, which causes my Graylog journal to fill up and backs up my processing. My shards are about 1.4GB each.
Is there a way to prevent the Elasticsearch recovery from slowing down the inserts from Graylog? Would adding more Elasticsearch nodes speed up recovery?
Thanks in advance
"persistent": {
  "cluster": {
    "routing": {
      "allocation": {
        "node_concurrent_recoveries": "8"
      }
    }
  },
  "indices": {
    "recovery": {
      "concurrent_streams": "8",
      "translog_ops": "500000",
      "max_bytes_per_sec": "200mb",
      "file_chunk_size": "1mb"
    }
  }
}
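
For anyone who wants to reproduce these settings, here is a minimal sketch of how they could be applied through the cluster settings API (PUT /_cluster/settings). The host URL and the helper names build_settings_body and make_request are placeholders for illustration, and the sketch only builds the request; it does not send anything unless you uncomment the last line.

```python
import json
import urllib.request

# Flat names for the settings listed in the persistent block above.
RECOVERY_SETTINGS = {
    "cluster.routing.allocation.node_concurrent_recoveries": "8",
    "indices.recovery.concurrent_streams": "8",
    "indices.recovery.translog_ops": "500000",
    "indices.recovery.max_bytes_per_sec": "200mb",
    "indices.recovery.file_chunk_size": "1mb",
}


def build_settings_body(flat):
    """Turn dotted setting names into the nested "persistent" body."""
    root = {}
    for dotted, value in flat.items():
        node = root
        parts = dotted.split(".")
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return {"persistent": root}


def make_request(base_url, body):
    """Build (but do not send) a PUT request for /_cluster/settings."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    body = build_settings_body(RECOVERY_SETTINGS)
    print(json.dumps(body, indent=2))
    # Assumed host; uncomment to actually send the request:
    # urllib.request.urlopen(make_request("http://localhost:9200", body))
```

Building the body from flat names makes it easy to change one value (for example max_bytes_per_sec) and re-apply it while watching how recovery and Graylog indexing respond.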