If I reboot a node it takes a long time to get its shards initialized and come back up.
I now know I should disable shard allocation before rebooting, which would solve that problem, but what if I just need to restart the service?
Say, for example, I just want to tweak the refresh interval. If I change that setting and restart the Elasticsearch service (without disabling allocation or anything), will the node come up fast enough to rejoin the cluster immediately?
That is a dynamic setting on each index, so you don't need a restart to change it. You can change it using the index settings API. For the default refresh interval I'd make an index template so that new indices get the refresh interval you want. Or look at whatever tool you have creating the index and make it specify the refresh interval you want.
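To make that concrete, here is a rough sketch in Python that only builds the two requests and prints them; it does not talk to a cluster. The index name `logs-2017.05.01`, the pattern `logs-*`, the template name `logs_refresh`, and the `30s` value are all made up for illustration. The template body uses the `template` key from the legacy template API of the 5.x era; I believe later versions use `index_patterns` instead, so check the docs for your version.

```python
import json


def refresh_interval_update(index, interval):
    """Build a dynamic settings update for one existing index (no restart needed)."""
    body = {"index": {"refresh_interval": interval}}
    return "PUT", f"/{index}/_settings", body


def refresh_interval_template(name, pattern, interval):
    """Build a legacy index template so newly created matching indices get the interval.

    Assumption: 5.x-style body with a "template" key holding the index pattern.
    """
    body = {
        "template": pattern,
        "settings": {"index": {"refresh_interval": interval}},
    }
    return "PUT", f"/_template/{name}", body


# Hypothetical names and values, for illustration only.
requests_to_send = [
    refresh_interval_update("logs-2017.05.01", "30s"),
    refresh_interval_template("logs_refresh", "logs-*", "30s"),
]

for method, path, body in requests_to_send:
    print(method, path, json.dumps(body))
```

You would send each printed request with curl or whatever HTTP client you already use against the cluster.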
At this point the only thing we have to speed up restarts is synced flush and that only works if the index isn't being changed. That is useful but doesn't apply in all situations. We're working to make restarts faster even for indices being written to but that work isn't going to be ready until at least 6.0.
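For the cases where you really do have to restart, the usual order of operations looks roughly like the sketch below. Again it only builds and prints the requests. The ordering (disable allocation, synced flush, restart, re-enable allocation) is my summary of the commonly documented rolling restart procedure, not something specific to your cluster, and the synced flush endpoint only exists in versions that still ship that feature.

```python
import json


def allocation_setting(value):
    """Build a transient cluster settings update for shard allocation."""
    body = {"transient": {"cluster.routing.allocation.enable": value}}
    return "PUT", "/_cluster/settings", body


def synced_flush():
    """Build a synced flush request for all indices (no request body)."""
    return "POST", "/_flush/synced", None


def restart_plan():
    """Return the requests to send before and after restarting the service."""
    before = [allocation_setting("none"), synced_flush()]
    after = [allocation_setting("all")]
    return before, after


before, after = restart_plan()
for method, path, body in before:
    print(method, path, json.dumps(body) if body is not None else "")
print("# restart the Elasticsearch service here and wait for the node to rejoin")
for method, path, body in after:
    print(method, path, json.dumps(body))
```

Synced flush only helps for shards that receive no writes between the flush and the node coming back, which is why it does not cover indices that are being written to.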
I don't think that is likely to help, really. You are sending docs way faster than you can sneak in a restart.
There is another, more graceful but slower, option. You can use the allocation filtering feature to move the shards that you are actively writing to off of the node that you are going to restart. I did many upgrades that way when I maintained a cluster. We didn't have synced flush and I couldn't pause writes, so it was the way that made the upgrades least impactful. It is slow. At least, it is much slower than synced flush.
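A minimal sketch of that approach, assuming index-level allocation filtering by node name. The index name `logs-2017.05.01` and node name `node-3` are hypothetical, and clearing the filter by setting it to `null` is an assumption about your version; older releases may want an empty string instead.

```python
import json

FILTER_KEY = "index.routing.allocation.exclude._name"


def exclude_node(index, node_name):
    """Build a settings update that moves this index's shards off one node."""
    return "PUT", f"/{index}/_settings", {FILTER_KEY: node_name}


def clear_exclusion(index):
    """Build a settings update that removes the exclusion after the restart.

    Assumption: a null value resets the setting.
    """
    return "PUT", f"/{index}/_settings", {FILTER_KEY: None}


# Hypothetical index and node names, for illustration only.
steps = [
    exclude_node("logs-2017.05.01", "node-3"),
    clear_exclusion("logs-2017.05.01"),
]

for method, path, body in steps:
    print(method, path, json.dumps(body))
```

Between the two requests you wait for the shards to finish relocating, restart the node, and only then clear the exclusion. The waiting is what makes this slower than synced flush.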
Beyond that, you wait for sequence-number-based replication to come in the 6.x branch. That should make the process of nodes catching up on documents that they missed much, much faster.