Does it really matter what value we set for refresh_interval on a read-only index? I'm re-indexing a huge index that contains legacy data and is never updated, i.e. it is read-only (RO).
Before re-indexing, in order to boost performance, I'm setting "refresh_interval": -1. Once re-indexing is over, it is necessary for me to reset the refresh_interval back to, say, "30s" or "60s".
I'm wondering whether skipping that reset to 30s or 60s will make any difference, since this index isn't going to get any data after re-indexing.
Yes, in theory you need to refresh it at least once when you're done, unless something else did it for you. That's why @dadoonet says you can leave it disabled but need to refresh it manually afterwards.
For the benefit of future readers, in 7.0 and later the right thing to do in this case is to leave index.refresh_interval unset, thanks to the new index.search.idle.after setting. The default behaviour is now not to refresh indices that are not being searched, even if you are indexing into them.
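If the index already carries an explicit refresh_interval (for example -1 from a bulk load), "leaving it unset" means removing that value again. A minimal sketch of the request body, assuming the usual update-settings convention that a null value reverts a setting to its default; the helper name is made up for illustration and nothing is sent to a cluster here:

```python
import json


def unset_refresh_interval_body():
    """Body for PUT /<index>/_settings that removes an explicit refresh_interval.

    Assumption: a JSON null reverts the setting to its default, so the index
    is treated as if index.refresh_interval had never been set.
    """
    return json.dumps({"index": {"refresh_interval": None}})


if __name__ == "__main__":
    print(unset_refresh_interval_body())
```

You would send this as the body of PUT /index/_settings with whatever HTTP client you normally use.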
Way back in 5.5.1, however, I think the best thing to do is as David suggests: leave it at -1 and explicitly call POST /index/_refresh when you have finished indexing. I think you will also see no ill effects if you set it to a time like 30s or 60s (or even back to the default of 1s) since Elasticsearch will skip refreshes on indices that have not changed since the last refresh.
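To make the order of operations concrete, here is a rough sketch of the requests involved: disable refresh, re-index, then refresh once by hand. It only builds the method, path and body for each call and does not talk to a cluster. The index name "legacy-data" and the helper names are hypothetical, and the re-indexing step itself is left out.

```python
import json


def refresh_interval_request(index, interval):
    """Build a PUT /<index>/_settings call that sets index.refresh_interval."""
    body = json.dumps({"index": {"refresh_interval": interval}})
    return ("PUT", f"/{index}/_settings", body)


def refresh_request(index):
    """Build the explicit POST /<index>/_refresh call (no request body)."""
    return ("POST", f"/{index}/_refresh", None)


def bulk_load_plan(index):
    """Requests to send around a re-index into a read-only index, in order."""
    return [
        # 1. Turn periodic refreshes off before the heavy indexing starts.
        refresh_interval_request(index, "-1"),
        # 2. (Run the re-index here; not shown.)
        # 3. Refresh once so the newly indexed documents become searchable.
        refresh_request(index),
    ]


if __name__ == "__main__":
    # "legacy-data" is a placeholder index name.
    for method, path, body in bulk_load_plan("legacy-data"):
        print(method, path, body or "")
```

Since the index receives no writes afterwards, that single manual refresh should be enough, and refresh_interval can stay at -1.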