I want all index data files to be locked into RAM so queries execute fast regardless of HDD usage.
I added
index.store.preload: ["*"]
bootstrap.memory_lock: true
to elasticsearch.yml.
But when I execute queries, I still see spikes of HDD IOPS during requests (the index is in a read-only state), and queries are slow the first time they run. Running the same query a second time is fast, because the data was recently read from disk.
curl -X GET "localhost:9200/_nodes?filter_path=**.mlockall"
reports that "mlockall":true
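For reference, the same check can be scripted. The following is a minimal sketch, assuming the response has the usual `nodes -> <node id> -> process -> mlockall` shape that this `filter_path` returns; the function names `mlockall_by_node` and `fetch_mlockall` are made up for illustration and are not part of any Elasticsearch client.

```python
import json
import urllib.request


def mlockall_by_node(payload: dict) -> dict:
    """Map each node id to its reported mlockall flag.

    Assumes the payload shape {"nodes": {"<id>": {"process": {"mlockall": bool}}}}.
    Nodes that do not report the flag are treated as False.
    """
    return {
        node_id: bool(info.get("process", {}).get("mlockall", False))
        for node_id, info in payload.get("nodes", {}).items()
    }


def fetch_mlockall(base_url: str = "http://localhost:9200") -> dict:
    """Query the nodes info API (same URL as the curl call) and parse it."""
    url = base_url + "/_nodes?filter_path=**.mlockall"
    with urllib.request.urlopen(url) as resp:
        return mlockall_by_node(json.load(resp))
```

Calling `fetch_mlockall()` against a running node should give one entry per node. Note that a `True` here only confirms the memory lock request succeeded; it says nothing about whether index files are resident in the page cache.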
What's wrong? Is it possible to set up Elasticsearch so that all data files are always in RAM?
Elasticsearch (i.e. Lucene) relies on the OS page cache to keep a copy of the on-disk index data in RAM if possible. It sounds like your operating system is paging that data out because it needs the RAM for other purposes.
mlockall is about swapping, which is basically the opposite process.
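To illustrate the page cache behaviour described above (the second run of a query is fast because the data was just read), here is a rough sketch that reads every file under a directory once so the OS has a chance to cache it. This is an assumption-laden workaround, not an Elasticsearch feature: `warm_page_cache` is a hypothetical helper, the directory you pass is whatever your node's data path happens to be, and nothing here pins the pages, so the OS is still free to evict them later.

```python
import os


def warm_page_cache(directory: str, chunk_size: int = 1 << 20) -> int:
    """Sequentially read every regular file under `directory`.

    Reading a file normally pulls its pages into the OS page cache.
    Returns the total number of bytes read. Does NOT lock anything in RAM.
    """
    total = 0
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            if not os.path.isfile(path):
                continue  # skip broken symlinks, sockets, etc.
            with open(path, "rb") as fh:
                while True:
                    chunk = fh.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
    return total
```

Running something like this periodically may reduce first-query latency when the index comfortably fits in free RAM, but it competes with everything else on the machine for cache space.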
The short answer is "no", sorry. You are correct that mlockall can lock mmapped pages (and what I said was incorrect) but I think this would be quite bad if your indices exceeded your physical memory. There's a detailed explanation of the situation here:
I see. That's sad, because sometimes you have more RAM than index data files, and search performance sucks just because the VM paged out some data and a search leads to excessive slow reads...
Following your logic, the "rm" command should not exist either, because one could run "rm -rf /" and be in trouble.