I get the following query time pattern, regardless of queries used:
1st search: ~2s
2nd search: ~2s
3rd search: < 1s
4th search: < 1s
...
Xth search: < 1s
My questions are:
Why does it speed up on the third query? (I suspect this is from some kind of OS- or ES-level caching in the background.)
Is there some way I can trigger this caching preemptively?
This question is just out of curiosity, I am perfectly happy with the current performance.
Which question are you answering? Can you please elaborate? Thank you in advance. Also, it seems it's not specifically the third query: I retried the setup and this time it sped up on the fourth query.
Sorry, question 1. If your client is balancing requests across replicas, any caching on your primary shard will not benefit a follow-up query routed to a replica.
To send each user back to the same replica each time, and so increase the chance of hitting a warm cache, use their session ID as the `preference` parameter on the search request.
Looking at the docs there, it seems randomization, not round-robining, is the default replica selection policy, which may explain some of the inconsistency you're seeing.
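As a minimal sketch of the suggestion above: the `preference` query parameter is a real Elasticsearch search option, but the base URL, index name, and session ID below are hypothetical placeholders. Any stable per-user string pins that user's searches to the same set of shard copies.

```python
def search_url(base: str, index: str, session_id: str) -> str:
    """Build a search URL that pins the request to one set of shard copies.

    Sending the same `preference` value with every request makes
    Elasticsearch route that user's searches to the same replicas,
    so repeat searches are more likely to hit a warm cache.
    """
    return f"{base}/{index}/_search?preference={session_id}"

# Hypothetical cluster address, index name, and session ID:
url = search_url("http://localhost:9200", "my-index", "sess-1")
```

The same effect is available in the official clients by passing `preference` as a search parameter instead of building the URL by hand.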
If I understand correctly, the first search will always be slower than the others? The second search is not guaranteed to hit the same node as the first, and searches improve by the third and all subsequent tries because I have three nodes (which would mean the chance of hitting a warm cache is 83% by the third try and 100% by the fourth)?
That makes a lot of sense, thank you. For question 2 then, is there a way to 'warm up' the caches before the user searches?
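One common approach to question 2 is to fire a few representative queries once at startup, so the OS page cache and node-level caches are populated before real users search. The sketch below assumes hypothetical warm-up query bodies and takes a pluggable `search` callable (e.g. a lambda wrapping your client's search call) rather than a live cluster; running several rounds helps because random replica selection means one pass may not touch every shard copy.

```python
# Hypothetical warm-up queries; replace with queries typical of your users.
WARMUP_QUERIES = [
    {"query": {"match_all": {}}},
    {"query": {"match": {"title": "common term"}}},
]

def warm_up(search, queries=WARMUP_QUERIES, rounds=3):
    """Run each warm-up query `rounds` times; return the request count.

    `search` is any callable that sends a query body to the cluster,
    e.g. lambda body: client.search(index="my-index", body=body).
    Multiple rounds raise the odds that every replica gets warmed,
    since the default replica selection is randomized.
    """
    issued = 0
    for _ in range(rounds):
        for query in queries:
            search(query)
            issued += 1
    return issued
```

For example, `warm_up(lambda body: client.search(index="my-index", body=body))` could be called from your application's startup hook.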