Better control over the request cache

We have a certain set of boolean-filtered term queries with aggregations (using "size": 0) that are slow to run (5-10 seconds over 300M documents in 60 shards) yet are crucial to our app. Some of these queries match all documents in the index if the user selects a wide enough range. I'm giving plenty of memory to the shard request cache and node query cache. In addition, we refresh our indexes infrequently and are willing to pay at index time for fast queries.
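For concreteness, a sketch of the query shape being described (index, field names, and values are hypothetical): a `bool` filter combined with a terms aggregation, with `"size": 0` so only aggregation results come back.

```json
POST /my-index/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "term":  { "status": "published" } },
        { "range": { "created_at": { "gte": "2016-01-01", "lte": "2016-12-31" } } }
      ]
    }
  },
  "aggs": {
    "docs_per_category": {
      "terms": { "field": "category" }
    }
  }
}
```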

Currently Elasticsearch seems to decide on its own when to cache a certain result. I've had to rerun each query 4-5 times in a row before it is cached. Since my set of crucial queries could be up to a few thousand distinct queries with all filter/aggregation permutations, it's unlikely that they will all be used frequently enough to be cached, and it's not a reasonable workaround to rig up a system that occasionally reruns them to ensure they stay cached.

What I'd like to be able to do is have more control over how these queries are cached: cache them the first time they are issued, and ensure they will never be evicted unless the index is refreshed.

Without better control over the caching, the only solution I've seen online is the undocumented setting `index.queries.cache.everything: true`, which I'd rather not rely on (and I don't know whether it even still works).
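Purely for illustration, an index-level setting like that would presumably be applied through the settings API as below; since the setting is undocumented, it may be ignored or rejected by current versions, so this is not a recommendation.

```json
PUT /my-index/_settings
{
  "index.queries.cache.everything": true
}
```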

Are there any other levers I can use to control caching?

Hi @Jared_Miller,

you can enable caching per request, but since you state:

> my set of crucial queries could be up to a few thousand distinct queries

I fear this setting will not be very effective. You can adjust the cache settings and increase the cache size, which might help, but OTOH you cannot define an eviction policy. You should also be aware that the exact query you issue is used as the cache key.
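The per-request flag mentioned above is the `request_cache` URL parameter on the search API, and the node-level cache size can be raised with `indices.requests.cache.size` in `elasticsearch.yml` (the default is 1% of the heap). A minimal sketch, with a placeholder index name:

```json
GET /my-index/_search?request_cache=true
{
  "size": 0,
  "aggs": {
    "docs_per_category": {
      "terms": { "field": "category" }
    }
  }
}
```

And in `elasticsearch.yml`, for example: `indices.requests.cache.size: 5%`.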


Thanks for the reply. Does caching per request cause the query to be cached immediately (i.e., Elasticsearch doesn't wait to see N of the same query over the past M queries in the history)? That would still be very useful if used with a large cache size. The docs don't specify.

Hi @Jared_Miller,

the query is cached after the first invocation. You can try it yourself: issue one of the queries (e.g. with curl) and measure the response time. After the first execution, the response should be much faster.

However, there are a few additional conditions that can prevent a request from being cached even if you specify the `request_cache` flag (e.g., if you use `now()` in your queries). If you're able to read a bit of Java, you can look at the source code of the relevant method on GitHub.
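As an example of working around that condition: an unrounded `now` in date math defeats the cache because every invocation produces a different cache key, whereas rounding the dates (here to the nearest minute, on a hypothetical `created_at` field) keeps the request identical across invocations and therefore cacheable:

```json
GET /my-index/_search?request_cache=true
{
  "size": 0,
  "query": {
    "range": {
      "created_at": {
        "gte": "now-1h/m",
        "lte": "now/m"
      }
    }
  }
}
```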