Trying to understand the filter cache


(Carlos Terron) #1

Hi

I'm trying to understand the filter cache. My base scenario is an index created with this mapping:
{
  "mappings": {
    "logs": {
      "properties": {
        "timestamp": { "type": "date" },
        "host": { "type": "string", "index": "not_analyzed" },
        "message": { "type": "string" }
      }
    }
  }
}
I load the data and I can make queries like this:

{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "range": {
          "timestamp": { "lte": 1452785535, "gte": 1452763935 }
        }
      }
    }
  }
}

A simple range filter between two timestamps. I have some questions about this kind of query:

  • Is the filter cached? I think not. Also, I have read in All about caching that for this type of query it is better to disable caching.
  • Is the fielddata loaded by these queries cached in RAM, and can it be reused by a similar query with different timestamps? For example, if I make a query for the last hour from now and then another for the last two hours, is the fielddata for the first hour already cached by the previous query?

(Adrien Grand) #2

Hi Carlos,

This query will not load fielddata into RAM; it can work directly on the inverted index.

Regarding caching, the match_all will never get cached (it is already super fast); however, the range filter will get cached if it is reused across queries.
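
To illustrate what "reused across queries" could look like on the client side, here is a minimal Python sketch. It assumes that a cached filter is only reused when a later query sends exactly the same range bounds, which this thread does not confirm. The helper names (round_down, round_up, build_range_query) and the 60-unit step are hypothetical, chosen only for the example. The idea is to snap the bounds to a coarser granularity, so that queries issued a few seconds apart produce an identical filter.

```python
def round_down(ts, step):
    """Snap a timestamp down to the nearest multiple of `step`."""
    return ts - ts % step


def round_up(ts, step):
    """Snap a timestamp up to the nearest multiple of `step`."""
    return -(-ts // step) * step


def build_range_query(gte, lte, step=60):
    """Build the filtered query from the first post with snapped bounds.

    `step` is a hypothetical granularity, in the same unit as the
    timestamps. Snapping outward widens the window by at most `step`
    on each side.
    """
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "range": {
                        "timestamp": {
                            "gte": round_down(gte, step),
                            "lte": round_up(lte, step),
                        }
                    }
                },
            }
        }
    }


# Two requests made a few seconds apart end up with the same filter body.
first = build_range_query(1452763935, 1452785535)
second = build_range_query(1452763950, 1452785540)
same_filter = first == second
```

The trade-off is that the window becomes slightly wider than requested, so this only fits cases where that loss of precision is acceptable.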


(Carlos Terron) #3

Thanks

But the range is cached only if it is the same query, right? If I change "gte" or "lte" to another timestamp, am I out of this caching, or is ES intelligent enough to add to or remove from the previously cached data?

