Query cache not being used


(Narcis Madern) #1

Good morning,

The problem we are facing is that the query cache is not being used at all in our ES cluster.

We don’t have any configuration that disables it (at least none that we are aware of), but the query cache is still not being used (always 0 bytes used):

"query_cache": {
  "memory_size_in_bytes": 0,
  "total_count": 0,
  "hit_count": 0,
  "miss_count": 0,
  "cache_size": 0,
  "cache_count": 0,
  "evictions": 0
},

In contrast, the request cache is working just fine:

"request_cache": {
  "memory_size_in_bytes": 189465492,
  "evictions": 0,
  "hit_count": 25253,
  "miss_count": 4158
},

As far as I know, the query cache stores the results of individual filters, so it can be reused across different queries as long as those queries share some of their filters. That makes it really useful when queries differ but have part of their filters in common.
The fact that this cache is not used is hurting us, since we only benefit from caching when requests are exactly the same (request cache).
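
To put a number on the difference between the two caches, here is a minimal Python sketch that computes a hit ratio from the stats blocks pasted above. The `hit_ratio` helper is hypothetical (it is not part of any Elasticsearch client); the values are copied from the output above. The request cache answers roughly 86% of its lookups, while the query cache has not seen a single lookup.

```python
def hit_ratio(stats):
    """Return hits / (hits + misses), or None when the cache saw no lookups."""
    lookups = stats["hit_count"] + stats["miss_count"]
    if lookups == 0:
        return None  # nothing was ever looked up, so a ratio is undefined
    return stats["hit_count"] / lookups


# Values copied from the stats output in this post.
query_cache = {
    "memory_size_in_bytes": 0,
    "total_count": 0,
    "hit_count": 0,
    "miss_count": 0,
    "cache_size": 0,
    "cache_count": 0,
    "evictions": 0,
}
request_cache = {
    "memory_size_in_bytes": 189465492,
    "evictions": 0,
    "hit_count": 25253,
    "miss_count": 4158,
}

print("query cache hit ratio:", hit_ratio(query_cache))
print("request cache hit ratio:", hit_ratio(request_cache))
```

A result of `None` for the query cache is different from a ratio of 0: it means the cache was never consulted at all, which matches `total_count` being 0.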

Could you please help us understand why this cache is not being used in our case?

According to some answers I found on the internet (for example https://stackoverflow.com/questions/48536363/why-is-my-elasticsearch-query-cache-empty), version 6.x of ES has problems with this cache, but we are using version 5.x. Moreover, the page I referenced states that the problem only affects term filters, and we also have several range filters, so those should use the query cache without problems.

In case this information offers any clue:
We are using indices of about 2 to 10 million documents, and we compute several aggregations (and sub-aggregations) on every request to ES. Our queries always contain some filtering: usually terms filters (some of which are almost always present with the same values), and often range filters as well.

Any advice or idea could really help us, since we are totally blind right now.

Thanks in advance


(Narcis Madern) #2

It seems that, unfortunately, nobody knows how to solve this issue or, at least, nobody can help us understand where we might be doing something wrong :pensive:


(Luca Wintergerst) #3

Hi Narcis,
as of Elasticsearch 6.x we no longer cache term queries as it's usually quicker and cheaper to just run the query. This is something that was removed from Lucene itself, which is why Elasticsearch no longer does it.
So if you're only running term queries it's expected to not utilize the query_cache.

I'm still curious why the range queries are not cached though. There could be a number of reasons for this.
Can you please show me the exact query that you are sending Elasticsearch?
Please give me an example where you are using a range query in combination with terms.

Have a good weekend!
Luca


(Narcis Madern) #4

Hi Luca,

We are still using ES 5.x, so it should be caching terms as well.
In any case, our requests are really heavy because they include a LOT of aggregations to be computed (and a sub-aggregation for every one of them for calculating all the counters).

Moreover, most of our requests contain a few term filters that are almost always present, so, as I understand it, they should benefit from the query cache.

You will find an example request below (warning: brick wall :stuck_out_tongue: ).
Notice the range filters it includes. Range filters like these on party composition repeat from query to query most of the time (sometimes alongside additional filters), so they could benefit a lot from the query cache. Is that correct?

Thanks for your reply and for trying to help us out.

GET index/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "range": { "partyComposition.minAdults": { "gte": "0", "lt": "100" } } },
        { "range": { "partyComposition.maxAdults": { "gte": "0", "lt": "100" } } },
        { "range": { "partyComposition.minOccupancy": { "gte": "0", "lt": "100" } } },
        { "range": { "partyComposition.maxOccupancy": { "gte": "0", "lt": "100" } } }
      ]
    }
  },
  "aggs": {
    "priceProperties.departureAirport": {
      "terms": { "field": "priceProperties.departureAirport", "size": 500 },
      "aggs": { "distinct_acco": { "cardinality": { "field": "accoProperties.accoDeskId" } } }
    },
    "priceProperties.duration": {
      "terms": { "field": "priceProperties.duration", "size": 500 },
      "aggs": { "distinct_acco": { "cardinality": { "field": "accoProperties.accoDeskId" } } }
    },
    "accoProperties.collection": {
      "terms": { "field": "accoProperties.collection", "size": 500 },
      "aggs": { "distinct_acco": { "cardinality": { "field": "accoProperties.accoDeskId" } } }
    },
    "priceProperties.departureDate": {
      "terms": { "field": "priceProperties.departureDate", "size": 500 },
      "aggs": { "distinct_acco": { "cardinality": { "field": "accoProperties.accoDeskId" } } }
    },
    "priceProperties.mealPlanCode": {
      "terms": { "field": "priceProperties.mealPlanCode", "size": 500 },
      "aggs": { "distinct_acco": { "cardinality": { "field": "accoProperties.accoDeskId" } } }
    },
    "accoProperties.countryAgnosticKey": {
      "terms": { "field": "accoProperties.countryAgnosticKey", "size": 500 },
      "aggs": { "distinct_acco": { "cardinality": { "field": "accoProperties.accoDeskId" } } }
    },
    "accoProperties.regionAgnosticKey": {
      "terms": { "field": "accoProperties.regionAgnosticKey", "size": 500 },
      "aggs": { "distinct_acco": { "cardinality": { "field": "accoProperties.accoDeskId" } } }
    },
    "accoProperties.cityAgnosticKey": {
      "terms": { "field": "accoProperties.cityAgnosticKey", "size": 500 },
      "aggs": { "distinct_acco": { "cardinality": { "field": "accoProperties.accoDeskId" } } }
    },
    "accoProperties.stars": {
      "range": {
        "field": "accoProperties.stars",
        "ranges": [
          { "from": 1.0, "to": 6.0, "key": "1" },
          { "from": 2.0, "to": 6.0, "key": "2" },
          { "from": 3.0, "to": 6.0, "key": "3" },
          { "from": 4.0, "to": 6.0, "key": "4" },
          { "from": 5.0, "to": 6.0, "key": "5" }
        ]
      },
      "aggs": { "distinct_acco": { "cardinality": { "field": "accoProperties.accoDeskId" } } }
    },
    "accoProperties.themes": {
      "terms": { "field": "accoProperties.themes", "size": 500 },
      "aggs": { "distinct_acco": { "cardinality": { "field": "accoProperties.accoDeskId" } } }
    },
    "priceProperties.transportType": {
      "terms": { "field": "priceProperties.transportType", "size": 500 },
      "aggs": { "distinct_acco": { "cardinality": { "field": "accoProperties.accoDeskId" } } }
    },
    "priceProperties.discounts": {
      "terms": { "field": "priceProperties.discounts", "size": 500 },
      "aggs": { "distinct_acco": { "cardinality": { "field": "accoProperties.accoDeskId" } } }
    },
    "accoProperties.accoType": {
      "terms": { "field": "accoProperties.accoType", "size": 500 },
      "aggs": { "distinct_acco": { "cardinality": { "field": "accoProperties.accoDeskId" } } }
    },
    "accoProperties.metersToCenter": {
      "range": {
        "field": "accoProperties.metersToCenter",
        "ranges": [
          { "from": 0.0, "to": 50.0, "key": "0" },
          { "from": 0.0, "to": 100.0, "key": "0-100" },
          { "from": 0.0, "to": 250.0, "key": "0-250" },
          { "from": 0.0, "to": 500.0, "key": "0-500" },
          { "from": 0.0, "to": 1000.0, "key": "0-1000" },
          { "from": 1000.0, "key": "1000-" }
        ]
      },
      "aggs": { "distinct_acco": { "cardinality": { "field": "accoProperties.accoDeskId" } } }
    },
    "is_last_minute": {
      "date_range": {
        "field": "priceProperties.departureDate",
        "format": "dd-MM-yyyy",
        "ranges": [
          { "to": "21-09-2019" }
        ]
      },
      "aggs": { "distinct_acco": { "cardinality": { "field": "accoProperties.accoDeskId" } } }
    },
.........................

(Some aggregations have been cut because post length is limited on the forum.)


(Luca Wintergerst) #5

Hi Narcis,
just letting you know that I did not forget you. Having a busy week but I'll make sure to get you your answer.
Luca


(Luca Wintergerst) #6

Hi Narcis,
I followed up with one of my colleagues on this.
Term queries are not cached as of 5.1. Which version are you running exactly?

You mentioned that the range queries are executed frequently. How often is a single range filter executed per day?
You are right, caching this should give you an advantage.

Can you also send me the full query, by uploading it as a gist and linking to it here? Maybe there's something in it that explains the behaviour, but to be certain I need to see the full JSON request. Please make sure that it is the complete request and that it is not cut off anywhere.
Can you also send me the output of GET _cluster/settings and GET _cluster/settings?include_defaults=true
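
If the settings output is long, a small script can narrow it down to cache-related entries. The following is only a rough Python sketch: `flatten` and `cache_settings` are hypothetical helpers, and `sample` is a made-up response used for illustration, not output from a real cluster. It assumes the response has the usual `persistent`, `transient` and `defaults` sections and accepts both nested and flat (dotted) keys.

```python
def flatten(settings, prefix=""):
    """Turn nested settings into a flat dict keyed by dotted paths."""
    flat = {}
    for key, value in settings.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def cache_settings(response):
    """Collect every setting whose dotted name contains 'cache'."""
    found = {}
    for section in ("persistent", "transient", "defaults"):
        for key, value in flatten(response.get(section, {})).items():
            if "cache" in key:
                found[f"{section}:{key}"] = value
    return found


# Hypothetical response, shaped like GET _cluster/settings?include_defaults=true.
sample = {
    "persistent": {},
    "transient": {},
    "defaults": {
        "indices": {"queries": {"cache": {"size": "10%"}}},
        "cluster": {"name": "example"},
    },
}

print(cache_settings(sample))
```

Anything that shows up under `persistent` or `transient` was set explicitly, so those entries are the first place to look.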

I promise to reply faster next time. Thanks for your patience.


(Luca Wintergerst) #7

We managed to find the root cause in a private chat. This happens in versions 5.3.x to 5.4.0.

The fix is to upgrade to >5.4.1.
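
For anyone who wants to check their own cluster, here is a minimal Python sketch of the version range mentioned above. The helper names are hypothetical, and the bounds are simply the ones stated in this thread (5.3.x to 5.4.0), not taken from release notes. It assumes plain `major.minor.patch` version strings such as the `version.number` field returned by `GET /`.

```python
def parse_version(text):
    """Split 'major.minor.patch' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


def in_affected_range(version):
    """True if the version falls in the 5.3.x to 5.4.0 range from this thread."""
    return (5, 3, 0) <= parse_version(version) <= (5, 4, 0)


for candidate in ("5.2.1", "5.3.2", "5.4.0", "5.4.2"):
    print(candidate, in_affected_range(candidate))
```

Comparing tuples of integers avoids the string-comparison trap where "5.10.0" would sort before "5.4.0".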