Filter cache configuration

Hi all,

My queries are always filtered by a group ID composed of letters (usually 4).
The list of groups can be large, up to 2000.

I've seen the available options on the following page:
http://www.elasticsearch.org/guide/reference/query-dsl/terms-filter.html

But I'm not sure I understand them.
Would using the options "execution": "bool" and "_cache": true improve
performance?

What value of indices.cache.filter.size are you using? 20% seems quite low
to me.
(From this page
http://www.elasticsearch.org/guide/reference/index-modules/cache.html)

Any other tips or recommendations about filter cache tuning?

Thanks, Benoît

--

Hello Benoît,

On Thu, Oct 25, 2012 at 4:51 PM, Benoît benoit.intrw@gmail.com wrote:

Would using the options "execution": "bool" and "_cache": true improve
performance?

If you filter by only one group ID, I think the term filter would be
preferable. It is cached by default, so the problem is solved :)
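For illustration, a single-group request could be sketched as below. Python is used only to build and print the JSON body. The field name "group" and the value "abcd" are made up, and the "filtered" query syntax is assumed to be the one used by Elasticsearch releases from the time of this thread.

```python
import json


def single_group_query(group_id, field="group"):
    """Build a request body that restricts results to one group ID.

    `field` is a hypothetical field name; replace it with the field
    that holds the group ID in your own mapping.
    """
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                # Single-value term filter. No caching option is set,
                # so the default caching behaviour applies.
                "filter": {"term": {field: group_id}},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(single_group_query("abcd"), indent=2))
```

The printed JSON can be sent as the body of a search request.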

If you need to filter by more than one group ID, I would choose
depending on what kind of queries are run most often. If users
normally choose the same sets of group IDs, I would leave caching
settings to default. Otherwise, I would only change the execution to
"bool", and I wouldn't turn on _cache because that would also cache
the bool filter - and I assume that would change too often - while
individual term filters will be cached by default.
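The multi-group case described above could look roughly like this. Again, the field name "group" and the sample IDs are assumptions made for the example. The filter sets "execution" to "bool" and deliberately leaves "_cache" out, so only the default caching applies.

```python
import json


def multi_group_filter(group_ids, field="group"):
    """Build a terms filter for several group IDs.

    `field` is a hypothetical field name. "_cache" is intentionally not
    set, so the combined filter is not cached explicitly.
    """
    return {
        "terms": {
            field: list(group_ids),
            # Evaluate each term as its own filter and combine them.
            "execution": "bool",
        }
    }


if __name__ == "__main__":
    print(json.dumps(multi_group_filter(["abcd", "efgh", "ijkl"]), indent=2))
```

The result goes in the "filter" part of a filtered query, in the same place a single term filter would go.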

What value of indices.cache.filter.size are you using? 20% seems quite low
to me.
(From this page
http://www.elasticsearch.org/guide/reference/index-modules/cache.html)

Any other tips or recommendations about filter cache tuning?

I think if you want to fine-tune your cache settings, there's no
getting away from some testing and monitoring. Here's an interesting
blog article that might help:
ElasticSearch Cache Usage - Sematext

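Before testing, a back-of-the-envelope estimate can show whether a given indices.cache.filter.size is even in the right range. The sketch below assumes each cached filter costs about one bit per document, which is a simplification and not a statement about how Elasticsearch actually stores filters. The document count and heap size are hypothetical; only the 2000 groups and the 20% setting come from this thread.

```python
def bitset_bytes(num_docs):
    """Approximate size of one cached filter, ASSUMING one bit per document."""
    return (num_docs + 7) // 8


def cache_needed_bytes(num_docs, num_filters):
    """Memory needed to keep every filter cached at the same time."""
    return bitset_bytes(num_docs) * num_filters


def fits_in_cache(num_docs, num_filters, heap_bytes, cache_fraction=0.20):
    """True if all filters fit in `cache_fraction` of the heap."""
    return cache_needed_bytes(num_docs, num_filters) <= heap_bytes * cache_fraction


if __name__ == "__main__":
    docs = 10_000_000      # hypothetical index size
    groups = 2000          # upper bound on groups mentioned in the thread
    heap = 4 * 1024 ** 3   # hypothetical 4 GiB heap
    needed_mib = cache_needed_bytes(docs, groups) / 1024 ** 2
    print("estimated cache needed: %.0f MiB" % needed_mib)
    print("fits in 20%% of heap: %s" % fits_in_cache(docs, groups, heap))
```

An estimate like this only narrows the range. Actual cache usage and evictions still have to be checked by monitoring.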
Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene

--

Hello,

Thank you very much for your explanations.

On Thursday, October 25, 2012 11:01:43 PM UTC+2, Radu Gheorghe wrote:

If you need to filter by more than one group ID, I would choose
depending on what kind of queries are run most often. If users
normally choose the same sets of group IDs, I would leave caching
settings to default. Otherwise, I would only change the execution to
"bool", and I wouldn't turn on _cache because that would also cache
the bool filter - and I assume that would change too often - while
individual term filters will be cached by default.

Yes, this is my use case.

I think if you want to fine-tune your cache settings, there's no
getting away from some testing and monitoring. Here's an interesting
blog article that might help:
ElasticSearch Cache Usage - Sematext

Oh yes, excellent article. I knew about it but hadn't bookmarked it; that's done now!

Regards.

Benoît

--