I am building an application that performs aggregations over time-series
data.
The prevailing advice for my situation seems to be that I should use
filters rather than queries to scope my aggregations, for two reasons:
1) I have no need for scoring.
2) I will be able to take advantage of filter caching.
However, a very common use case is for my users to scope aggregations to a
completely arbitrary time range. This means I am relatively unlikely to
receive many requests scoped to exactly the same time range.
If I implement this using a range filter, what does this mean for filter
caching? Is Elasticsearch going to waste time and memory building a separate
filter cache entry for each individual range it sees (and should I hence use
a query instead)? Or is it smarter than that?
Yes, Elasticsearch is going to create a filter cache entry per filter.
If you want to override this behavior, you can set _cache to false in
your filter, as follows:
"filter" : {
    "fquery" : {
        "query" : {
            "query_string" : {
                "query" : "this AND that OR thus"
            }
        },
        "_cache" : false
    }
}
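For the range filter case asked about, the same flag could probably be applied along these lines. This is only a sketch: the field name "timestamp", the aggregation name "events_per_hour", and the helper scoped_aggregation_body are made up for illustration, and the placement of _cache inside the range filter assumes the 1.x-style filter DSL, so check it against the version you run.

```python
import json


def scoped_aggregation_body(start, end, field="timestamp", cache=False):
    """Build a search body whose aggregation is scoped by a range filter.

    `field` and the aggregation name are hypothetical placeholders.
    `_cache` placement assumes the Elasticsearch 1.x filter DSL.
    """
    return {
        "size": 0,  # only the aggregation result is wanted, not the hits
        "query": {
            "filtered": {
                "filter": {
                    "range": {
                        field: {"gte": start, "lt": end},
                        "_cache": cache,  # False: do not cache this one-off range
                    }
                }
            }
        },
        "aggs": {
            "events_per_hour": {
                "date_histogram": {"field": field, "interval": "1h"}
            }
        },
    }


if __name__ == "__main__":
    body = scoped_aggregation_body("2014-06-01T00:00:00", "2014-06-01T07:30:00")
    print(json.dumps(body, indent=2))
```

The printed JSON would be sent as the body of a _search request. Leaving cache=False for arbitrary user-chosen ranges avoids filling the filter cache with entries that are unlikely to be reused.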