I understand that filters are used in conjunction with routing and aliases,
but it also seems that they can often be used interchangeably with queries.
Is there a technical difference, e.g. with regard to caching mechanisms, to
how they are implemented?
Queries and filters are similar in their ability to match documents.
However, filters do not perform any scoring and can be cached.
As a consequence, you should use filters instead of queries when you do not
care about the produced scores.
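To make that concrete, here is a minimal sketch of a search body that puts the scored part in a query and the yes/no part in a filter. It assumes the 1.x-era `filtered` query syntax that was current when this thread was written. The field names `title` and `status` and the helper `filtered_search` are made up for illustration.

```python
import json


def filtered_search(text, status):
    """Build a search body: a scored match query plus an unscored term filter.

    `title` and `status` are hypothetical field names.
    """
    return {
        "query": {
            "filtered": {
                # Scored: relevance of `text` against the title matters.
                "query": {"match": {"title": text}},
                # Not scored: a document either has this status or it does not,
                # so the result is a good candidate for the filter cache.
                "filter": {"term": {"status": status}},
            }
        }
    }


body = filtered_search("caching", "published")
print(json.dumps(body, indent=2))
```

The rule of thumb is the one above: anything that only includes or excludes documents goes in the filter, and only the part whose score you care about stays in the query.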
partially "close the gap" in performance? The cache won't be as lightweight
as the filter cache (as the key is the entire JSON query rather than a
bitset and operates at shard-level), but is a cache nonetheless.
Or am I comparing two completely different things?
Filters work by caching bitsets, which gives a fast lookup of the documents
that match the criteria. It looks like the query cache actually stores the
hits returned from within the node. For now, it looks like the query cache
will not help with fetching documents (the documentation says
count/aggregation/suggestions only), and when/if it does store search
results, it won't be optimized for frequent updates to the index, since each
update invalidates the cached result. Filters, on the other hand, are
updated aggressively by ES, so new documents get incrementally added to the
bitset.
So to answer the question, they both have similar goals, but they
accomplish them differently, so you will have to decide which
implementation fits your needs better.
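The difference can be sketched with a toy model. This is not how ES implements either cache; it only assumes that filter bitsets are kept per segment and that a cached whole-request result is thrown away whenever the index changes. All class and field names here are hypothetical.

```python
import json


class ToyIndex:
    """Toy index: a list of segments, each a list of documents."""

    def __init__(self):
        self.segments = []
        self.version = 0  # bumped on every change, stands in for a refresh

    def add_segment(self, docs):
        self.segments.append(list(docs))
        self.version += 1


class BitsetFilterCache:
    """Caches one bitset (a Python int) per (field, value, segment)."""

    def __init__(self):
        self.bitsets = {}
        self.docs_scanned = 0

    def matches(self, index, field, value):
        hits = []
        for seg_no, segment in enumerate(index.segments):
            key = (field, value, seg_no)
            if key not in self.bitsets:
                bits = 0
                for doc_no, doc in enumerate(segment):
                    self.docs_scanned += 1
                    if doc.get(field) == value:
                        bits |= 1 << doc_no
                self.bitsets[key] = bits
            bits = self.bitsets[key]
            hits.extend(d for n, d in enumerate(segment) if (bits >> n) & 1)
        return hits


class WholeRequestCache:
    """Caches a final count, keyed by the request JSON and the index version."""

    def __init__(self):
        self.results = {}
        self.docs_scanned = 0

    def count(self, index, request):
        key = (json.dumps(request, sort_keys=True), index.version)
        if key not in self.results:
            total = 0
            for segment in index.segments:
                for doc in segment:
                    self.docs_scanned += 1
                    if doc.get(request["field"]) == request["value"]:
                        total += 1
            self.results[key] = total
        return self.results[key]


index = ToyIndex()
index.add_segment([{"status": "published"}, {"status": "draft"}])
filters, requests = BitsetFilterCache(), WholeRequestCache()
request = {"field": "status", "value": "published"}

filters.matches(index, "status", "published")
requests.count(index, request)

index.add_segment([{"status": "published"}])  # new documents arrive

hits = filters.matches(index, "status", "published")
total = requests.count(index, request)
print(filters.docs_scanned)   # 3: only the new segment was scanned again
print(requests.docs_scanned)  # 5: the whole index was scanned again
```

In this model an index update costs the bitset cache only the work for the new segment, while the whole-request cache has to start over, which is the trade-off described above.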
On Tuesday, January 13, 2015 at 1:49:06 PM UTC-8, AndrewK wrote:
partially "close the gap" in performance? The cache won't be as
lightweight as the filter cache (as the key is the entire JSON query rather
than a bitset and operates at shard-level), but is a cache nonetheless.
Or am I comparing two completely different things?