Just pushed support for the numeric_range filter. It's exactly like the
range filter in syntax, but uses the field data cache to perform the range
filtering instead of the regular (Lucene) range filter.
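To make "exactly like the range filter in syntax" concrete, here is a rough sketch of the two request bodies built as Python dicts. The "filtered" query wrapper, the match_all query, and the "age" field are assumptions for illustration, and range_filter_body is a hypothetical helper, not part of any client library; only the filter name changes between the two.

```python
import json

def range_filter_body(filter_name, field, lower, upper):
    """Build a search body; filter_name is "range" or "numeric_range".

    Hypothetical helper: the surrounding "filtered" query and the
    from/to parameter names are assumed for illustration.
    """
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    filter_name: {
                        field: {"from": lower, "to": upper}
                    }
                },
            }
        }
    }

regular = range_filter_body("range", "age", 10, 20)
numeric = range_filter_body("numeric_range", "age", 10, 20)

# The inner field/from/to part is identical; only the filter name differs.
print(json.dumps(numeric, indent=2))
```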
How does it work? The regular range filter (which also handles numeric
values) uses the structure of how numeric data is indexed (Trie based) to
fetch all the matching docs and create a bitset for it (each position maps
to a doc_id, bit with value 1 means a hit). This is always computed,
regardless of the query executed. The result of the filter is cached by
elasticsearch, so any subsequent calls using the same range values will be
really fast as they don't have to be computed again. The reason this is
always cached is that it is already in the form (bitset) of a cached
filter.
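A toy model of the behaviour described above, not the actual Lucene or elasticsearch code: a sorted list of (value, doc_id) pairs stands in for the trie-encoded numeric terms, and a dict keyed by the range bounds stands in for the filter cache. The class name, the inclusive bounds, and the sample ages are all assumptions for illustration.

```python
from bisect import bisect_left, bisect_right

class CachedRangeFilter:
    """Toy range filter: builds one bit per doc, caches it per (lower, upper)."""

    def __init__(self, values_by_doc):
        self.num_docs = len(values_by_doc)
        # Sorted (value, doc_id) pairs stand in for the indexed numeric terms.
        self.index = sorted((v, d) for d, v in enumerate(values_by_doc))
        self.keys = [v for v, _ in self.index]
        self.cache = {}
        self.computations = 0  # how many times a bitset was actually built

    def bitset(self, lower, upper):
        key = (lower, upper)
        if key not in self.cache:
            self.computations += 1
            bits = [0] * self.num_docs
            # Walk every indexed value in [lower, upper], whatever the query is.
            start = bisect_left(self.keys, lower)
            end = bisect_right(self.keys, upper)
            for _, doc_id in self.index[start:end]:
                bits[doc_id] = 1  # 1 means this doc_id is a hit
            self.cache[key] = bits
        return self.cache[key]

ages = [15, 42, 19, 8, 33]          # position = doc_id
range_filter = CachedRangeFilter(ages)
teens = range_filter.bitset(11, 19)  # computed once
again = range_filter.bitset(11, 19)  # same bounds: served from the cache
print(teens, range_filter.computations)
```

Repeating the same bounds reuses the stored bitset, which is why a filter with fixed from/to values benefits so much from caching.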
The numeric_range filter uses the field data cache in order to do the
filtering. The field data cache basically uninverts the index, and stores
the value(s) of a field indexed by doc id. The field data cache is used when
sorting, or when using facets. The numeric_range filter will usually be
much faster than the regular range filter, as it only computes and filters
per doc (that the main query matches on) and the computation is really
fast. This comes at the cost of loading all the field values into memory,
which might be ok if you are already using them for faceting / sorting.
This filter's result is not cached by default (as caching requires passing
over all docs and computing against them).
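A minimal sketch of the idea, assuming a single-valued field: the field data cache is modelled as a plain list indexed by doc id, and the filter only looks at the docs the main query already matched. The function name and the sample data are hypothetical.

```python
def numeric_range_filter(field_data, candidate_docs, lower, upper):
    """Keep only the candidate docs whose field value lies in [lower, upper].

    field_data is the "uninverted" index: field_data[doc_id] is the value.
    candidate_docs are the doc ids the main query matched.
    """
    return [
        doc_id
        for doc_id in candidate_docs
        if lower <= field_data[doc_id] <= upper
    ]

field_data = [15, 42, 19, 8, 33]   # loaded into memory once: doc_id -> age
query_hits = [1, 2, 4]             # docs matched by the main query

# Only three lookups are needed, however many docs the index holds.
print(numeric_range_filter(field_data, query_hits, 11, 35))  # [2, 4]
```

The work is proportional to the number of query hits rather than to the number of docs in the range, at the price of keeping every value of the field in memory.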
When do you use which? If you have an age filter for "teens" (>10, <20), then
using the regular range filter is a great choice. This filter is going to be
repeated a lot in different search operations, and the range filter caches
the results. No need to load the age field into the data cache.
If, on the other hand, you have a range filter that can't be cached easily
(since it does not have repetitive from/to values), then numeric_range is a
great candidate to give that query a boost.
As a side note, the best solution is for things to be automatic and not
exposed to the user (at least the defaults should be). I am working on
trying to write something that will automatically use the numeric_range
filter if the field is already loaded into the field data cache.