Term vs terms filter


(Matthias Johnson) #1

I'm filtering on terms. It seems that there are two choices: the term filter (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-term-filter.html) and the terms filter (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-terms-filter.html).

In my case I display a list of terms from which the user can select the
ones to filter on. This works very well with the terms filter, since it
takes an array. Additionally, the terms filter does just fine with a single
value in the array.

The terms filter also offers much in terms of flexibility for tweaking the
execution mode and caching, which the term filter does not.
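
As a rough illustration of the tuning options mentioned above, this Python sketch builds a terms filter body with the `execution` and `_cache` options set. The field name `tag`, the helper `terms_filter`, and the chosen values are hypothetical; the option names are the ones I recall from the terms filter documentation linked above for the 1.x-era DSL, so please check them against the version you run.

```python
import json

def terms_filter(field, values, execution=None, cache=None):
    """Build a terms filter body (hypothetical helper, 1.x-style filter DSL)."""
    body = {field: list(values)}
    # Optional tuning knobs; left out entirely when not requested.
    if execution is not None:
        body["execution"] = execution
    if cache is not None:
        body["_cache"] = cache
    return {"terms": body}

# "tag", "red" and "blue" are made-up example values.
flt = terms_filter("tag", ["red", "blue"], execution="bool", cache=True)
print(json.dumps(flt, indent=2))
```

The term filter takes a single value per field, so these knobs only matter once a list of values is involved.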

That makes me wonder about the term filter and where it should be used.

Are there any usage guidelines for one over the other? It seems that passing
an array to terms would not be much different from building an array of
term filters inside an or filter.
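
To make the comparison concrete, here is a minimal Python sketch that builds both filter bodies from the same list of user-selected values. The helper names `as_terms` and `as_or_of_terms`, the field `tag`, and the sample values are hypothetical; the JSON shapes follow the 1.x-era filter DSL discussed in this thread.

```python
import json

def as_terms(field, values):
    """One terms filter that takes the whole list of values."""
    return {"terms": {field: list(values)}}

def as_or_of_terms(field, values):
    """An or filter wrapping one term filter per value."""
    return {"or": [{"term": {field: v}} for v in values]}

selected = ["red", "blue"]  # values the user picked (made-up example)
terms_body = as_terms("tag", selected)
or_body = as_or_of_terms("tag", selected)

print(json.dumps(terms_body))
print(json.dumps(or_body))
```

Both bodies express "field matches any of these values"; the terms form stays one filter no matter how many values are selected.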

@matthias

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/7336d8cc-0a6b-4094-969a-61c8de3bb2f1%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Binh Ly) #2

Matthias, you're correct in your observations. The only thing you might want
to be aware of is the terms filter is automatically cached, whereas an or
filter of many term filters is not automatically cached. But the results
you will get back should be the same.



(Matthias Johnson) #3

Thanks Binh. Out of curiosity, do you know of any performance differences
between the two choices? It would seem that the terms filter, with its
caching, may provide better performance.

@matthias

On Friday, January 31, 2014 10:34:50 AM UTC-7, Binh Ly wrote:

Matthias, you're correct in your observations. The only thing you might
want to be aware of is the terms filter is automatically cached, whereas an
or filter of many term filters is not automatically cached. But the results
you will get back should be the same.



(Binh Ly) #4

Matthias,

For the terms filter, let's say you do:

{
  "query": {
    "filtered": {
      "filter": {
        "terms": {
          "a": [
            "a",
            "b"
          ]
        }
      }
    }
  }
}

If you run the exact query above many times, subsequent calls will be
faster, since the first call caches the result of the filter.

The or filter is not cached by default (but you can cache it explicitly if
needed), so if you run an or filter many times, it will be re-evaluated on
each call unless you explicitly cache it.
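
For comparison, an explicitly cached or filter could be sketched like this in Python. The helper `cached_or_query` is hypothetical, and the `filters` / `_cache` layout is the 1.x-era or filter form as I understand it from the documentation, so treat it as an assumption to verify against your version.

```python
import json

def cached_or_query(field, values):
    """Wrap an or filter of term filters in a filtered query, with caching on."""
    or_filter = {
        "or": {
            # One term filter per value, same field throughout.
            "filters": [{"term": {field: v}} for v in values],
            # Ask for the combined result to be cached (assumed 1.x option).
            "_cache": True,
        }
    }
    return {"query": {"filtered": {"filter": or_filter}}}

# Same field and values as the terms example above.
query = cached_or_query("a", ["a", "b"])
print(json.dumps(query, indent=2))
```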

This might help also:



(Aniruddha) #5

As discussed here and as stated in the documentation, the only difference is caching.
So does this mean that if I cache an or filter over a big list (30+) of individual term filters, it will have no performance drawback compared to a terms filter with the same 30+ values?

