How to optimize filters using better order and combinations (bool vs and/or/not)?


(Laurent T.) #1

Hi,

I'm currently working on optimizing a filtered query. It's basically a match_all query with over 40 filters.
I mainly use Bool Filters that wrap Terms Filters, Missing Filters, Not Filters (applied to Terms Filters), plus some Range Filters and Geo* Filters (namely Geohash Cell Filter and GeoShape Filter).

I found the article "All About Elasticsearch Filter BitSets", which explains when to use Bool Filters rather than AND/OR/NOT filters, but it's marked as outdated.
I've also been reading about Filter Order.

I realized my 40 filters weren't organized at all and I want to optimize the order and the use of bitsets and cache.

I'm fairly sure I understand how order affects performance, but as for Bool vs AND/OR/NOT, is that article still accurate? How should I combine my various filters?

I was thinking of starting from the top with a global AND filter, then putting a global Bool filter inside it, followed by my range and geo filters. But what if I need to combine range filters deep inside Bool filters? Should I just do bool.must(and(range,range),terms,terms)?
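
To make that last shape concrete, here is a rough sketch of the nesting I mean, written as a Python dict that serializes to the ES 1.x filtered-query DSL. The field names and values (`price`, `created_at`, `status`, `category`) are made up for illustration; this only shows the structure I'm asking about, not something I know to be the right approach.

```python
import json

# Hypothetical fields and values; only the nesting matters here.
range_filters = [
    {"range": {"price": {"gte": 10, "lte": 100}}},
    {"range": {"created_at": {"gte": "2015-01-01"}}},
]
terms_filters = [
    {"terms": {"status": ["active", "pending"]}},
    {"terms": {"category": ["books", "music"]}},
]

# bool.must(and(range, range), terms, terms)
nested_filter = {
    "bool": {
        "must": [{"and": range_filters}] + terms_filters
    }
}

# ES 1.x style: a match_all query wrapped in a filtered query.
query = {
    "query": {
        "filtered": {
            "query": {"match_all": {}},
            "filter": nested_filter,
        }
    }
}

print(json.dumps(query, indent=2))
```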

Also, I do a lot of these: bool.should(not(terms),missing). Should I be using a mustNot bool filter instead of a notFilter inside my should?
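
For reference, the two shapes I'm comparing look roughly like this, again as Python dicts that serialize to the 1.x filter DSL. The field name `tags` and its values are hypothetical, and I'm only assuming the two forms are interchangeable, which is part of what I'd like confirmed.

```python
import json

# Hypothetical field and values, for illustration only.
excluded = {"terms": {"tags": ["spam", "archived"]}}
missing = {"missing": {"field": "tags"}}

# Current shape: bool.should(not(terms), missing)
with_not_filter = {
    "bool": {"should": [{"not": excluded}, missing]}
}

# Alternative: replace the not filter with a nested bool using must_not
# (mustNot in the Java API corresponds to must_not in the JSON DSL).
with_must_not = {
    "bool": {"should": [{"bool": {"must_not": [excluded]}}, missing]}
}

print(json.dumps(with_not_filter, indent=2))
print(json.dumps(with_must_not, indent=2))
```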

One last thing: the final example in the outdated article I referenced above places a range query inside the bool query, between two terms queries. Doesn't that contradict what the article itself explains?

Thanks for your advice.

Cheers

Note: I'm currently using ES 1.5 but we'll be moving to the latest version soon.


(Laurent T.) #2

I've reordered my filters, but I still don't know how I should handle my range and geo filters relative to my Bool filters. Should I be using AND/OR/NOT? Should I replace my NOT filters with bool.mustNot filters?

Thanks

