Bool query with filter is slower than filtered query

eprst · April 12, 2016, 5:04pm

We tested a more_like_this query inside a filtered query with some filters. But the filtered query is deprecated, so we converted it to a bool query: we put more_like_this inside the must clause and all the filters inside filter. Result counts are the same, but the new query is ~20% slower than the old one. How is that possible if the filtered query should be transformed to a bool query internally?

eprst · April 14, 2016, 1:31am

When I put the filters inside another bool.must (
bool: {
  filter: {
    bool: {
      must: [{filter1}, {filter2}]
    }
  },
  must: {mlt}
} )
it performs the same as the filtered query. Can anyone explain why I should have to do this?

jpountz · April 14, 2016, 1:36pm

Can you also provide us with the filtered query that you ran, as well as the slow bool query?

eprst · April 14, 2016, 2:01pm

{"query": {"bool": {"filter": [{"terms": {"book_id": [some ids here]}}, {"term": {"not_available": false}}, {"bool": {"should": [{"exists": {"field": "link"}}, {"exists": {"field": "ISBN"}}]}}], "must": {"more_like_this": {"fields": ["text_field"], "like": [{"_type": "chapters", "_id": id}]}}}}}

{"query": {"filtered": {"filter": {"and": [{"terms": {"not_available": ["false"]}}, {"bool": {"should": [{"exists": {"field": "link"}}, {"exists": {"field": "ISBN"}}]}}, {"terms": {"book_id": [some ids here]}}]}, "query": {"more_like_this": {"fields": ["text_field"], "like": [{"_type": "chapters", "_id": id}]}}}}}
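For readers following along, the conversion described in the first post (more_like_this goes into must, the filters go into filter) could be sketched in Python roughly as below. This is only an illustration and not how Elasticsearch rewrites the query internally. The helper name `filtered_to_bool`, the book IDs `[1, 2]` and the `_id` value `"some-chapter-id"` are placeholders standing in for the values elided above.

```python
def filtered_to_bool(body):
    """Turn a deprecated `filtered` query body into an equivalent `bool` body.

    Hypothetical helper: it only handles an `and` filter given as a plain
    list of clauses, which is the form used in this thread.
    """
    filtered = body["query"]["filtered"]
    flt = filtered.get("filter")

    if flt is None:
        clauses = []
    elif isinstance(flt, dict) and isinstance(flt.get("and"), list):
        # Each clause of the `and` filter becomes one entry in bool.filter.
        clauses = list(flt["and"])
    else:
        clauses = [flt]

    bool_query = {"filter": clauses}
    if "query" in filtered:
        # The scoring part of the filtered query goes into bool.must.
        bool_query["must"] = filtered["query"]
    return {"query": {"bool": bool_query}}


# Placeholder values: the real IDs are not shown in the thread.
old_query = {"query": {"filtered": {
    "filter": {"and": [
        {"terms": {"not_available": ["false"]}},
        {"bool": {"should": [{"exists": {"field": "link"}},
                             {"exists": {"field": "ISBN"}}]}},
        {"terms": {"book_id": [1, 2]}},
    ]},
    "query": {"more_like_this": {
        "fields": ["text_field"],
        "like": [{"_type": "chapters", "_id": "some-chapter-id"}],
    }},
}}}

new_query = filtered_to_bool(old_query)
```

The result has the same shape as the bool query above: three clauses at the top level of filter and the more_like_this query in must.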

jpountz · April 14, 2016, 4:41pm

I am also confused about why you are seeing different response times. Could you pass these queries to the _validate/query API (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-validate.html) with ?rewrite=true and share the output?
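As a rough sketch, the request could be prepared from Python along these lines. The base URL `http://localhost:9200` and the helper name `build_validate_request` are assumptions for illustration, the index name `index1` is taken from the output posted in this thread, and the short term query only stands in for the full queries. The snippet builds the request without sending it.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_validate_request(base_url, index, search_body):
    """Prepare (but do not send) a _validate/query request with rewrite=true.

    Hypothetical helper; base_url and index depend on your own cluster.
    """
    params = urlencode({"rewrite": "true"})
    url = f"{base_url.rstrip('/')}/{index}/_validate/query?{params}"
    # Only the query part of the search body is validated.
    data = json.dumps({"query": search_body["query"]}).encode("utf-8")
    return Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Stand-in query; substitute the full bool or filtered query here.
sample_body = {"query": {"term": {"not_available": False}}}
req = build_validate_request("http://localhost:9200", "index1", sample_body)

# To actually send it against a running cluster:
# from urllib.request import urlopen
# print(urlopen(req).read().decode("utf-8"))
```

Running it once for the bool query and once for the filtered query gives two "explanations" entries that can be compared side by side.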

eprst · April 14, 2016, 6:43pm

new
"explanations":[{"index":"index1","valid":true,"explanation":"+(+(((text_field:level text_field:aa text_field:c18 text_field:linol text_field:tabl text_field:c20 text_field:c22 text_field:australian text_field:linolen text_field:dha text_field:powder text_field:oil text_field:liquid text_field:181 text_field:lcp text_field:pufa text_field:la text_field:infant text_field:2c text_field:isom text_field:acid text_field:fatti text_field:lna text_field:tran text_field:formula)~7) -ConstantScore(_uid:chapter#7ab02ae1-063e-4e8d-bd13-19cae103e8b5)) #ConstantScore(book_id: \u0001\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0001 book_id: \u0001\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0002) #not_available:F #((ConstantScore(_field_names:link) ConstantScore(_field_names:ISBN))~1)) #ConstantScore(_type:chapter)"}]}

old
"explanations":[{"index":"index1","valid":true,"explanation":"+(+(((text_field:level text_field:aa text_field:c18 text_field:linol text_field:tabl text_field:c20 text_field:c22 text_field:australian text_field:linolen text_field:dha text_field:powder text_field:oil text_field:liquid text_field:181 text_field:lcp text_field:pufa text_field:la text_field:infant text_field:2c text_field:isom text_field:acid text_field:fatti text_field:lna text_field:tran text_field:formula)~7) -ConstantScore(_uid:chapter#7ab02ae1-063e-4e8d-bd13-19cae103e8b5)) #(+ConstantScore(not_available:F) +((ConstantScore(_field_names:link) ConstantScore(_field_names:ISBN))~1) +ConstantScore(book_id: \u0001\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0001 book_id: \u0001\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0002))) #ConstantScore(_type:chapter)"}]}

jpountz · April 18, 2016, 9:14am

This still looks very similar. The main difference is that new puts filters on the top level while old puts them in a nested filter clause. Something that is odd is that the new query uses the notavailable field while the old one uses not_available. Could it be a copy-pasting issue?

I'm not sure there is anything we can fix here. I suspect that for some reason the old query is more friendly to the JVM.

eprst · April 18, 2016, 9:44am

Yes, it is a copy-pasting issue. And as I wrote before, when I wrap all the filter clauses in another bool.must, it works faster. Should I try to construct the new query to be as similar to the old one as possible? Or should I use the old one?

jpountz · April 18, 2016, 1:26pm

If this works consistently better for you, then you can wrap filter clauses in a nested boolean query like the old one did.
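A minimal Python sketch of that wrapping, assuming the query body is built as a plain dict before being sent: the helper name `wrap_filters` is made up for this example, and the book IDs `[1, 2]` and the `_id` value `"some-chapter-id"` are placeholders for the values elided earlier in the thread.

```python
def wrap_filters(body):
    """Move the top-level bool.filter clauses into a nested bool.must.

    Hypothetical helper: returns a new body and leaves the input untouched.
    """
    bool_query = body["query"]["bool"]
    filters = bool_query.get("filter", [])
    if isinstance(filters, dict):
        # A single filter clause may be given as an object rather than a list.
        filters = [filters]

    wrapped = dict(bool_query)
    wrapped["filter"] = {"bool": {"must": list(filters)}}
    return {**body, "query": {"bool": wrapped}}


# Placeholder values: the real IDs are not shown in the thread.
flat_query = {"query": {"bool": {
    "filter": [
        {"terms": {"book_id": [1, 2]}},
        {"term": {"not_available": False}},
        {"bool": {"should": [{"exists": {"field": "link"}},
                             {"exists": {"field": "ISBN"}}]}},
    ],
    "must": {"more_like_this": {
        "fields": ["text_field"],
        "like": [{"_type": "chapters", "_id": "some-chapter-id"}],
    }},
}}}

nested_query = wrap_filters(flat_query)
```

Whether the nested form is actually faster should be measured on your own data, since the cause of the difference was not identified in this thread.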
