Filtered query vs query performance


(Paweł Młynarczyk) #1

Hello

I have a parent-child index in my app (mapping gist:
https://gist.github.com/zwrss/9953291#file-mapping). I have about 250,000
parents and 7,000,000 children.
I was trying to speed up my top_children query (gist:
https://gist.github.com/zwrss/9953291#file-parent-query) a bit, so I tried
adding some filters to ensure that the query has to score only about 300
parents (gist:
https://gist.github.com/zwrss/9953291#file-filtered-parent-query).
The effect was surprising: the original query executed in ~150 ms and
the filtered one in ~270 ms.
I had expected that adding filters to reduce the scoring work would help.
I suspected that top_children was the cause, so I decided to do some
testing: I queried only the children and added some filters to narrow the
results.
The original child query (gist:
https://gist.github.com/zwrss/9953291#file-child-query) executed in ~90 ms
and the filtered one (gist:
https://gist.github.com/zwrss/9953291#file-filtered-child-query) in ~120 ms.
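
For readers without the gists at hand, the two parent queries being compared
presumably have roughly the following shape. This is only a minimal sketch,
assuming the 1.x query DSL that was current in April 2014 (a top_children
query, optionally wrapped in a filtered query). The type name "child" and
the fields "description" and "category" are placeholders, not taken from the
gists.

```python
import json


def top_children_query(child_type, child_query, score="max", factor=5):
    """Build a top_children query: score parents from their matching children."""
    return {
        "top_children": {
            "type": child_type,      # child document type (placeholder name)
            "query": child_query,    # query that is run against the children
            "score": score,          # how child scores are combined per parent
            "factor": factor,        # over-fetch factor for child hits
        }
    }


def filtered(query, filter_):
    """Wrap a query in a 1.x-style filtered query."""
    return {"filtered": {"query": query, "filter": filter_}}


# Placeholder child query and parent filter (hypothetical field names).
child_query = {"match": {"description": "red shoes"}}
parent_filter = {"term": {"category": "footwear"}}

# Unfiltered request body: top_children on its own.
plain_body = {"query": top_children_query("child", child_query)}

# Filtered request body: the same top_children query plus a parent filter.
filtered_body = {"query": filtered(plain_body["query"], parent_filter)}

print(json.dumps(filtered_body, indent=2))
```

Note that in this shape the top_children part is identical in both requests;
the filter is an additional clause on the parent side.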

Is that the expected behaviour? Am I missing something?

best regards
Paweł Młynarczyk

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/e3cdf588-a0f2-4911-9143-59dd926469c3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Paweł Młynarczyk) #2

Update:
Filtered aliases let me speed up the query (by about 50%). The thing is
I can't use aliases, since the filter I am applying is not static. Why does
a filtered alias perform so much better than a filtered query?
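
To make the limitation concrete: a filtered alias stores its filter at the
moment the alias is created, through the aliases API. The sketch below only
builds the request body for such a call. The index name "products", the
alias name "products_footwear" and the field "category" are hypothetical.

```python
import json


def add_filtered_alias(index, alias, filter_):
    """Build the body of an aliases request that adds one filtered alias."""
    return {
        "actions": [
            {"add": {"index": index, "alias": alias, "filter": filter_}}
        ]
    }


# Hypothetical names; the filter is fixed once the alias exists.
alias_body = add_filtered_alias(
    "products",
    "products_footwear",
    {"term": {"category": "footwear"}},
)

print(json.dumps(alias_body, indent=2))
```

Because the filter is part of the alias definition, a filter that changes
from request to request would need a new alias each time, which is why this
approach does not fit a non-static filter.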

In reply to the original post by Paweł Młynarczyk (Thursday, 3 April 2014,
14:44:37 UTC+2).


(Paweł Młynarczyk) #3

I've tried querying the children first and then filtering the parents by ID,
but the scoring is then broken and the performance is even worse
(as expected). I still have not found a satisfactory solution to this
problem.
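
One reading of this two-step approach, as a rough sketch: collect the parent
IDs from the child hits, then fetch those parents with an ids filter. Since a
filter does not contribute to scoring, the child scores have to be carried
over on the client side, which is one way the scoring can end up broken. The
input format (pairs of parent ID and child score) and the type name "parent"
are assumptions made for illustration.

```python
def best_score_per_parent(child_hits):
    """Keep the highest child score seen for each parent ID."""
    best = {}
    for parent_id, score in child_hits:
        if parent_id not in best or score > best[parent_id]:
            best[parent_id] = score
    return best


def parents_by_id_body(parent_type, parent_ids):
    """Second step: fetch the parents with an ids filter (no scoring)."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "ids": {"type": parent_type, "values": sorted(parent_ids)}
                },
            }
        }
    }


# Hypothetical result of the first step: (parent ID, child score) pairs.
child_hits = [("p1", 1.2), ("p2", 0.7), ("p1", 2.5)]

scores = best_score_per_parent(child_hits)
body = parents_by_id_body("parent", scores.keys())

# Re-rank the parents on the client, since the ids filter returns no scores.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

This needs two round trips plus client-side ranking, which is consistent
with the worse performance reported above.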


(system) #4