Percolation limits

Maciej_Dziardziel · June 13, 2014, 3:15pm

Hi

I wanted to ask those who use percolation: how many queries are you
percolating?

I need to set up some equivalent of percolation for about 100k queries.
With some filtering, probably up to 10k would actually have to be checked
for each new document.
Is the idea of using the ES percolator for that insane?

Thanks
Maciej Dziardziel

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/bf587216-9630-4eed-b30f-7f6a869778ab%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

javanna · June 16, 2014, 1:19pm

Hi Maciej,
What you describe doesn't sound insane. Just make sure you use proper
filtering as much as you can to limit the number of queries executed
when percolating each document.
Also, with the percolator available since 1.0, you can scale out by
adding more nodes and distributing the percolator queries over multiple
shards. So if you reach the limit of a single shard, you can always
scale out.
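
As a rough illustration of the filtering advice above, here is a minimal Python sketch that only builds the JSON bodies involved. It assumes the 1.x percolator conventions (extra metadata fields stored next to `query` when registering under `.percolator`, and a `filter` element in the `_percolate` request); the helper names, the `category` field, and the sample values are made up for this example, so check the exact request format against the documentation for your version.

```python
import json


def percolator_query_doc(query, **metadata):
    """Body for registering one percolator query.

    Assumption: metadata fields (e.g. category) are stored next to "query"
    so they can be used later to narrow down which queries are executed.
    """
    body = {"query": query}
    body.update(metadata)
    return body


def percolate_request(doc, **metadata_terms):
    """Body for percolating `doc`, limited to queries with matching metadata."""
    body = {"doc": doc}
    terms = [{"term": {key: value}} for key, value in sorted(metadata_terms.items())]
    if len(terms) == 1:
        body["filter"] = terms[0]
    elif terms:
        # Several metadata conditions are combined so that all must hold.
        body["filter"] = {"bool": {"must": terms}}
    return body


# Hypothetical example values.
registered = percolator_query_doc(
    {"match": {"message": "bonsai tree"}}, category="gardening"
)
request = percolate_request(
    {"message": "A new bonsai tree in the office"}, category="gardening"
)

if __name__ == "__main__":
    print(json.dumps(registered, indent=2))
    print(json.dumps(request, indent=2))
```

With a filter like this, only the registered queries tagged with the same `category` would be candidates for a given document, which is how 100k stored queries could shrink to the roughly 10k mentioned in the question.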

Ralph_Meijer · June 16, 2014, 2:53pm

If I remember correctly from Martijn's presentation, each percolator
query is matched against the document sequentially. Are there
plans for using commonality between queries to do more efficient
matching, maybe using decision trees or somesuch?
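
To make the question concrete, here is a toy Python sketch of the difference between checking every query one by one and using what queries have in common to skip most of them. It is not how Elasticsearch works internally and it is not a decision tree: it swaps in a simpler idea, indexing each query under one of its required terms. The queries, the function names, and the "all terms must appear" semantics are assumptions made up for this example.

```python
from collections import defaultdict

# Toy model: each stored query is a non-empty set of terms that must ALL
# appear in the document for the query to match.
QUERIES = {
    "q1": {"elasticsearch", "percolator"},
    "q2": {"elasticsearch", "shard"},
    "q3": {"lucene"},
}


def match_sequential(doc_terms, queries):
    """Check every stored query against the document, one by one."""
    return sorted(qid for qid, terms in queries.items() if terms <= doc_terms)


def build_term_index(queries):
    """Index each query under one required term (the alphabetically first)."""
    index = defaultdict(list)
    for qid, terms in queries.items():
        index[min(terms)].append(qid)
    return index


def match_indexed(doc_terms, queries, index):
    """Verify only the queries whose indexed term occurs in the document."""
    candidates = [qid for term in doc_terms for qid in index.get(term, [])]
    return sorted(qid for qid in candidates if queries[qid] <= doc_terms)


doc = {"elasticsearch", "percolator", "limits"}
index = build_term_index(QUERIES)

if __name__ == "__main__":
    print(match_sequential(doc, QUERIES))
    print(match_indexed(doc, QUERIES, index))
```

Both functions return the same matches, but the indexed version never looks at `q3` for this document, because the term it is filed under does not occur in it.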

--
Cheers,

ralphm

javanna · June 16, 2014, 2:57pm

It is something we might look into, but there's no concrete plan for now. On
the other hand, metadata filtering (or eventually routing) lets you reduce
the number of queries that need to be run. The only problem is that users need
to do it themselves.

Maciej_Dziardziel · June 17, 2014, 7:45am

Thanks for the reply. I did some early testing and I am getting results in
about 0.7-1.4s (that's without any filtering yet), which is still within an
acceptable range for me.
I'd still like to hear about people's experience with it. It seems this is
a very rarely used feature.
