Percolation limits

Hi

I wanted to ask those who use percolation: how many queries are you
percolating?

I need to set up some equivalent of percolation for about 100k queries.
With some filtering, probably up to 10k would actually have to be checked
for each new document. Is the idea of using ES percolation for that insane?

Thanks
Maciej Dziardziel

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/bf587216-9630-4eed-b30f-7f6a869778ab%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Hi Maciej,
what you describe doesn't sound insane, just make sure you use proper
filtering as much as you can to limit the number of queries you execute
when percolating each document.
Also, with the percolator available since 1.0 you can scale out just by
adding more nodes and have the percolator queries distributed over multiple
shards. That means that if you were to reach the limit of a single shard
you could always scale out.
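
For concreteness, the filtering advice above could look roughly like this
against the 1.x percolator REST API. This is only a sketch: the index name
"alerts", the type "article" and the "customer" metadata field are made up
for illustration, and the helpers only build the request (method, path,
JSON body) without sending anything. The endpoint shapes follow the 1.x
documentation as I understand it, so check them against the version you run.

```python
import json

INDEX = "alerts"      # hypothetical index name
DOC_TYPE = "article"  # hypothetical document type


def register_query(query_id, query, metadata):
    """Build the request that stores one percolator query.

    Assumption: as in the 1.x API, a query is registered by indexing it
    under the reserved ".percolator" type, and extra top-level fields in
    the body serve as metadata that can be filtered on later.
    """
    body = dict(metadata)
    body["query"] = query
    return "PUT", "/%s/.percolator/%s" % (INDEX, query_id), body


def percolate_request(doc, metadata_filter=None):
    """Build the request that percolates one document.

    The optional filter restricts which registered queries are executed
    at all, which is the "proper filtering" mentioned above.
    """
    body = {"doc": doc}
    if metadata_filter is not None:
        body["filter"] = metadata_filter
    return "GET", "/%s/%s/_percolate" % (INDEX, DOC_TYPE), body


if __name__ == "__main__":
    requests = [
        register_query(
            "q1",
            {"match": {"message": "bonsai tree"}},
            {"customer": "acme"},  # hypothetical metadata field
        ),
        percolate_request(
            {"message": "A new bonsai tree in the office"},
            {"term": {"customer": "acme"}},
        ),
    ]
    for method, path, body in requests:
        print(method, path)
        print(json.dumps(body, indent=2, sort_keys=True))
```

With the filter in place, only the queries whose metadata matches are run
against the document, so the per-document cost follows the filtered count
(the "up to 10k" case) rather than the full 100k.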


On Mon, Jun 16, 2014 at 06:19:23AM -0700, Luca Cavanna wrote:

Hi Maciej,
what you describe doesn't sound insane, just make sure you use proper filtering
as much as you can to limit the number of queries you execute when percolating
each document.
Also, with the percolator available since 1.0 you can scale out just by adding
more nodes and have the percolator queries distributed over multiple shards.
That means that if you were to reach the limit of a single shard you could
always scale out.

If I remember correctly from Martijn's presentation, each percolator
query is matched against the document sequentially. Are there
plans for using commonality between queries to do more efficient
matching, maybe using decision trees or somesuch?

--
Cheers,

ralphm


It is something we might look into, but there's no concrete plan for now.
On the other hand, metadata filtering (or eventually routing) lets you
reduce the number of queries that need to be run; the only problem is that
users need to do it themselves.
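
To illustrate why the metadata filter matters when queries are matched one
by one, here is a toy Python model. It is not Elasticsearch code and says
nothing about the real implementation: the StoredQuery structure, the
"customer" field and the all-terms-present matching rule are assumptions
made up for the example.

```python
from collections import namedtuple

# Hypothetical stand-in for a registered percolator query: an id, one
# metadata field, and a set of terms that must all occur in the document.
StoredQuery = namedtuple("StoredQuery", ["id", "customer", "terms"])


def percolate_toy(queries, text, customer=None):
    """Return (ids of matching queries, number of queries evaluated)."""
    words = set(text.lower().split())
    # Metadata filtering: drop queries that cannot be relevant before
    # doing any matching work.
    candidates = [
        q for q in queries if customer is None or q.customer == customer
    ]
    # Sequential matching: every remaining query is checked in turn.
    hits = [q.id for q in candidates if q.terms <= words]
    return hits, len(candidates)


if __name__ == "__main__":
    stored = [
        StoredQuery(1, "acme", frozenset({"bonsai"})),
        StoredQuery(2, "acme", frozenset({"oak", "tree"})),
        StoredQuery(3, "globex", frozenset({"bonsai", "tree"})),
    ]
    print(percolate_toy(stored, "A new bonsai tree"))
    print(percolate_toy(stored, "A new bonsai tree", customer="acme"))
```

The matching cost grows with the number of candidates, so whatever the
caller can rule out through metadata is work the percolator never has to do.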


Thanks for the reply. I did some early testing and I am getting results in
about 0.7-1.4s (that's without any filtering yet), which is still within
an acceptable range for me.
I'd still like to hear about people's experience with it. It seems this is
a very rarely used feature.
