Hello All,
I have been experimenting with the percolator to prototype a near-real-time
alerts app, and it works great. The queries I register are a combination of
term, geo-polygon, geo-distance, and range queries. Today I started
tinkering with bringing up several nodes (albeit on the same machine, in
one cluster), and the percolator queries are replicated on all the nodes.
Here are some questions I have; I would really appreciate some
advice/comments.
- In production I will have between 2.5 and 3 million registered queries on
one index. Each query document will have a combination of filters: terms,
geo-distance/geo-polygon, date range, and long/int range (approximately
5-15 filters per query).
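For concreteness, one registered query of the kind described above might look something like the sketch below. This assumes the pre-1.0 percolator API (queries indexed into the special `_percolator` index, e.g. `PUT /_percolator/<index_name>/<query_name>`); the field names (`category`, `location`, `created`, `price`) are hypothetical placeholders, not from my actual mapping:

```python
import json

# Sketch of one registered percolator query document combining the
# filter types mentioned above. All field names are hypothetical.
percolator_doc = {
    "query": {
        "filtered": {
            "query": {"match_all": {}},
            "filter": {
                "bool": {
                    "must": [
                        # term filter
                        {"term": {"category": "alerts"}},
                        # geo-distance filter
                        {"geo_distance": {
                            "distance": "25km",
                            "location": {"lat": 40.7, "lon": -74.0}}},
                        # date range filter
                        {"range": {"created": {"gte": "2014-01-01",
                                               "lte": "2014-12-31"}}},
                        # numeric (long/int) range filter
                        {"range": {"price": {"gte": 100, "lte": 500}}},
                    ]
                }
            },
        }
    }
}

print(json.dumps(percolator_doc, indent=2))
```

In practice each registered document would carry 5-15 such clauses rather than the four shown here.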
Since all the data is stored in the percolator index, when I percolate a
document for a specific index ( <index_name>//_percolate -d '{....}'), I am
assuming ES will automagically consider only the queries registered for
<index_name>? That would be one of the reasons to create an index to begin
with, correct?
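To make the call above concrete, the percolate request body I am sending is shaped roughly like this (a sketch; the `doc` wrapper is the documented percolate format, while the fields inside it are the same hypothetical ones as before):

```python
import json

# Sketch of the body sent to /<index_name>/<type>/_percolate: the
# candidate document is wrapped in a "doc" key, and (per my assumption
# above) only queries registered for <index_name> are evaluated.
percolate_body = {
    "doc": {
        "category": "alerts",                       # hypothetical fields
        "location": {"lat": 40.71, "lon": -74.0},   # matching the filters
        "created": "2014-06-15",                    # registered earlier
        "price": 250,
    }
}

print(json.dumps(percolate_body, indent=2))
```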
- In my limited observation, the same query set is replicated on all nodes
in the percolator index. Is there a way to shard the percolator index
across nodes? Ideally, behind the scenes, ES would percolate the document
across the different percolator index shards and merge the matches. If the
percolator index cannot be sharded, what are the alternatives? Can someone
share their experience dealing with a similar or higher volume of
percolator queries (including performance)?
- Given the above scenario, what kind of hardware resources can one expect
to need (ideally VMs, which is what I would have access to)?
Thank you in advance
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.