On Thu, Nov 8, 2012 at 1:22 AM, T Vinod Gupta firstname.lastname@example.org wrote:
I'm trying to explore how to increase my query speeds, and in light of
that I was going through all the routing-related discussions.
- Does routing always improve query speed? If I have 2 nodes (1 replica)
with 5 shards on an index and I use routing, will it have a big impact?
I guess there's no getting away from testing, since it will also
depend on what your queries look like, but I suspect not.
Won't it lead to shard hotspotting and become counterproductive? As in,
some set of documents that share the same routing key may have a high
indexing rate (and/or querying rate), which can have a negative impact?
You might run into this sort of issue if you choose an inappropriate
shard key. For example, the timestamp of a log message wouldn't be a
good candidate, since you'd always insert on the same shard, and
you'd most likely query only that shard as well.
But there are use cases where it helps. For example, if you have a
blogging site where users have a relatively balanced number of posts,
you can use the username as the routing value, which will make
filtering by user or searching posts of specific users really fast.
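
To make the username example concrete, here is a minimal sketch of the
general idea behind routing: hash the routing value, then take it modulo
the number of primary shards. The hash below (zlib.crc32) is only a
stand-in and is not the function Elasticsearch uses; the usernames, post
counts and the shard_for helper are made up for illustration.

```python
import zlib
from collections import Counter

NUM_PRIMARY_SHARDS = 5  # the 5-shard index from the question


def shard_for(routing_value: str, num_shards: int = NUM_PRIMARY_SHARDS) -> int:
    """Map a routing value to a shard number: hash, then modulo shard count.

    zlib.crc32 is a stand-in hash for illustration, not Elasticsearch's own.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards


# Hypothetical data: 1000 posts spread evenly over 20 users,
# with the username used as the routing value.
posts = [(post_id, f"user{post_id % 20}") for post_id in range(1000)]

# Count how many posts land on each shard.
docs_per_shard = Counter(shard_for(user) for _, user in posts)
print(docs_per_shard)

# Every post by one user maps to a single shard, so a search for that
# user's posts only needs to visit that shard.
shards_for_user7 = {shard_for(user) for _, user in posts if user == "user7"}
print(shards_for_user7)
```

The per-shard counts also show why balance matters: if one user had far
more posts than the others, that user's shard would carry the extra load.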
Take a look at this presentation, where you'll also get some
performance numbers on that particular scenario:
- if i have hundreds of thousands of existing documents that i want to add
a routing key to, then do i have to reindex them with a routing key?
Yes, as far as I know, you'll have to reindex.
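
As a rough sketch of what the reindexing pass could look like, the
helper below builds a bulk request body in which each action line
carries a routing value taken from the document itself. The index name,
field names and the bulk_lines helper are hypothetical, and the
"routing" metadata key follows recent Elasticsearch bulk documentation;
older releases may expect a different key, so check the docs for your
version. Sending the body to the cluster is left out.

```python
import json


def bulk_lines(docs, index_name, routing_field):
    """Build a newline-delimited bulk body with a routing value per document.

    docs is an iterable of (doc_id, source_dict) pairs. The "routing" key
    in the action line is an assumption based on recent Elasticsearch docs.
    """
    lines = []
    for doc_id, source in docs:
        action = {
            "index": {
                "_index": index_name,
                "_id": doc_id,
                "routing": source[routing_field],
            }
        }
        lines.append(json.dumps(action))  # action/metadata line
        lines.append(json.dumps(source))  # document source line
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


# Hypothetical existing documents read back from the old index.
existing_docs = [
    ("1", {"user": "alice", "title": "First post"}),
    ("2", {"user": "bob", "title": "Hello"}),
]

body = bulk_lines(existing_docs, "posts_routed", "user")
print(body)
```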
http://sematext.com/ -- ElasticSearch -- Solr -- Lucene