Breaking Change: Change single operation shard hashing to only use id, and not id and type


(Shay Banon) #1

Hi,

Just pushed a breaking change (sorry!). Thought long and hard about this
one, and there is a flag to revert to the old behavior. Here are the
details:

Currently, single operation hashing (index/delete/get) uses the type as
part of the hash that decides which shard to direct the operation to. It
makes more sense to use just the id, for several reasons.
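
To make the difference concrete, here is a minimal sketch of the two schemes. The CRC32 hash, the "#" separator, the shard count, and the function names are all assumptions for illustration; this is not the hash function elasticsearch actually uses.

```python
import zlib

NUM_SHARDS = 5  # hypothetical number of shards in the index


def _hash(text: str) -> int:
    # Stand-in hash (CRC32), chosen only because it is deterministic.
    return zlib.crc32(text.encode("utf-8"))


def shard_by_type_and_id(doc_type: str, doc_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Old behavior: both the type and the id feed the hash."""
    return _hash(doc_type + "#" + doc_id) % num_shards


def shard_by_id(doc_id: str, num_shards: int = NUM_SHARDS) -> int:
    """New behavior: only the id feeds the hash."""
    return _hash(doc_id) % num_shards
```

With the id-only scheme, two docs that share an id land on the same shard regardless of their type. With the old scheme, the type can send them to different shards.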

First, the new routing control capability allows a doc to be placed on the
same shard as another doc (a blog post and its comments, for example) based
only on that doc's id: the routing value used when indexing a comment can be
the blog post id, and that's it.
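
A small sketch of the blog/comment case, assuming a stand-in CRC32 hash and a hypothetical `shard_for` helper (neither is taken from elasticsearch itself). The routing value, when given, replaces the id as the hash input:

```python
import zlib
from typing import Optional

NUM_SHARDS = 5  # hypothetical number of shards


def shard_for(doc_id: str, routing: Optional[str] = None, num_shards: int = NUM_SHARDS) -> int:
    # If a routing value is supplied it is hashed instead of the doc id.
    key = routing if routing is not None else doc_id
    return zlib.crc32(key.encode("utf-8")) % num_shards


# The blog post is placed by its own id.
blog_shard = shard_for("post-42")

# The comment is indexed with the blog post id as its routing value,
# so it lands on the same shard as the post.
comment_shard = shard_for("comment-7", routing="post-42")
```

Because the type no longer takes part in the hash, the blog post id alone is enough to locate both docs.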

There are also future features that this type of hashing will really
simplify, so it makes sense to make this change now.

This change requires reindexing the data. To revert to using the type for
hashing as well, set cluster.routing.operation.use_type to true. This means
that an existing cluster can be started with this flag set, after which you
can start another cluster and reindex data from the old cluster into the new
cluster.
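
As a rough illustration of why reindexing is needed, the sketch below counts how many docs would be looked up on a different shard once the type is dropped from the hash. The CRC32 hash, the "#" separator, and the helper names are assumptions, not the real implementation.

```python
import zlib
from typing import Iterable, Tuple


def old_shard(doc_type: str, doc_id: str, num_shards: int) -> int:
    # Old scheme: type and id are hashed together (stand-in hash).
    return zlib.crc32((doc_type + "#" + doc_id).encode("utf-8")) % num_shards


def new_shard(doc_id: str, num_shards: int) -> int:
    # New scheme: only the id is hashed (stand-in hash).
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def moved_fraction(docs: Iterable[Tuple[str, str]], num_shards: int) -> float:
    """Fraction of (type, id) pairs whose shard differs between the schemes."""
    docs = list(docs)
    moved = sum(
        1 for doc_type, doc_id in docs
        if old_shard(doc_type, doc_id, num_shards) != new_shard(doc_id, num_shards)
    )
    return moved / len(docs) if docs else 0.0


sample = [("blog", str(i)) for i in range(1000)]
fraction = moved_fraction(sample, num_shards=5)
```

Any doc counted here was stored on one shard under the old scheme but would be looked up on another under the new one, which is why the old data has to be reindexed or served with the flag enabled.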

-shay.banon


(jminard) #2

So, we can custom route using the routing control capability to do whatever
we want...

Does that allow things like shard selection for a given query? I.e. route a
record to shards based on some sort of collection identifier, then on a
query remove from the list those shards that should not be included to
resolve the query (i.e. don't query things that wouldn't contain any docs
anyway, to improve performance, or as another example don't query things
that the user doesn't have rights to by collection).

View this message in context: http://elasticsearch-users.115913.n3.nabble.com/Breaking-Change-Change-single-operation-shard-hashing-to-only-use-id-and-not-id-and-type-tp1833986p1860842.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.


(Shay Banon) #3

Yes, you can pass a list of routing values when searching (comma separated).
Only the shards that match those routing values will be searched.
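
A minimal sketch of how a comma separated routing list could narrow a search to a subset of shards. The CRC32 hash and the `shards_for_routing` helper are hypothetical and only illustrate the idea:

```python
import zlib
from typing import Set


def shards_for_routing(routing_param: str, num_shards: int) -> Set[int]:
    """Map a comma separated list of routing values to the shards to search."""
    values = [v.strip() for v in routing_param.split(",") if v.strip()]
    # Each routing value hashes to one shard; duplicate shards collapse in the set.
    return {zlib.crc32(v.encode("utf-8")) % num_shards for v in values}


# Search only the shards that hold the two collections of interest.
targets = shards_for_routing("collection-a,collection-b", num_shards=5)
```

Shards outside `targets` are never queried, which covers both the performance case and the per-collection access case raised above.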

On Mon, Nov 8, 2010 at 6:14 AM, jminard jayson.minard@gmail.com wrote:

So, we can custom route using the routing control capability to do whatever
we want...

Does that allow things like shard selection for a given query? I.e. route a
record to shards based on some sort of collection identifier, then on a
query remove shards from the list for those that should not be included to
resolve the query (i.e. don't query things that wouldn't contain any docs
anyway to improve performance, or as another example don't query things that
user doesn't have rights to by collection)



(system) #4