Just pushed a breaking change (sorry!). Thought long and hard about this
one, and there is a flag to revert to the old behavior. Here are the
details:
Currently, single-operation hashing (index/delete/get) uses the type as
part of the hash that decides which shard to direct the operation to. It
makes more sense to use just the id, for several reasons.
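To make the difference concrete, here is a minimal sketch of the two hashing schemes. The function name `shard_for`, the shard count, and the use of CRC32 are assumptions for illustration only; the actual hash function used by the cluster is not stated in this thread.

```python
import zlib

def shard_for(doc_id, num_shards, doc_type=None):
    """Pick a shard from a hash of the id.

    If doc_type is given, it is mixed into the hash key, which mimics the
    old behavior. CRC32 is only a stand-in hash (an assumption).
    """
    key = doc_id if doc_type is None else doc_type + doc_id
    return zlib.crc32(key.encode("utf-8")) % num_shards

# New behavior: only the id matters, so the shard is the same for any type.
new_shard = shard_for("42", 5)

# Old behavior: the type is part of the key, so the same id may land on
# different shards depending on its type.
old_blog_shard = shard_for("42", 5, doc_type="blog")
old_comment_shard = shard_for("42", 5, doc_type="comment")
```

With id-only hashing, anything that knows the id can compute the target shard without knowing the type.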
First, the new routing control capability will allow a doc to be placed on
the same shard as another doc (a blog post and its comments, for example)
based only on that doc's id (the routing when indexing a comment can use the
blog post id, and that's it).
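As a rough sketch of the blog/comment case: if the routing value replaces the id as the hash key, a comment routed by its blog post's id ends up on the post's shard. The names `shard_for` and `NUM_SHARDS`, the ids, and the CRC32 hash are all hypothetical and not taken from the actual implementation.

```python
import zlib

NUM_SHARDS = 5  # hypothetical shard count

def shard_for(doc_id, routing=None):
    """Hash the routing value if one is given, otherwise the doc id."""
    key = routing if routing is not None else doc_id
    return zlib.crc32(key.encode("utf-8")) % NUM_SHARDS

# The blog post is placed by its own id.
blog_shard = shard_for("post-1")

# The comment is routed by the blog post id rather than its own id,
# so it lands on the same shard as the post.
comment_shard = shard_for("comment-7", routing="post-1")
```

Because the type no longer takes part in the hash, the comment's type has no effect on where it is placed.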
Second, there are future features that this type of hashing will really
simplify, so it makes sense to make this change now.
This change requires reindexing the data. To revert to using the type for
hashing as well, set cluster.routing.operation.use_type to true. This means
that a cluster holding the existing data can be started with this flag set;
you can then start another cluster and reindex data from the old cluster
into the new cluster.
So, we can custom route using the routing control capability to do whatever
we want...
Does that allow things like shard selection for a given query? I.e. route a
record to a shard based on some sort of collection identifier, then at query
time drop from the list the shards that should not be included in resolving
the query (e.g. to improve performance, don't query shards that wouldn't
contain any matching docs anyway; or, as another example, don't query
collections the user doesn't have rights to).
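Purely to illustrate what is being asked here (not a statement of what the routing feature supports), the idea could be sketched like this. The collection names, `NUM_SHARDS`, and both helper functions are hypothetical, and CRC32 is a stand-in hash.

```python
import zlib

NUM_SHARDS = 5  # hypothetical shard count

def shard_for(routing):
    """Map a routing value (here, a collection identifier) to a shard."""
    return zlib.crc32(routing.encode("utf-8")) % NUM_SHARDS

def shards_for_query(allowed_collections):
    """Return only the shards that can hold docs from the given collections.

    Every other shard is skipped, either because it cannot contain matching
    docs or because the user has no rights to the collections stored there.
    """
    return sorted({shard_for(c) for c in allowed_collections})

# Docs were indexed with routing set to their collection identifier, so a
# query limited to these two collections touches at most two shards.
targets = shards_for_query(["collection-a", "collection-b"])
```

One caveat with this sketch: several collections can hash to the same shard, so skipping shards narrows the search but does not replace a filter on the collection itself.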