Is it possible to find out (a) which logical shard a document is in, and
(b) which servers are holding that shard?
These are documents with parents, so setRouting() is used when indexing.
NB: I'm working in Java using the Java client.
Why?
I want to do distributed computation, where the processing for a document
happens on the server holding the data.
I'm using a parent-child routing scheme, which means that large blocks of
related data will end up in the same shard.
Shifting computation to be near data, rather than vice-versa, would make a
significant difference.
I'm also thinking that the servers holding a shard could help coordinate
the distribution of jobs.
Hi,
I'd have a look at the search shards API, which, given a search request,
returns the shards (and nodes) it would be executed on, without actually
executing it. See issue #2726:
https://github.com/elasticsearch/elasticsearch/issues/2726
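
For illustration, here is a minimal sketch of asking the search shards API where documents with a given routing value live, using the REST endpoint GET /{index}/_search_shards with a routing parameter and only the JDK's own HTTP client (Java 11+). The host, index name ("myindex") and routing value ("parent-1") are hypothetical placeholders, not taken from the question. Since the documents are indexed with setRouting(), the routing value passed here would presumably be the same one used at index time. The response is JSON describing the matching shards and the nodes holding them; parsing it is left out.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class SearchShardsLookup {

    // Build the URI for GET /{index}/_search_shards?routing={routing}.
    // The index name is assumed to be URL-safe; only the routing value is encoded.
    public static URI searchShardsUri(String baseUrl, String index, String routing) {
        String encodedRouting = URLEncoder.encode(routing, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/" + index + "/_search_shards?routing=" + encodedRouting);
    }

    // Send the request and return the raw JSON body (shards and nodes).
    public static String fetchSearchShards(String baseUrl, String index, String routing)
            throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(searchShardsUri(baseUrl, index, routing))
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical index and routing value, for illustration only.
        String index = "myindex";
        String routing = "parent-1";

        if (args.length == 0) {
            // Without a base URL, just show the request that would be sent.
            System.out.println(searchShardsUri("http://localhost:9200", index, routing));
        } else {
            // With a base URL (e.g. http://localhost:9200), query a running cluster.
            System.out.println(fetchSearchShards(args[0], index, routing));
        }
    }
}
```

Run without arguments, this only prints the request URI; pass a base URL as the first argument to query a running cluster. The node addresses in the response could then be used to decide where to schedule the work for that block of related documents.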
(In reply to Daniel Winterstein's message of Sunday, February 16, 2014 11:09:01 PM UTC+1, which appears above.)