Which node(s) hold a document?


(Daniel Winterstein) #1

Hello,

Is it possible to find out (a) which logical shard a document is in, and
(b) which servers are holding that shard?
These are documents with parents, so setRouting() is used when indexing.

NB: I'm working in Java using the Java client.

Why?
I want to do distributed computation, where the processing for a document
happens on the server holding the data.
I'm using a parent-child routing scheme, which means that large blocks of
related data will end up in the same shard.
Shifting computation to be near data, rather than vice-versa, would make a
significant difference.
Also, I'm thinking the holding servers can be used as part of coordinating
the distribution of jobs.

Thank you for any help,

  • Daniel

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/6ade77fb-3752-4db6-b678-e23c7118de08%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Luca Cavanna) #2

Hi,
I'd have a look at the search shards api, which given a search request
returns which shards (and nodes) it would get executed on, without actually
executing it. Have a look at issue #2726:
https://github.com/elasticsearch/elasticsearch/issues/2726
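
For illustration, here is a minimal sketch of how the search shards lookup might be called for a routed document. It goes through the REST endpoint (`_search_shards` with a `routing` parameter) using the JDK's own HTTP client rather than the Elasticsearch Java client mentioned in the question. The host `http://localhost:9200`, the index name `myindex`, and the routing value `parent-42` are placeholders, not values from this thread.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class SearchShardsLookup {

    // Builds the _search_shards URL for one index and one routing value.
    // For parent-child data, the routing value is the one passed to
    // setRouting() at index time (assumed here to be the parent id).
    static URI searchShardsUri(String baseUrl, String index, String routing) {
        String encoded = URLEncoder.encode(routing, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/" + index + "/_search_shards?routing=" + encoded);
    }

    // Sends the GET request and returns the raw JSON response body.
    static String fetchSearchShards(HttpClient http, URI uri) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder values; pass a cluster URL as the first argument
        // to actually send the request.
        String base = args.length > 0 ? args[0] : "http://localhost:9200";
        URI uri = searchShardsUri(base, "myindex", "parent-42");
        System.out.println("GET " + uri);
        if (args.length > 0) {
            System.out.println(fetchSearchShards(HttpClient.newHttpClient(), uri));
        }
    }
}
```

The JSON that comes back should list the nodes involved and, for the given routing value, the shard copies the search would go to, which covers both (a) and (b). Parsing it is left out here to keep the sketch free of extra dependencies.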



(Daniel Winterstein) #3

Thanks Luca -- that should do the job.

The cat shards endpoint also looks relevant:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cat-shards.html
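
As a rough sketch of how the cat shards output could be turned into a shard-to-node map: the code below assumes the endpoint was called with an explicit column list, e.g. `GET /_cat/shards/myindex?h=index,shard,prirep,state,node`, so that the column order is known. The class and method names, the index name, and the sample rows in the usage note are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class CatShardsParser {

    // One row of cat shards output: a single copy (primary or replica) of a shard.
    static final class ShardCopy {
        final String index;
        final int shard;
        final boolean primary;
        final String state;
        final String node; // null when the copy is not assigned to a node

        ShardCopy(String index, int shard, boolean primary, String state, String node) {
            this.index = index;
            this.shard = shard;
            this.primary = primary;
            this.state = state;
            this.node = node;
        }
    }

    // Parses whitespace-separated rows in the assumed column order
    // index, shard, prirep, state, node. The node column comes last
    // because node names may contain spaces.
    static List<ShardCopy> parse(String body) {
        List<ShardCopy> copies = new ArrayList<>();
        for (String line : body.split("\\R")) {
            String[] cols = line.trim().split("\\s+", 5);
            if (cols.length < 4) {
                continue; // blank or malformed row
            }
            String node = cols.length == 5 ? cols[4] : null;
            copies.add(new ShardCopy(cols[0], Integer.parseInt(cols[1]),
                    cols[2].equals("p"), cols[3], node));
        }
        return copies;
    }

    // Returns the names of the nodes holding a started copy of the given shard.
    static List<String> nodesForShard(List<ShardCopy> copies, String index, int shard) {
        List<String> nodes = new ArrayList<>();
        for (ShardCopy c : copies) {
            if (c.index.equals(index) && c.shard == shard
                    && c.state.equals("STARTED") && c.node != null) {
                nodes.add(c.node);
            }
        }
        return nodes;
    }
}
```

With made-up rows such as `myindex 0 p STARTED node-1` and `myindex 0 r STARTED node-2`, `nodesForShard(parse(body), "myindex", 0)` would give both node names. Combined with the shard number from the search shards call, that should be enough to pick a server for each job.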

Best regards,

  • Daniel


--
Dr Daniel Winterstein
Director

A: TechCube, Edinburgh, EH9 1PL
M: +44 (0)772 5172 612
http://winterwell.com http://sodash.com


