Which node(s) hold a document?

Daniel_Winterstein · February 16, 2014, 10:09pm

Hello,

Is it possible to find out (a) which logical shard a document is in, and
(b) which servers are holding that shard?
These are documents with parents, so setRouting() is used when indexing.

NB: I'm working in Java using the Java client.

Why?
I want to do distributed computation, where the processing for a document
happens on the server holding the data.
I'm using a parent-child routing scheme, which means that large blocks of
related data will end up in the same shard.
Shifting computation to be near the data, rather than vice versa, would make a
significant difference.
Also, I'm thinking the holding servers can be used as part of coordinating
the distribution of jobs.

Thank you for any help,

Daniel

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/6ade77fb-3752-4db6-b678-e23c7118de08%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

javanna · February 17, 2014, 8:43am

Hi,
I'd have a look at the search shards API, which, given a search request,
returns the shards (and nodes) it would be executed on, without actually
executing it. See issue #2726: https://github.com/elasticsearch/elasticsearch/issues/2726
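
As a rough illustration of the search shards API mentioned above, here is a minimal sketch that calls its REST form, `GET /{index}/_search_shards?routing=...`, using only the JDK's HTTP client (Java 11+) rather than the Elasticsearch Java client. The host `localhost:9200`, the index name `myindex`, the routing value `parent-42`, and the class and method names are all assumptions made up for the example; the routing value should be whatever was passed to setRouting() at index time.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any Elasticsearch library.
public class SearchShardsLookup {

    /** Builds the search shards URI for one index, restricted to one routing value. */
    public static URI searchShardsUri(String baseUrl, String index, String routing) {
        // Encode the routing value so that spaces or '&' cannot break the query string.
        String encodedRouting = URLEncoder.encode(routing, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/" + index + "/_search_shards?routing=" + encodedRouting);
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed values: adjust host, index and routing to your own cluster.
        URI uri = searchShardsUri("http://localhost:9200", "myindex", "parent-42");
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            // The JSON body describes the shard copies the request would hit
            // and the nodes currently holding them.
            System.out.println(response.body());
        } catch (IOException e) {
            System.err.println("Request to " + uri + " failed: " + e.getMessage());
        }
    }
}
```

Nothing is searched by this call; it only reports where a search with that routing value would run, which covers both (a) and (b) of the original question.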


Daniel_Winterstein · February 18, 2014, 9:50am

Thanks Luca -- that should do the job.

The cat shards endpoint also looks relevant:

Best regards,

Daniel


--
Dr Daniel Winterstein
Director

A: TechCube, Edinburgh, EH9 1PL
M: +44 (0)772 5172 612
http://winterwell.com http://sodash.com

