Questions regarding sharding and upcoming local gateway


(Diptamay) #1

Hi Shay

I have a few questions regarding sharding in ES:

  1. How is sharding handled in ES? Basically, how is a shard key
    determined and how are the documents evenly distributed across various
    shards?
  2. How do I specifically say which shard to replicate where? Let's say,
    for example, I have 3 nodes/boxes running ES, with 2 replicas and 3
    shards. So I want:
    a) Server A -> shard 1, shard 2
    b) Server B -> shard 2, shard 3
    c) Server C -> shard 1, shard 3
    How do I set up this configuration?
  3. When I add another node, say Server D, to the above, and change the
    number of shards to 4 and replicas to 3, basically saying this is what
    I want my new configuration to be:
    a) Server A -> shard 1, shard 2, shard 4
    b) Server B -> shard 2, shard 3, shard 1
    c) Server C -> shard 1, shard 3, shard 4
    d) Server D -> shard 2, shard 3, shard 4
    Would the data get re-balanced across shards or will it happen
    incrementally over time?

Regarding the gateway (this is more of a clarification): as of now,
with the shared fs gateway (NFS on unix boxes), I understand all
servers in the cluster write to the same shared folder on NFS. This
definitely makes backing up indexes easy. Now, with the new upcoming
local gateway, can I still configure the folders the shards are written
to? For example:
a) Server A -> shard 1 (folder /data/shardA1), shard 2 (folder /data/shardA2)
b) Server B -> shard 2 (folder /data/shardB2), shard 3 (folder /data/shardB3)
c) Server C -> shard 1 (folder /data/shardC1), shard 3 (folder /data/shardC3)
If I can, how would I set it up? If not, this would be highly
desirable, since I would be able to go back to a backed-up copy of a
shard in case that specific shard gets corrupted.

Thanks for your time.

Regards
Diptamay


(Shay Banon) #2

Hi,

On Thu, Sep 9, 2010 at 9:05 AM, diptamay diptamay@gmail.com wrote:

Hi Shay

I have a few questions regarding sharding in ES:

  1. How is sharding handled in ES? Basically, how is a shard key
    determined and how are the documents evenly distributed across various
    shards?

The document _id is hashed into a specific shard. Basically, it's a simple
hash(_id) MOD number_of_shards.
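To make the MOD scheme concrete, here is a small sketch. The `shard_for` helper and the use of md5 are illustrative assumptions; the concrete hash function ES uses internally is an implementation detail. The point is that routing is deterministic per id and a decent hash spreads ids evenly:

```python
# Sketch of id-based routing: each document id maps to exactly one shard,
# and a reasonable hash spreads ids roughly evenly across shards.
# (md5 is just an illustrative choice, not ES's internal hash.)
import hashlib

def shard_for(doc_id: str, number_of_shards: int) -> int:
    # Stable hash of the id, then modulo the shard count.
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % number_of_shards

counts = [0] * 3
for i in range(9000):
    counts[shard_for(f"doc-{i}", 3)] += 1
print(counts)  # roughly [3000, 3000, 3000]
```

Because the shard count is part of the formula, changing number_of_shards would re-route existing ids to different shards, which is why it is fixed for the life of an index.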

  2. How do I specifically say which shard to replicate where? Let's say,
    for example, I have 3 nodes/boxes running ES, with 2 replicas and 3
    shards. So I want:
    a) Server A -> shard 1, shard 2
    b) Server B -> shard 2, shard 3
    c) Server C -> shard 1, shard 3
    How do I set up this configuration?

Shards get automatically allocated to nodes. The allocation aims at
keeping an evenly balanced number of shards per node across the cluster.
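As a toy illustration of "evenly balanced" (a sketch only, not ES's actual allocator): place each copy of each shard on the least-loaded node that does not already hold a copy of that shard. With 3 shards and 1 replica across 3 nodes, this lands on a layout of the shape described in the question, with each shard on two of the three nodes:

```python
# Toy allocator sketch: every copy (primary + replicas) of every shard is
# assigned to the node currently holding the fewest copies, skipping nodes
# that already hold a copy of that shard. Illustrative only.
def allocate(num_shards: int, num_replicas: int, nodes: list[str]) -> dict[str, list[int]]:
    assignment = {node: [] for node in nodes}
    for shard in range(num_shards):
        for _copy in range(num_replicas + 1):
            # Only nodes that don't already hold this shard are eligible.
            candidates = [n for n in nodes if shard not in assignment[n]]
            target = min(candidates, key=lambda n: len(assignment[n]))
            assignment[target].append(shard)
    return assignment

print(allocate(3, 1, ["A", "B", "C"]))
# {'A': [0, 1], 'B': [0, 2], 'C': [1, 2]}
```

The constraint that a node never holds two copies of the same shard is the real invariant; which node gets which shard is up to the allocator, which is why a fixed, hand-picked layout cannot be requested.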

  3. When I add another node, say Server D, to the above, and change the
    number of shards to 4 and replicas to 3, basically saying this is what
    I want my new configuration to be:
    a) Server A -> shard 1, shard 2, shard 4
    b) Server B -> shard 2, shard 3, shard 1
    c) Server C -> shard 1, shard 3, shard 4
    d) Server D -> shard 2, shard 3, shard 4
    Would the data get re-balanced across shards or will it happen
    incrementally over time?

You can't change number_of_shards for an existing index. You can change
number_of_replicas though. Data, as in concrete documents, doesn't get
rebalanced; shards do. For example, if you create an index with 2 shards
and 1 replica and start a single node, 2 shards will be allocated to it.
If you start another node, 2 more shards (the replicas) will be allocated
to it, and adding more nodes after that will cause the shards to be
rebalanced across nodes. In the case mentioned, you cap at 4 nodes: if
you start a 5th node, nothing will happen. Note that indices can be
created dynamically, and number_of_shards is set at the index level.
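The arithmetic behind the "cap" is simple: the total number of shard copies is shards times (1 + replicas), and since a node holds at most one copy of a given shard, that total is also the number of nodes beyond which extra nodes sit idle for this index:

```python
# Total shard copies for an index, which is also the maximum number of
# nodes that can usefully hold a piece of it.
def total_copies(number_of_shards: int, number_of_replicas: int) -> int:
    return number_of_shards * (1 + number_of_replicas)

print(total_copies(2, 1))  # the 2-shard / 1-replica example: 4 copies, caps at 4 nodes
print(total_copies(4, 3))  # the 4-shard / 3-replica layout asked about: 16 copies
```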

Regarding the gateway (this is more of a clarification): as of now,
with the shared fs gateway (NFS on unix boxes), I understand all
servers in the cluster write to the same shared folder on NFS. This
definitely makes backing up indexes easy. Now, with the new upcoming
local gateway, can I still configure the folders the shards are written
to? For example:
a) Server A -> shard 1 (folder /data/shardA1), shard 2 (folder /data/shardA2)
b) Server B -> shard 2 (folder /data/shardB2), shard 3 (folder /data/shardB3)
c) Server C -> shard 1 (folder /data/shardC1), shard 3 (folder /data/shardC3)
If I can, how would I set it up? If not, this would be highly
desirable, since I would be able to go back to a backed-up copy of a
shard in case that specific shard gets corrupted.

The local gateway basically means that recovery will reuse what each node
has locally. Shards get allocated and deallocated from nodes on the fly as
nodes come and go, so a "static" layout is not possible. Also, the idea is
that you would have the "work" directory of elasticsearch on a local drive,
not NFS mounted, so it will work faster. With the local gateway, if you are
concerned that a shard will get corrupted, increase the number of replicas
instead of creating a backup.
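For reference, a minimal sketch of the node-level settings involved. The setting names (`gateway.type`, `path.data`) are as I recall them from the 0.x-era docs; treat them as assumptions to verify against the documentation for your version:

```yaml
# elasticsearch.yml (sketch; setting names from the 0.x era)
gateway:
  type: local               # recover from each node's own disk, not a shared gateway
path:
  data: /data/elasticsearch # a local drive, not an NFS mount
```

Note that the per-shard folder layout under `path.data` is managed by ES itself, which is why the per-shard folders in the question cannot be assigned by hand.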



(Diptamay) #3

Thanks for the explanations.

Had one last question for now. Is there a way to elect a master, or
rather to increase the chances of a particular node being the master?
The reason I am asking is, I have 4 servers A, B, C, D. Out of the 4,
A & C are twice as powerful as B & D. So when I start up, starting B
first makes it the master. I would ideally prefer A or C to be the
master if either of them is up and running, rather than it depending
on the sequence in which I started the servers.

Thanks
Diptamay



(Shay Banon) #4

This requirement stems from the assumption that the master requires more
computing power than other nodes, while that's not really the case. The
extra work the master needs to do is negligible. You can configure it so
that only specific nodes can become master (similar to data nodes), but
that's not what you are after.
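For completeness, the restriction mentioned above looks roughly like this in the node config. The `node.master` / `node.data` setting names are from the 0.x-era docs; verify them against the docs for your version:

```yaml
# elasticsearch.yml on a node that should never be elected master (sketch)
node:
  master: false   # this node still holds data and serves requests,
  data: true      # it just won't take part in master election
```

Setting `node.master: false` on B and D would guarantee that only A or C can become master, though per the answer above this is usually unnecessary.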



(Diptamay) #5

Thanks for the info.

-Diptamay

