Manual re-balancing of shards

Hi All,

I would like to handle the rebalancing of shards manually.
We are planning to deploy 5 ES nodes.
I would like to create 5 shards per index, with each node serving exactly
one shard of each index.
How can we manually distribute shards between nodes?

Regards,
Ankit

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

If you don't have replicas, shards will be balanced automatically and you will have one shard per node.
With replicas set to 1 (the default), you will have 2 shards per node.
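
The arithmetic behind those two numbers can be sketched in a few lines of shell. This is only an illustration that assumes the cluster ends up perfectly balanced; the function name `shards_per_node` is made up for this example and is not part of Elasticsearch.

```shell
#!/bin/sh
# Hypothetical helper: average number of shard copies per node,
# assuming a perfectly even spread across the cluster.
# usage: shards_per_node PRIMARIES REPLICAS NODES
shards_per_node() {
    primaries=$1
    replicas=$2
    nodes=$3
    # every primary shard has REPLICAS additional copies
    total=$(( primaries * (1 + replicas) ))
    # integer division; a remainder means some nodes hold one more shard
    echo $(( total / nodes ))
}

shards_per_node 5 0 5   # 5 primaries, no replicas, 5 nodes
shards_per_node 5 1 5   # 5 primaries, 1 replica each, 5 nodes
```

For the 5-node, 5-shard case discussed here, the two calls print 1 and 2 respectively.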

BTW, you may want to look at: Elasticsearch Platform — Find real-time answers at scale | Elastic

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs

Thanks, David.

Yes, we don't have any replicas.

Suppose we have 10 nodes and the number of shards per index is 5.
If I create multiple indices, how will the shards be allocated
between the nodes?

I would like the shards to be allocated as follows, so that the load
is distributed equally between the nodes:

Shards of Index1 are allocated to node1 through node5.

Shards of Index2 are allocated to node6 through node10.

Shards of Index3 are allocated to node1 through node5.

Shards of Index4 are allocated to node6 through node10.

Regards,
Ankit Jain

So, define a tag on each node, for example:
node1: node.tag: node1
node2: node.tag: node2
...
node10: node.tag: node10

Then, for index1, set the allocation filter like this (the _settings endpoint updates an index that already exists):

curl -XPUT localhost:9200/index1/_settings -d '{
  "index.routing.allocation.include.tag" : "node1,node2,node3,node4,node5"
}'
See Elasticsearch Platform — Find real-time answers at scale | Elastic
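
For the layout asked about (odd-numbered indices on node1 to node5, even-numbered ones on node6 to node10), the request bodies could be generated with a small shell helper. This is a sketch only: it assumes the `node.tag` values listed above, and the function names `tags_for_index` and `allocation_body` are invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: pick the tag list for index number N.
# Odd index numbers go to node1..node5, even ones to node6..node10.
tags_for_index() {
    if [ $(( $1 % 2 )) -eq 1 ]; then
        echo "node1,node2,node3,node4,node5"
    else
        echo "node6,node7,node8,node9,node10"
    fi
}

# Build the JSON body for the index settings request.
allocation_body() {
    printf '{"index.routing.allocation.include.tag":"%s"}\n' "$(tags_for_index "$1")"
}

# Print the body for index1..index4; to apply one, pass it to curl, e.g.
#   curl -XPUT "localhost:9200/index$n/_settings" -d "$(allocation_body "$n")"
for n in 1 2 3 4; do
    echo "index$n: $(allocation_body "$n")"
done
```

Printing the bodies first, rather than calling curl directly, makes it easy to review what would be sent before touching a real cluster.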

That said, by default, Elasticsearch will do its best to distribute things equally.

Make sense?

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs

Yes, that makes sense to me. Thanks, David.

With 5 ES nodes and 5 shards per index, ES
will automatically distribute the shards as follows:
Index1: one shard on each node.
Index2: one shard on each node.
Index3: one shard on each node.

Suppose we add a new node (node6): what impact will
rebalancing have on the cluster? After rebalancing, is it possible that
multiple shards of one index will be served by one node?

Regards,
Ankit Jain

If you have 5 primary shards and no replicas, adding a 6th node won't do anything.

For all these questions, here is what I did when I started to play with ES.
It's easy to start several ES nodes, even on the same physical box.

So, download and install Elasticsearch.

Install the head plugin:
bin/plugin -install mobz/elasticsearch-head

Start five nodes:
bin/elasticsearch -f -D es.node.name=Node1
bin/elasticsearch -f -D es.node.name=Node2
bin/elasticsearch -f -D es.node.name=Node3
bin/elasticsearch -f -D es.node.name=Node4
bin/elasticsearch -f -D es.node.name=Node5

Create your index:
curl -XPUT 'http://localhost:9200/my_index/' -d '{
  "settings" : {
    "index" : {
      "number_of_shards" : 5,
      "number_of_replicas" : 0
    }
  }
}'

Open head: http://localhost:9200/_plugin/head/

See how shards are balanced.
Start a new node:
bin/elasticsearch -f -D es.node.name=Node6

Refresh head and see what happens.

Update the number of replicas:
curl -XPUT 'localhost:9200/my_index/_settings' -d '
{
  "index" : {
    "number_of_replicas" : 1
  }
}'

See in head how everything is balanced.
Kill one node…

In short: just play with it ;-)

Does it help?

David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs

Thanks for the reply.

As you suggested in your last mail, I ran 5 ES nodes and then
created one index; one shard was allocated to each node. After this, I
started a 6th node, and adding it had no impact.

I tried the same scenario with multiple indices (with only 5 nodes
running). This time, when I started the 6th node, some shards were
automatically moved to the 6th node, and it ended up serving more than
one shard of one index.

Conclusion: after rebalancing, it is possible for multiple shards of
one index to be served by one node.

Please correct me if I am missing something.

Regards,
Ankit Jain

Yes. It's possible. But to me, it's not an issue.
BTW, Elasticsearch will ensure that replicas and primary are not on the same node for a given shard.
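
If it does matter for a given cluster, the index-level setting `index.routing.allocation.total_shards_per_node` caps how many shards of that index (primaries and replicas counted together) a single node may hold. A hedged sketch follows; the helper name `shard_cap_body` is only for illustration, and a cap that is too tight can leave shards unassigned when a node goes down.

```shell
#!/bin/sh
# Hypothetical helper: JSON body limiting how many shards of one index
# a single node may hold (primaries and replicas counted together).
shard_cap_body() {
    printf '{"index.routing.allocation.total_shards_per_node":%d}\n' "$1"
}

# Show the body; to apply it to an existing index, pass it to curl, e.g.
#   curl -XPUT 'localhost:9200/my_index/_settings' -d "$(shard_cap_body 1)"
shard_cap_body 1
```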

I'm wondering if you are trying to solve an issue you don't have yet but suppose you will have.
If so, what is your concern?

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs

Hi David,
Thanks for the reply.

I am planning to deploy around 10 ES nodes with replica=0. I would like
to create one index per hour, because we are receiving millions of records per
hour. Also, I would like to set the number of shards to 10 for each index, to
reduce disk I/O during searches.

If we add new nodes in the future, the shards will be rebalanced,
which could increase disk I/O (multiple shards of one index on the same
node) and hurt query performance.

Regards,
Ankit

On Thursday, 4 July 2013 15:09:00 UTC+5:30, David Pilato wrote:

Yes. It's possible. But to me, it's not an issue.
BTW, Elasticsearch will ensure that replicas and primary are not on the
same node for a given shard.

I'm wondering if you are trying to solve an issue you don't have but you
are supposing that you will have.
If so, what is your concern?

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfrhttps://twitter.com/elasticsearchfr
| @scrutmydocs https://twitter.com/scrutmydocs

Le 4 juil. 2013 à 10:39, Ankit Jain <ankitj...@gmail.com <javascript:>> a
écrit :

Thanks for the reply.

As you have mentioned in last mail, I ran the 5 nodes of ES and then
created one index, then one shard will allocate to each node. After this, I
have started 6th node, and adding a 6th node has no impact.

I tried same scenarios by creating multiple indices (only 5 nodes are
running). After this, when i have started 6th node, then some shards
automatically allocate to 6th node and also 6th node is serving more than
one shard of one index.

Conclusion:- After rebalancing, it is possible that the multiple shards of
one index will serve by one node.

Please correct me, if I am missing something.

Regards,
Ankit Jain

On Thursday, 4 July 2013 13:36:22 UTC+5:30, David Pilato wrote:

If you have 5 primary shards, no replica, adding a 6th node won't do
anything.

For all that questions, here is what I did in the past when I started to
play with ES:
It's so easy to start nodes with ES even on the same physical box.

So, download and install elasticsearch

Install head plugin
bin/plugin -install mobz/elasticsearch-head

start five nodes:
bin/elasticsearch -f -D es.node.name=Node1
bin/elasticsearch -f -D es.node.name=Node2
bin/elasticsearch -f -D es.node.name=Node3
bin/elasticsearch -f -D es.node.name=Node4
bin/elasticsearch -f -D es.node.name=Node5

Create your index:
curl -XPUT 'http://localhost:9200/my_index/' -d '{
"settings" : {
"index" : {
"number_of_shards" : 5,
"number_of_replicas" : 0
}
}
}'

Open head: http://localhost:9200/_plugin/head/

See how shards are balanced.
Start a new node:
bin/elasticsearch -f -D es.node.name=Node6

Refresh head and see how it goes

Update replicas
curl -XPUT 'localhost:9200/my_index/_settings' -d '
{
"index" : {
"number_of_replicas" : 1
}
}'

See in head how everything is balanced.
Kill one node…

In shorter terms: just play with it :wink:

Does it help?

David Pilato | Technical Advocate | *Elasticsearch.comhttp://elasticsearch.com/
*
@dadoonet https://twitter.com/dadoonet | @elasticsearchfrhttps://twitter.com/elasticsearchfr
| @scrutmydocs https://twitter.com/scrutmydocs

Le 4 juil. 2013 à 09:43, Ankit Jain ankitj...@gmail.com a écrit :

Yes, it make sense for me. Thanks David.


I see. Thanks for the clarification.

What I would do is:

  • Test how many docs a single shard can hold before response time becomes unacceptable for your use case.
  • If IO is your concern, buy/rent instances with SSD drives.
  • Then test how many shards a single box can host before response time becomes unacceptable.

Then, decide about the number of nodes, shards...
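
A minimal sketch of how such a test could be timed from the command line. The match_all query, the number of runs, and the helper names time_query and mean_seconds are illustrative assumptions, not something from this thread; use queries that look like your real workload.

```shell
#!/bin/sh
# Sketch only: time a search request several times and average the result.
# Assumed for illustration: the query body and both helper names.

# Print the total time, in seconds, of one search request.
# $1 is the search URL, e.g. http://localhost:9200/my_index/_search
time_query() {
  curl -s -o /dev/null -w '%{time_total}\n' \
    -H 'Content-Type: application/json' \
    -XPOST "$1" -d '{"query": {"match_all": {}}}'
}

# Read one number per line on stdin and print the mean with 3 decimals.
mean_seconds() {
  awk '{ sum += $1; n++ } END { if (n > 0) printf "%.3f\n", sum / n }'
}

# Usage against a running cluster (left commented out so nothing runs by accident):
# for i in 1 2 3 4 5; do
#   time_query http://localhost:9200/my_index/_search
# done | mean_seconds
```

Repeat the run as the shard grows, and again as more shards are placed on the same box, and note where the mean crosses what you can accept.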

Why no replicas? Will you have many requests per second? If so, I would add replicas.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Hi David,

Thanks for the reply.

Each index holds 40 million records, and each record is 10 KB.

We are planning to deploy 10 ES nodes. Can you recommend the optimal
number of shards per index (5 or 10)?
Or should we go with 20 ES nodes and create 20 shards per index?
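
For reference, the raw data volume behind these numbers can be worked out as below. This is only a rough sketch: it uses decimal units and counts raw record size, so the real on-disk index size (an assumption left out here) will differ. The helper name gb_per_shard is made up for the example.

```shell
#!/bin/sh
# Raw data per index = records * record size (index overhead and compression ignored).
records=40000000   # 40 million records per index
record_kb=10       # 10 KB per record

# Total raw size in GB, using decimal units (1 GB = 1,000,000 KB).
total_gb=$(( records * record_kb / 1000000 ))

# Raw GB held by each shard for a given number of shards ($1).
gb_per_shard() {
  echo $(( total_gb / $1 ))
}

echo "raw data per index: ${total_gb} GB"
for shards in 5 10 20; do
  echo "${shards} shards -> $(gb_per_shard "$shards") GB per shard"
done
```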

Please share your view.

Regards,
Ankit Jain

As I said, I don't know. The only realistic way to get those numbers is by testing!
It depends on many factors. One of them is how long a query is allowed to take.
That depends on your use case: if you need an autocompletion feature, you are probably after a response time under 100 ms.

A common practice is to set the number of shards per node equal to the number of cores.
So if you have 8 cores, you can set 8 shards per node. That means that with your 10 nodes, you can have 80 shards. With replica=1, each primary shard has one replica, so those 80 shards are 40 primaries plus 40 replicas: you can define one index with 40 shards or 2 indices with 20 shards each…
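
The same arithmetic as a small sketch, in case you want to try other node or core counts. The function names are made up for the example, and the one-shard-per-core rule is only the rule of thumb mentioned above, not a hard limit.

```shell
#!/bin/sh
# Shard budget under the "one shard per core" rule of thumb.

# Total shard copies (primaries + replicas) the cluster can host.
# $1 = number of nodes, $2 = cores per node
total_shards() {
  echo $(( $1 * $2 ))
}

# Primary shards left once every primary also carries its replica copies.
# $1 = number of nodes, $2 = cores per node, $3 = number_of_replicas
primary_budget() {
  echo $(( $1 * $2 / (1 + $3) ))
}

echo "total shard copies: $(total_shards 10 8)"
echo "primaries with replica=0: $(primary_budget 10 8 0)"
echo "primaries with replica=1: $(primary_budget 10 8 1)"
```

With replica=1 the 40 primaries can be one index with 40 shards or two indices with 20 shards each.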

My 2 cents

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs

Thanks David :slight_smile:

On Thursday, 4 July 2013 17:41:46 UTC+5:30, David Pilato wrote:

As I said, I don't know. The only realistic way to get that numbers is by
testing!
It depends on many factors. One them is how long does a query must take?
It depends on your use case. I mean that if you need autocompletion
feature, you are probably after less than 100ms response time.

A common practice is to set the number of shards per node equals to the
number of cores.
So if you have 8 cores, you can set 8 shards per node. That means that
with your 10 nodes, you can have 80 shards. If you have replica=1, that
means you can define one index with 40 shards or 2 index with 20 shards
each…

My 2 cents

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfrhttps://twitter.com/elasticsearchfr
| @scrutmydocs https://twitter.com/scrutmydocs

Le 4 juil. 2013 à 13:44, Ankit Jain <ankitj...@gmail.com <javascript:>> a
écrit :

Hi David,

Thanks for the reply.

The number of records in one index is 40 millions and size of each record
is 10 KB.

We are planning to deploy 10 nodes of ES. Can you recommend the optimal
size of shards (5 or 10)?
Or
We will go though 20 ES nodes and create 20 shards per index.

Please suggest your view on the same.

Regards,
Ankit Jain

On Thursday, 4 July 2013 15:38:47 UTC+5:30, David Pilato wrote:

I see. Thanks for the clarification.

What I would do is:

  • test how many docs you can have in one single shard without having bad
    response time for your use case.
  • if IO is your concern, buy/rent instances with SSD drives
  • test how many shards you can then have on a single box until time
    response become unacceptable for you

Then, decide about the number of nodes, shards...

Why no replica? I mean "will you have many requests / s?" In that case, I
will add replicas.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Le 4 juil. 2013 à 11:53, Ankit Jain ankitj...@gmail.com a écrit :

Hi David,
Thanks for reply.

I am planning to deploy around 10 nodes of ES with replica=0. I would
like to create index on hour basis because we are getting millions records
per hour. Also, I would like to set the number of shards 10 for each index
for reducing the disk IO during searching.

Suppose, If we add new nodes in future, then it would rebalance the
shards, means It would increase the disk IO (multiple shards of one index
on same node) and can have impact on query performance.

Regards,
Ankit

On Thursday, 4 July 2013 15:09:00 UTC+5:30, David Pilato wrote:

Yes. It's possible. But to me, it's not an issue.
BTW, Elasticsearch will ensure that replicas and primary are not on the
same node for a given shard.

I'm wondering if you are trying to solve an issue you don't have but you
are supposing that you will have.
If so, what is your concern?

--
David Pilato | Technical Advocate | *Elasticsearch.comhttp://elasticsearch.com/
*
@dadoonet https://twitter.com/dadoonet | @elasticsearchfrhttps://twitter.com/elasticsearchfr
| @scrutmydocs https://twitter.com/scrutmydocs

Le 4 juil. 2013 à 10:39, Ankit Jain ankitj...@gmail.com a écrit :

Thanks for the reply.

As you have mentioned in last mail, I ran the 5 nodes of ES and then
created one index, then one shard will allocate to each node. After this, I
have started 6th node, and adding a 6th node has no impact.

I tried same scenarios by creating multiple indices (only 5 nodes are
running). After this, when i have started 6th node, then some shards
automatically allocate to 6th node and also 6th node is serving more than
one shard of one index.

Conclusion:- After rebalancing, it is possible that the multiple shards
of one index will serve by one node.

Please correct me, if I am missing something.

Regards,
Ankit Jain

On Thursday, 4 July 2013 13:36:22 UTC+5:30, David Pilato wrote:

If you have 5 primary shards, no replica, adding a 6th node won't do
anything.

For all that questions, here is what I did in the past when I started
to play with ES:
It's so easy to start nodes with ES even on the same physical box.

So, download and install elasticsearch

Install head plugin
bin/plugin -install mobz/elasticsearch-head

Start five nodes:
bin/elasticsearch -f -D es.node.name=Node1
bin/elasticsearch -f -D es.node.name=Node2
bin/elasticsearch -f -D es.node.name=Node3
bin/elasticsearch -f -D es.node.name=Node4
bin/elasticsearch -f -D es.node.name=Node5

Create your index:
curl -XPUT 'http://localhost:9200/my_index/' -d '{
  "settings" : {
    "index" : {
      "number_of_shards" : 5,
      "number_of_replicas" : 0
    }
  }
}'

Open head: http://localhost:9200/_plugin/head/

See how shards are balanced.
Start a new node:
bin/elasticsearch -f -D es.node.name=Node6

Refresh head and see how it goes

Update replicas
curl -XPUT 'localhost:9200/my_index/_settings' -d '
{
  "index" : {
    "number_of_replicas" : 1
  }
}'

See in head how everything is balanced.
Kill one node…

In short: just play with it ;-)
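
The visual check in head can also be done on data. The sketch below assumes you already have the shard placements as (index, shard number, node name) tuples; how you obtain them (for example from the cluster state) is not shown, and the function name is hypothetical. It reports every node that holds more than one shard of the same index, which is the situation asked about in this thread.

```python
from collections import Counter


def crowded_nodes(placements):
    """Return {(index, node): count} for each node holding 2+ shards of one index."""
    counts = Counter((index, node) for index, _shard, node in placements)
    return {key: n for key, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Made-up placements for illustration only.
    placements = [
        ("my_index", 0, "Node1"),
        ("my_index", 1, "Node2"),
        ("my_index", 2, "Node2"),  # second shard of my_index on Node2
        ("my_index", 3, "Node4"),
        ("my_index", 4, "Node5"),
    ]
    print(crowded_nodes(placements))  # prints {('my_index', 'Node2'): 2}
```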

Does it help?

David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs

On 4 Jul 2013, at 09:43, Ankit Jain ankitj...@gmail.com wrote:

Yes, it makes sense to me. Thanks, David.

The number of ES nodes is 5 and the number of shards per index is also 5.
Then ES will automatically distribute the shards as follows:
Index1: one shard on each node.
Index2: one shard on each node.
Index3: one shard on each node.

Suppose we add a new node (node6): what will be the impact of rebalancing
on the cluster? After rebalancing, is it possible that multiple shards of
one index will be served by one node?

Regards,
Ankit Jain

On Thursday, 4 July 2013 12:52:47 UTC+5:30, David Pilato wrote:

So, define a tag on each node, for example:
node1: node.tag: node1
node2: node.tag: node2
...
node10: node.tag: node10

Then, once index1 is created, set its allocation filter like this:

curl -XPUT localhost:9200/index1/_settings -d '{
  "index.routing.allocation.include.tag" : "node1,node2,node3,node4,node5"
}'
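
The alternating layout asked for in the quoted message (Index1 and Index3 on node1 to node5, Index2 and Index4 on node6 to node10) could be scripted roughly as below. This is a sketch under assumptions: the group size, the nodeN tag values, and the helper names are illustrative and follow the example above, and the node.tag attribute must already be set in each node's configuration. The resulting body can be sent to an existing index's _settings endpoint as in the command above.

```python
import json


def node_groups(node_count=10, group_size=5):
    """Split the tags node1..nodeN into consecutive groups of group_size."""
    tags = ["node%d" % i for i in range(1, node_count + 1)]
    return [tags[i:i + group_size] for i in range(0, node_count, group_size)]


def allocation_body(index_number, groups):
    """Settings body pinning the index_number-th index (1-based) to one group.

    Groups are assigned round-robin: index 1 -> group 0, index 2 -> group 1, ...
    """
    group = groups[(index_number - 1) % len(groups)]
    return json.dumps({"index.routing.allocation.include.tag": ",".join(group)})


if __name__ == "__main__":
    groups = node_groups()
    for n in range(1, 5):
        print("Index%d" % n, allocation_body(n, groups))
```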

See
Elasticsearch Platform — Find real-time answers at scale | Elastic

That said, by default, elasticsearch will try to do its best to
distribute things equally.

Make sense?

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs

On 4 Jul 2013, at 09:18, Ankit Jain ankitj...@gmail.com wrote:

Thanks, David.

Yes, we don't have any replicas.

Suppose we have 10 nodes and the number of shards per index is 5.
I would like to create multiple indices; how will the shards then be
allocated between nodes?

I would like the shards to be allocated as follows, so that the load is
distributed equally between nodes.

Shards of Index1 will be allocated to node1 through node5.

Shards of Index2 will be allocated to node6 through node10.

Shards of Index3 will be allocated to node1 through node5.

Shards of Index4 will be allocated to node6 through node10.

Regards,
Ankit Jain

On Thursday, 4 July 2013 12:11:18 UTC+5:30, David Pilato wrote:

If you don't have replicas, shards will be balanced automatically and
you will have one shard per node.
With replicas set to 1 (the default), you will have 2 shards per node.

BTW, you may want to look at:
Elasticsearch Platform — Find real-time answers at scale | Elastic

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs

On 4 Jul 2013, at 08:29, Ankit Jain ankitj...@gmail.com wrote:

Hi All,

I would like to handle rebalancing of shards manually.
We are planning to deploy 5 ES nodes.
I would like to create 5 shards per index, with each node serving
exactly one shard of each index.
How can we manually distribute shards between nodes?

Regards,
Ankit
