How many nodes/shards do I need for an Elasticsearch cluster?


(Juan Díaz González) #1

I was reading about how many nodes an Elasticsearch cluster needs, and the recommendation is to have one node per shard, so that queries can be executed in parallel across all the nodes, according to this link.

But my question is this: I have, for example, a Logstash server running every day, so I get a new index each day. The number of nodes stays the same while the number of indices keeps growing, so I will end up with more shards per node. Is the important thing that each node holds one shard per index, with the same node also allowed to hold shards from other indices?

Currently I have these indices, and one more is added per day:

health status index               pri rep docs.count docs.deleted store.size pri.store.size
green  open   logstash-2015.07.23   5   1          6            0     86.5kb         43.2kb
green  open   logstash-2015.07.22   5   1          4            0     56.4kb         28.2kb
green  open   logstash-2015.07.24   5   1          1            0     16.1kb            8kb
green  open   .marvel-kibana        5   1          1            1     10.4kb          5.2kb
green  open   logstash-2015.07.13   5   1          3            0     44.6kb         22.3kb
green  open   logstash-2015.07.20   5   1          2            0     29.6kb         14.8kb
green  open   logstash-2015.07.14   5   1         23            0    208.4kb        104.2kb
green  open   logstash-2015.07.21   5   1         15            0    190.5kb         95.2kb
green  open   logstash-2015.07.27   5   1          8            0    114.9kb         57.4kb
green  open   logstash-2015.07.17   5   1          6            0     87.9kb         43.9kb
green  open   company               5   1     221613            4     40.1mb           20mb
green  open   logstash-2015.07.15   5   1          3            0     44.5kb         22.2kb
green  open   logstash-2015.07.16   5   1          6            0     86.4kb         43.2kb
green  open   company_history       5   1        124            0    103.7kb         49.5kb
green  open   logstash-2015.07.28   5   1          2            0     29.6kb         14.8kb
green  open   .kibana               5   1          1            0      9.1kb          4.5kb

And these nodes:

shards disk.used disk.avail disk.total disk.percent host    ip         node
    39     3.2gb     16.8gb       20gb           16 bc10-05 10.8.5.15  Anomaloco
    39     6.4gb     80.8gb     87.3gb            7 bc10-03 10.8.5.13  Algrim the Strong
     0        0b                                    l8a     10.8.0.231 logstash-l8a-5920-4018
    38     6.4gb     80.8gb     87.3gb            7 bc10-03 10.8.5.13  Harry Leland
    38     3.2gb     16.8gb       20gb           16 bc10-05 10.8.5.15  Pathway
    38     3.2gb     16.8gb       20gb           16 bc10-05 10.8.5.15  Hypnotia
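
As a rough sketch of how the per-node count grows as daily indices pile up, the arithmetic can be written as a small shell function. The name shards_per_node and the example figures (30 daily indices, 5 data nodes) are made up for illustration; 5 primaries and 1 replica match the pri and rep columns in the index listing. It assumes shards are spread perfectly evenly, which a real cluster only approximates.

```shell
# Hypothetical helper: average shards per data node, rounded up.
# Arguments: number of indices, primaries per index,
#            replicas per primary, number of data nodes.
shards_per_node() {
  indices="$1"; primaries="$2"; replicas="$3"; nodes="$4"
  # Each primary has `replicas` extra copies, so one index contributes
  # primaries * (replicas + 1) shards in total.
  total=$(( indices * primaries * (replicas + 1) ))
  # Integer ceiling division.
  echo $(( (total + nodes - 1) / nodes ))
}

shards_per_node 30 5 1 5   # prints 60
```

With the node count fixed, the result grows linearly with the number of indices.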


(Mark Walkom) #2

What you have is fine. Having one shard per index on each node is ideal, but not necessary.

You just need to keep an eye on things and remove data/add nodes/add heap as resources get to capacity.
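
One way to keep an eye on things is to filter the `_cat/allocation` output for nodes whose disk usage passes a limit. The following is only a sketch: the function name nodes_over_disk and the threshold of 15 are hypothetical, and it assumes the column order shown in the node listing earlier in the thread (shards, disk.used, disk.avail, disk.total, disk.percent, host, ip, node).

```shell
# Read `_cat/allocation` rows on stdin and print the nodes whose
# disk.percent is at or above the threshold given as the first argument.
nodes_over_disk() {
  threshold="$1"
  awk -v t="$threshold" '
    # Data-node rows carry a numeric disk.percent in column 5; the header
    # row and client nodes without disk figures do not, so they are skipped.
    $5 ~ /^[0-9]+$/ && $5 + 0 >= t + 0 {
      # Node names may contain spaces, so join columns 8..NF.
      name = $8
      for (i = 9; i <= NF; i++) name = name " " $i
      printf "%s shards=%s disk=%s%%\n", name, $1, $5
    }'
}

# Against a live cluster (assumes Elasticsearch on localhost:9200):
#   curl -s 'localhost:9200/_cat/allocation' | nodes_over_disk 15
```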


(Juan Díaz González) #3

OK, another question: when I add a new node, for example, I get unassigned shards, even though I have this parameter set in the Elasticsearch configuration (elasticsearch.yml):

index.routing.allocation.disable_allocation: false

After that I ran a script to reallocate all the unassigned shards:

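#!/bin/bash
# Force-allocate every UNASSIGNED shard onto a single node.
NODE="Anomaloco"

# Each _cat/shards row starts with the index name and the shard number.
curl -s 'localhost:9200/_cat/shards' | grep -F UNASSIGNED |
while read -r INDEX SHARD _; do
  # allow_primary lets a primary be allocated even when no copy of its
  # data is available, which might result in data loss.
  curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
    "commands": [
      {
        "allocate": {
          "index": "'"$INDEX"'",
          "shard": '"$SHARD"',
          "node": "'"$NODE"'",
          "allow_primary": true
        }
      }
    ]
  }'
done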

This way all the shards are allocated again. My question is whether this could cause problems in the future, because I want to use a parent/child structure, and the parent and its children need to be in the same shard for routing by parent or child to work correctly.

Thanks for your quick reply.
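
For context on the routing concern: which shard a document belongs to is decided by its routing value and the index's number of primary shards, not by the node a shard currently sits on, and a child document is routed by its parent's ID. A toy sketch of that idea follows. The function name toy_shard is hypothetical and cksum stands in for Elasticsearch's internal hash, so the shard numbers will not match a real cluster.

```shell
# Toy illustration only: map a routing value to a shard number as
# hash(routing) % number_of_primary_shards, with cksum as a stand-in hash.
toy_shard() {
  routing="$1"; primaries="$2"
  hash=$(printf '%s' "$routing" | cksum | cut -d' ' -f1)
  echo $(( hash % primaries ))
}

# A child routed by its parent's ID gets the same shard number as the
# parent, whichever node that shard is allocated to.
parent_shard=$(toy_shard "parent-1" 5)
child_shard=$(toy_shard "parent-1" 5)   # child indexed with routing = parent ID
echo "parent shard: $parent_shard, child shard: $child_shard"
```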


(Mark Walkom) #4

Why would you do this!

Let ES handle things, otherwise you are just asking for trouble.
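
In that spirit, a read-only check can show which shards are still unassigned without forcing anything. This is a sketch: the function name list_unassigned is hypothetical, and it assumes the default `_cat/shards` column order (index, shard, prirep, state, ...) and a node listening on localhost:9200.

```shell
# Print "index shard prirep" for every shard whose state is UNASSIGNED.
# Reads `_cat/shards` rows on stdin; column 4 is assumed to be the state.
list_unassigned() {
  awk '$4 == "UNASSIGNED" { print $1, $2, $3 }'
}

# Against a live cluster (assumes Elasticsearch on localhost:9200):
#   curl -s 'localhost:9200/_cat/shards' | list_unassigned
#   curl -s 'localhost:9200/_cluster/health?pretty'
```

Nothing here changes allocation; it only reports what the cluster has not placed yet.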

