Running out of disk space in ES cluster

(IT2) #1

Hello guys!
I have the following cluster: 1 master and 3 data nodes. Right now it looks like this:
As you can see, disk space is running out, so I want to add some new nodes to the cluster.
I set up a fourth node the same way as the previous ones, but I don't see it joining the cluster.
There are two differences: on the new node I installed ES 1.7, and swap was disabled while I was installing the system. The existing cluster runs ES 1.5, and swap was disabled there after the system was installed.
Why is my node not connecting? In the log I see:

[2015-09-04 12:33:00,925][WARN ][bootstrap                ] Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out. Increase RLIMIT_MEMLOCK (ulimit).
[2015-09-04 12:33:00,993][INFO ][node                     ] [data4] version[1.7.1], pid[4478], build[b88f43f/2015-07-29T09:54:16Z]
[2015-09-04 12:33:00,994][INFO ][node                     ] [data4] initializing ...
[2015-09-04 12:33:01,072][INFO ][plugins                  ] [data4] loaded [], sites []
[2015-09-04 12:33:01,109][INFO ][env                      ] [data4] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [1.8tb], net total_space [1.9tb], types [rootfs]
[2015-09-04 12:33:03,968][INFO ][node                     ] [data4] initialized
[2015-09-04 12:33:03,969][INFO ][node                     ] [data4] starting ...
[2015-09-04 12:33:04,166][INFO ][transport                ] [data4] bound_address {inet[/]}, publish_address {inet[/]}
[2015-09-04 12:33:04,196][INFO ][discovery                ] [data4] cloud/lOgGOLJ4QuOQr-IL8bPoNw

The OS is Debian 7.8.
bootstrap.mlockall: true is uncommented in the config.
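
The "Unable to lock JVM memory (ENOMEM)" warning in the log above means the memory lock was requested but refused, which is separate from the node failing to join. A minimal sketch for checking what each node reports, assuming the HTTP API is reachable on localhost:9200 (the `ES_URL` variable and the function name are made up for illustration):

```shell
# Assumption: the Elasticsearch HTTP API is reachable here; override ES_URL if not.
ES_URL="${ES_URL:-localhost:9200}"

# Print the mlockall flag from the node info API.
# A line containing "mlockall" : true means the JVM memory lock succeeded.
mlockall_status() {
  curl -s "$ES_URL/_nodes/process?pretty" | grep '"mlockall"'
}

# Usage: mlockall_status
```

If it reports false, the memlock limit of the user running Elasticsearch is probably too low. On the Debian package of that era this was usually raised with MAX_LOCKED_MEMORY=unlimited in /etc/default/elasticsearch, but treat that path and setting as an assumption about your install.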

(Magnus Bäck) #2

Are you using multicast or unicast discovery? If the former, are the machines visible to each other? If the latter, is the configuration on the new node pointing to the existing nodes, or vice versa?
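
One way to answer the multicast/unicast question is to look at which zen discovery settings are actually active in the config file. This is only a sketch: the default path is the usual location for the Debian package, and the function name is hypothetical.

```shell
# Print the active (uncommented) zen discovery settings from a config file.
# $1: path to elasticsearch.yml (assumed default: the Debian package location).
show_discovery_settings() {
  grep -E '^[[:space:]]*discovery\.zen\.' "${1:-/etc/elasticsearch/elasticsearch.yml}"
}

# Usage: show_discovery_settings
```

No output means the defaults apply, which in ES 1.x is multicast discovery. A unicast setup would normally show discovery.zen.ping.multicast.enabled: false together with a discovery.zen.ping.unicast.hosts list naming the existing nodes.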

(IT2) #3

I'm not an ES expert.
Here is my config, it's the same on each node:
I'm using multicast discovery, and the nodes are in one subnet. Telnet between the master and the new node on port 9300 works in both directions.
But when I run tcpdump on the master, I see no traffic from newnode:9300.

(IT2) #4

I ran this test.
Telnet to the master:

root@logstash-node4:/etc/elasticsearch# telnet 9300
Connected to
Escape character is '^]'.

And the dump on the master:

root@logstashm:~# tcpdump host and port 9300
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
13:17:00.811829 IP > logstash.9300: Flags [S], seq 720574632, win 14600, options [mss 1460,sackOK,TS val 728584 ecr 0,nop,wscale 7], length 0
13:17:00.811869 IP logstash.9300 > Flags [S.], seq 3115479986, ack 720574633, win 14480, options [mss 1460,sackOK,TS val 688978160 ecr 728584,nop,wscale 9], length 0
13:17:00.812185 IP > logstash.9300: Flags [.], ack 1, win 115, options [nop,nop,TS val 728584 ecr 688978160], length 0
13:17:03.196674 IP > logstash.9300: Flags [P.], seq 1:8, ack 1, win 115, options [nop,nop,TS val 729180 ecr 688978160], length 7
13:17:03.196689 IP logstash.9300 > Flags [.], ack 8, win 29, options [nop,nop,TS val 688978756 ecr 729180], length 0
13:17:03.197307 IP logstash.9300 > Flags [F.], seq 1, ack 8, win 29, options [nop,nop,TS val 688978756 ecr 729180], length 0
13:17:03.197457 IP > logstash.9300: Flags [F.], seq 8, ack 2, win 115, options [nop,nop,TS val 729181 ecr 688978756], length 0
13:17:03.197473 IP logstash.9300 > Flags [.], ack 9, win 29, options [nop,nop,TS val 688978756 ecr 729181], length 0

(IT2) #5

My fault. It was iptables on the master. Now the node has joined the cluster, but no shards are allocated to it.
How can I manually distribute shards to the new node too? Or will that happen automatically when a new index is created?

(Magnus Bäck) #6

By default, shards (new as well as old) will be redistributed evenly between all data nodes in the cluster. If this doesn't happen, possible causes include:

  • Allocations have been disabled.
  • There's not enough disk space on one or more nodes.
  • Explicit shard routing has been configured.
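
Each of the three causes above can be checked from the command line. A rough sketch, assuming the HTTP API is on localhost:9200 (`ES_URL` and the function names are hypothetical helpers, not part of Elasticsearch):

```shell
# Assumption: the Elasticsearch HTTP API is reachable here; override ES_URL if not.
ES_URL="${ES_URL:-localhost:9200}"

# 1. Allocation disabled? Look for cluster.routing.allocation.enable in the output.
allocation_settings() {
  curl -s "$ES_URL/_cluster/settings?pretty"
}

# 2. Disk space: shard count and disk usage for every data node.
disk_per_node() {
  curl -s "$ES_URL/_cat/allocation?v"
}

# 3. Explicit routing: look for index.routing.allocation.* in the index settings.
# $1: index name, e.g. logstash-2015.09.04
index_settings() {
  curl -s "$ES_URL/$1/_settings?pretty"
}

# Usage: allocation_settings; disk_per_node; index_settings logstash-2015.09.04
```

For the disk check, keep in mind that the allocator stops placing shards on a node once its disk usage passes the low watermark. That is 85% by default in 1.x as far as I recall, so verify it against your own cluster settings.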

(IT2) #7

Now it looks like this:
What should I look for in my config?
Would this command help?

curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.enable": "all"}}'

Or do I need to reroute the shards somehow?

(IT2) #8

After I rebooted the master node, nothing happened.

(Christian Dahlqvist) #9

It looks like not all nodes in your cluster are running the same version of Elasticsearch. The old nodes may not be able to read shards created on the new node, which prevents those shards from being replicated. Please ensure all nodes are running the same version of Elasticsearch.
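
A quick way to confirm the versions is the cat nodes API. The sketch below assumes the HTTP API is on localhost:9200; `ES_URL` and the function names are hypothetical.

```shell
# Assumption: the Elasticsearch HTTP API is reachable here; override ES_URL if not.
ES_URL="${ES_URL:-localhost:9200}"

# List every node with its version. The version column comes first so that
# node names containing spaces cannot break the parsing below.
node_versions() {
  curl -s "$ES_URL/_cat/nodes?h=version,name"
}

# Print the distinct versions in the cluster, one per line.
# More than one line means the cluster is running mixed versions.
distinct_versions() {
  node_versions | awk '{print $1}' | sort -u
}

# Usage: distinct_versions
```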

(IT2) #10

I updated all nodes to 1.7.1.

So as I understand it, tomorrow when the new index is created, its shards will be allocated across all data nodes?

(Christian Dahlqvist) #11

As the space available on the different nodes is very unevenly distributed, the allocator may take that into account and continue allocating to the new node. If you have replicas enabled for the new indices, the cluster should however be able to allocate these. If the distribution is not to your liking, you may want to move some of the existing/old shards over to the new node by using the reroute API in order to get a more even distribution.
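
Moving shards by hand could be sketched roughly like this: first list the started shards on a full node, then move one of them with the reroute API. It assumes the HTTP API is on localhost:9200 and that node names contain no spaces; `ES_URL` and both function names are made up for illustration.

```shell
# Assumption: the Elasticsearch HTTP API is reachable here; override ES_URL if not.
ES_URL="${ES_URL:-localhost:9200}"

# List started shards on one node as: index shard prirep store
# $1: node name, e.g. data2
shards_on() {
  curl -s "$ES_URL/_cat/shards" |
    awk -v node="$1" '$4 == "STARTED" && $NF == node {print $1, $2, $3, $6}'
}

# Move one shard between nodes with the reroute "move" command.
# $1: index  $2: shard number  $3: source node  $4: target node
move_shard() {
  curl -s -XPOST "$ES_URL/_cluster/reroute" -d "$(printf \
    '{"commands":[{"move":{"index":"%s","shard":%s,"from_node":"%s","to_node":"%s"}}]}' \
    "$1" "$2" "$3" "$4")"
}

# Usage: shards_on data2
#        move_shard logstash-2015.09.04 0 data2 data4
```

A move only works for a shard that is currently started on the source node, and the target node must not already hold a copy of the same shard.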

(IT2) #12

I don't understand why it allocated the shards like this.

I have the same version on all nodes.
The data4 and data5 nodes are new in the cluster, and shards are allocated only there. It also doesn't look normal: there are a lot of unassigned shards.
I ran:

curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
 "commands" : [ {
  "allocate" : {
   "index" : "logstash-2015.09.08", "shard" : 0, "node" : "data4"
  }
 } ]
}'

But it didn't help, the shard became yellow :frowning:
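
To see exactly which shards are still unassigned, the cat shards output can be filtered. This is a sketch assuming the HTTP API is on localhost:9200; `ES_URL` and the function name are hypothetical.

```shell
# Assumption: the Elasticsearch HTTP API is reachable here; override ES_URL if not.
ES_URL="${ES_URL:-localhost:9200}"

# List unassigned shards as: index shard prirep
# prirep is "p" for a primary and "r" for a replica.
unassigned_shards() {
  curl -s "$ES_URL/_cat/shards" | awk '$4 == "UNASSIGNED" {print $1, $2, $3}'
}

# Usage: unassigned_shards
```

Yellow normally means the primaries are assigned but at least one replica is not, so the rows marked "r" are the ones to look at. As far as I know, in 1.x the allocate command only assigns an unassigned primary when "allow_primary" : true is passed, and that can lose data, so use it with care.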

(Christian Dahlqvist) #13

The distribution still seems very uneven. Why don't you use the reroute API to move some of the shards on nodes data1, data2 and data3 to the new nodes in order to even out the data distribution and free up disk space on these nodes?

Also, is the master node also running the same version of Elasticsearch as the data nodes?

(IT2) #14

Sorry for the duplicate topics.
Let's take one index as an example:

Now I use this command:

curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
 "commands" : [ {
  "move" : {
   "index" : "logstash-2015.09.04", "shard" : 0, "from_node" : "data2", "to_node" : "data4"
  }
 } ]
}'

And my index now looks like this:

The whole cluster is on ES 1.7.

And yes, the master runs the same version!

(IT2) #15

Can no one help?
Another day has passed, and my old nodes are still not getting any shards?
