Space ending in ES cluster

it2 · September 4, 2015, 9:36am

Hello guys!
I have the following cluster: 1 master and 3 data nodes. This is how it looks now: http://prntscr.com/8cg01m
As you can see, disk space is running out, so I want to add new nodes to the cluster.
I set up a fourth node the same way as the previous ones, but it does not join the cluster.
There are two differences: on the new node I installed ES 1.7, and swap was disabled while I was installing the system. The existing cluster runs ES 1.5, and swap was disabled there after the system was installed.
Why is my node not connecting? In the log I see:

[2015-09-04 12:33:00,925][WARN ][bootstrap                ] Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out. Increase RLIMIT_MEMLOCK (ulimit).
[2015-09-04 12:33:00,993][INFO ][node                     ] [data4] version[1.7.1], pid[4478], build[b88f43f/2015-07-29T09:54:16Z]
[2015-09-04 12:33:00,994][INFO ][node                     ] [data4] initializing ...
[2015-09-04 12:33:01,072][INFO ][plugins                  ] [data4] loaded [], sites []
[2015-09-04 12:33:01,109][INFO ][env                      ] [data4] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [1.8tb], net total_space [1.9tb], types [rootfs]
[2015-09-04 12:33:03,968][INFO ][node                     ] [data4] initialized
[2015-09-04 12:33:03,969][INFO ][node                     ] [data4] starting ...
[2015-09-04 12:33:04,166][INFO ][transport                ] [data4] bound_address {inet[/192.168.198.214:9300]}, publish_address {inet[/192.168.198.214:9300]}
[2015-09-04 12:33:04,196][INFO ][discovery                ] [data4] cloud/lOgGOLJ4QuOQr-IL8bPoNw

The OS is Debian 7.8.
bootstrap.mlockall: true is uncommented in the config.
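
For reference, the ENOMEM warning in the log above usually means that the memory-lock limit of the Elasticsearch process is too low for bootstrap.mlockall to take effect. A minimal sketch of how one might check this, assuming the node's HTTP API is reachable at localhost:9200 and the Debian package layout (where the limit is raised with MAX_LOCKED_MEMORY=unlimited in /etc/default/elasticsearch). The helper name mlockall_status is made up for illustration.

```shell
ES="${ES:-localhost:9200}"   # assumption: HTTP address of the new node

# Extract the mlockall flag(s) from nodes-info JSON read on stdin.
# (hypothetical helper name)
mlockall_status() {
  grep -o '"mlockall" *: *[a-z]*'
}

# Usage: ask a running node whether it actually locked its memory.
#   curl -s "$ES/_nodes/process?pretty" | mlockall_status
#
# To see the limit of the current shell (the Elasticsearch process needs
# "unlimited" for mlockall to succeed), run:
#   ulimit -l
```

If the flag is false, the node still works, but part of the JVM heap may be swapped out, as the warning says.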

magnusbaeck · September 4, 2015, 9:55am

Are you using multicast or unicast discovery? If the former, are the machines visible to each other? If the latter, is the configuration on the new node pointing to the existing nodes, or vice versa?

it2 · September 4, 2015, 10:13am

I'm not an expert in ES.
Here is my config; it is the same on each node: http://pastebin.com/rpz7CvVw
I'm using multicast discovery and the nodes are in one subnet. Telnet to port 9300 works between the master and the new node in each direction.
But when I run tcpdump on the master, I see no traffic from newnode:9300.

it2 · September 4, 2015, 10:18am

I ran this test.
Telnet to the master:

root@logstash-node4:/etc/elasticsearch# telnet 192.168.198.200 9300
Trying 192.168.198.200...
Connected to 192.168.198.200.
Escape character is '^]'.
hello

And the dump on the master:

root@logstashm:~# tcpdump host 192.168.198.214 and port 9300
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
13:17:00.811829 IP 192.168.198.214.60228 > logstash.9300: Flags [S], seq 720574632, win 14600, options [mss 1460,sackOK,TS val 728584 ecr 0,nop,wscale 7], length 0
13:17:00.811869 IP logstash.9300 > 192.168.198.214.60228: Flags [S.], seq 3115479986, ack 720574633, win 14480, options [mss 1460,sackOK,TS val 688978160 ecr 728584,nop,wscale 9], length 0
13:17:00.812185 IP 192.168.198.214.60228 > logstash.9300: Flags [.], ack 1, win 115, options [nop,nop,TS val 728584 ecr 688978160], length 0
13:17:03.196674 IP 192.168.198.214.60228 > logstash.9300: Flags [P.], seq 1:8, ack 1, win 115, options [nop,nop,TS val 729180 ecr 688978160], length 7
13:17:03.196689 IP logstash.9300 > 192.168.198.214.60228: Flags [.], ack 8, win 29, options [nop,nop,TS val 688978756 ecr 729180], length 0
13:17:03.197307 IP logstash.9300 > 192.168.198.214.60228: Flags [F.], seq 1, ack 8, win 29, options [nop,nop,TS val 688978756 ecr 729180], length 0
13:17:03.197457 IP 192.168.198.214.60228 > logstash.9300: Flags [F.], seq 8, ack 2, win 115, options [nop,nop,TS val 729181 ecr 688978756], length 0
13:17:03.197473 IP logstash.9300 > 192.168.198.214.60228: Flags [.], ack 9, win 29, options [nop,nop,TS val 688978756 ecr 729181], length 0

it2 · September 4, 2015, 1:21pm

My fault. It was iptables on the master. The node has now joined the cluster, but no shards are allocated to it.
How can I manually distribute shards to the new node too? Or will that happen automatically when a new index is created?

magnusbaeck · September 5, 2015, 12:08pm

By default, shards (new as well as old) will be redistributed evenly between all data nodes in the cluster. If this doesn't happen, possible causes include:

Allocations have been disabled.
There's not enough disk space on one or more nodes.
Explicit shard routing has been configured.
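
Each of these causes can be checked from the command line. A minimal sketch, assuming the cluster's HTTP API is reachable at localhost:9200. The function names are made up for illustration, and the filter assumes the default _cat/shards column order (index, shard, prirep, state, ...).

```shell
ES="${ES:-localhost:9200}"   # assumption: HTTP address of any node in the cluster

# 1. Allocation disabled? Look for cluster.routing.allocation.enable
#    in the persistent/transient cluster settings.
show_cluster_settings() { curl -s "$ES/_cluster/settings?pretty"; }

# 2. Not enough disk space? Shard count and disk usage per data node.
show_disk_usage() { curl -s "$ES/_cat/allocation?v"; }

# 3. Explicit shard routing? Look for index.routing.allocation.* filters
#    in the settings of the given index.
show_index_settings() { curl -s "$ES/$1/_settings?pretty"; }

# Keep only unassigned shards from _cat/shards output read on stdin
# (the state is the fourth column).
unassigned_only() { awk '$4 == "UNASSIGNED"'; }

# Usage:
#   show_cluster_settings
#   show_disk_usage
#   show_index_settings logstash-2015.09.08
#   curl -s "$ES/_cat/shards" | unassigned_only
```

The functions only read state, so they are safe to run against a live cluster.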

it2 · September 7, 2015, 7:09am

Magnus,
now it looks like this: http://prntscr.com/8dipkv
What should I look for in my config?
Might this command help?

curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.enable": "all"}}'

Or do I need to reroute it somehow?

it2 · September 7, 2015, 12:50pm

After I rebooted the master node, nothing changed.

Christian_Dahlqvist · September 7, 2015, 12:53pm

As it looks like not all nodes in your cluster are running the same version of Elasticsearch, the old nodes may not be able to read shards created on the new node, which prevents those shards from being replicated. Please ensure all nodes are running the same version of Elasticsearch.
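
One way to confirm this is to list the version each node reports. A sketch, assuming localhost:9200 reaches the cluster; distinct_versions is a hypothetical helper name.

```shell
ES="${ES:-localhost:9200}"   # assumption: HTTP address of any node in the cluster

# Print the distinct versions found in "version name" lines read on stdin.
# More than one line of output means the cluster is running mixed versions.
distinct_versions() { awk '{print $1}' | sort -u; }

# Usage (version is requested first so node names with spaces do not matter):
#   curl -s "$ES/_cat/nodes?h=version,name" | distinct_versions
```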

it2 · September 7, 2015, 1:00pm

I updated all nodes to 1.7.1.

So, as I understand it, tomorrow when a new index is created, its shards will be allocated across all data nodes?

Christian_Dahlqvist · September 7, 2015, 1:06pm

As the space available on the different nodes is very unevenly distributed, the allocator may take that into account and continue allocating to the new node. If you have replicas enabled for the new indices, the cluster should however be able to allocate these. If the distribution is not to your liking, you may want to move some of the existing/old shards over to the new node by using the reroute API in order to get a more even distribution.

it2 · September 8, 2015, 9:41am

I don't understand why it allocated like this.

I have the same version on all nodes.
data4 and data5 are the new nodes in the cluster, and shards are allocated only there. Even that does not work properly: there are a lot of unassigned shards.
I ran:

curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
 "commands" : [ {
  "allocate" : {
   "index" : "logstash-2015.09.08", "shard" : 0, "node" : "data4"
  }
 } ]
}'

but it doesn't help; the shard became yellow.

Christian_Dahlqvist · September 8, 2015, 9:50am

The distribution still seems very uneven. Why don't you use the reroute API to move some of the shards on nodes data1, data2 and data3 to the new nodes in order to even out the data distribution and free up disk space on these nodes?
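
A minimal sketch of such a move, assuming the API is reachable at localhost:9200. The helper name move_body is made up for illustration, and the index, shard number and node names in the usage lines are examples only. The dry_run parameter asks the reroute API to report the resulting state without applying it, which is a cheap way to see whether a move would be accepted.

```shell
ES="${ES:-localhost:9200}"   # assumption: HTTP address of any node in the cluster

# Build the JSON body for one reroute "move" command.
# Arguments: index, shard number, source node, target node.
move_body() {
  printf '{"commands":[{"move":{"index":"%s","shard":%s,"from_node":"%s","to_node":"%s"}}]}' \
    "$1" "$2" "$3" "$4"
}

# Usage: preview the move first, then run it for real.
#   curl -s -XPOST "$ES/_cluster/reroute?dry_run&pretty" -d "$(move_body logstash-2015.09.04 0 data1 data4)"
#   curl -s -XPOST "$ES/_cluster/reroute?pretty" -d "$(move_body logstash-2015.09.04 0 data1 data4)"
```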

Also, is the master node also running the same version of Elasticsearch as the data nodes?

it2 · September 8, 2015, 10:10am

Sorry for the duplicate topics.
Let's take one index as an example.

I have now run this command:

curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
 "commands" : [ {
  "move" : {
   "index" : "logstash-2015.09.04", "shard" : 0, "from_node" : "data2", "to_node" : "data4"
  }
 } ]
}'

And my index now looks like this.

The whole cluster is on ES 1.7.

And yes, the master runs the same version!

it2 · September 9, 2015, 9:10am

Can no one help?
Another day has passed, and my old nodes are still not used for shards.
