Unable to add a new node to my Elasticsearch cluster

Hi,

I have an Elasticsearch cluster (v1.7.6) with 18 nodes, 16 of which are data nodes running on 4 physical hosts (CentOS).
Each host has 128 GB of RAM, and each data node has a 16 GB JVM heap.

I want to add a new data node on each physical host because we upgraded the hosts to 160 GB of RAM.

So the new target configuration is 20 data nodes with a 16 GB heap each on my 4 physical hosts.
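To make the sizing explicit, here is a minimal sketch of the heap arithmetic for the current and target layouts. It assumes the data nodes are spread evenly across the hosts, and it ignores the two non-data nodes because the post does not say where they run. The function name `heap_per_host_gb` is only illustrative.

```python
# Sketch: total JVM heap committed per host, before and after adding nodes.
# Assumes an even spread of data nodes across hosts (not confirmed in the post).

def heap_per_host_gb(data_nodes: int, hosts: int, heap_gb: int) -> int:
    """Return the summed JVM heap (GB) of the data nodes on one host."""
    nodes_per_host, remainder = divmod(data_nodes, hosts)
    if remainder:
        raise ValueError("data nodes do not divide evenly across hosts")
    return nodes_per_host * heap_gb

current = heap_per_host_gb(data_nodes=16, hosts=4, heap_gb=16)  # 4 nodes per host
target = heap_per_host_gb(data_nodes=20, hosts=4, heap_gb=16)   # 5 nodes per host

print(f"current: {current} GB heap of 128 GB RAM ({current / 128:.0%})")
print(f"target:  {target} GB heap of 160 GB RAM ({target / 160:.0%})")
```

Under these assumptions the heap share of RAM is the same in both layouts (64 of 128 GB now, 80 of 160 GB after the change), so the heap total alone does not obviously overcommit the upgraded hosts.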

But when I try to start the new node, it does not start at all, and the top command freezes.

Can anyone help me?

Logs from /var/log/messages:
Jun 21 11:56:00 es1 kernel: INFO: task top:27527 blocked for more than 120 seconds.
Jun 21 11:56:00 es1 kernel: Not tainted 2.6.32-504.12.2.el6.x86_64 #1
Jun 21 11:56:00 es1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 21 11:56:00 es1 kernel: top D 0000000000000010 0 27527 27230 0x00000080
Jun 21 11:56:00 es1 kernel: ffff8814ed51fc50 0000000000000082 0000000000000000 ffff8815d2276840
Jun 21 11:56:00 es1 kernel: ffff8803fd0f4440 ffff8815d2276840 000036ad95b560d7 ffff8814ed51fce8
Jun 21 11:56:00 es1 kernel: ffff8814ed51fbe8 00000001039064e7 ffff8819bac72638 ffff8814ed51ffd8
Jun 21 11:56:00 es1 kernel: Call Trace:
Jun 21 11:56:00 es1 kernel: [] ? dput+0x9a/0x150
Jun 21 11:56:00 es1 kernel: [] rwsem_down_failed_common+0x95/0x1d0
Jun 21 11:56:00 es1 kernel: [] rwsem_down_read_failed+0x26/0x30
Jun 21 11:56:00 es1 kernel: [] call_rwsem_down_read_failed+0x14/0x30
Jun 21 11:56:00 es1 kernel: [] ? down_read+0x24/0x30
Jun 21 11:56:00 es1 kernel: [] __access_remote_vm+0x41/0x1f0
Jun 21 11:56:00 es1 kernel: [] ? do_filp_open+0x6ea/0xd20
Jun 21 11:56:00 es1 kernel: [] access_process_vm+0x5b/0x80
Jun 21 11:56:00 es1 kernel: [] proc_pid_cmdline+0x6d/0x120
Jun 21 11:56:00 es1 kernel: [] ? alloc_pages_current+0xaa/0x110
Jun 21 11:56:00 es1 kernel: [] proc_info_read+0xad/0xf0
Jun 21 11:56:00 es1 kernel: [] vfs_read+0xb5/0x1a0
Jun 21 11:56:00 es1 kernel: [] sys_read+0x51/0x90
Jun 21 11:56:00 es1 kernel: [] ? __audit_syscall_exit+0x25e/0x290
Jun 21 11:56:00 es1 kernel: [] system_call_fastpath+0x16/0x1b
Jun 21 11:56:00 es1 kernel: INFO: task java:28206 blocked for more than 120 seconds.
Jun 21 11:56:00 es1 kernel: Not tainted 2.6.32-504.12.2.el6.x86_64 #1
Jun 21 11:56:00 es1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 21 11:56:00 es1 kernel: java D 0000000000000018 0 28206 1 0x00000080
Jun 21 11:56:00 es1 kernel: ffff8801387d3cf0 0000000000000086 ffff8801387d3c88 ffff8801387d3da8
Jun 21 11:56:00 es1 kernel: ffffffff810b33c6 ffff8801387d3d28 ffffc90039cefe04 0000000000000002
Jun 21 11:56:00 es1 kernel: ffff8801387d3f38 ffffffff00000000 ffff88145bd3a638 ffff8801387d3fd8
Jun 21 11:56:00 es1 kernel: Call Trace:
Jun 21 11:56:00 es1 kernel: [] ? futex_wait+0x1e6/0x310
Jun 21 11:56:00 es1 kernel: [] rwsem_down_failed_common+0x95/0x1d0
Jun 21 11:56:00 es1 kernel: [] rwsem_down_read_failed+0x26/0x30
Jun 21 11:56:00 es1 kernel: [] call_rwsem_down_read_failed+0x14/0x30
Jun 21 11:56:00 es1 kernel: [] ? down_read+0x24/0x30
Jun 21 11:56:00 es1 kernel: [] __do_page_fault+0x18e/0x480
Jun 21 11:56:00 es1 kernel: [] ? rwsem_wake+0x75/0x170
Jun 21 11:56:00 es1 kernel: [] do_page_fault+0x3e/0xa0
Jun 21 11:56:00 es1 kernel: [] page_fault+0x25/0x30
Jun 21 11:56:00 es1 kernel: INFO: task java:28207 blocked for more than 120 seconds.
Jun 21 11:56:00 es1 kernel: Not tainted 2.6.32-504.12.2.el6.x86_64 #1
Jun 21 11:56:00 es1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 21 11:56:00 es1 kernel: java D 0000000000000009 0 28207 1 0x00000080
Jun 21 11:56:00 es1 kernel: ffff8801387d7e18 0000000000000086 0000000000000000 ffff88150a725a00
Jun 21 11:56:00 es1 kernel: 0000000000000d28 00000000bcd1ee27 000036adb9f354bf 0000000000000000
Jun 21 11:56:00 es1 kernel: 0000000000000001 0000000103906760 ffff8801387d5af8 ffff8801387d7fd8
Jun 21 11:56:00 es1 kernel: Call Trace:
Jun 21 11:56:00 es1 kernel: [] rwsem_down_failed_common+0x95/0x1d0
Jun 21 11:56:00 es1 kernel: [] rwsem_down_write_failed+0x23/0x30
Jun 21 11:56:00 es1 kernel: [] call_rwsem_down_write_failed+0x13/0x20
Jun 21 11:56:00 es1 kernel: [] ? down_write+0x32/0x40
Jun 21 11:56:00 es1 kernel: [] sys_mprotect+0xe6/0x250
Jun 21 11:56:00 es1 kernel: [] ? __audit_syscall_exit+0x25e/0x290
Jun 21 11:56:00 es1 kernel: [] system_call_fastpath+0x16/0x1b
Jun 21 12:22:52 es1 kernel: possible SYN flooding on port 9305. Sending cookies.
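
For anyone reading a log like this, the hung-task reports can be summarized with a short script. This is only a sketch: the regular expression is an assumption based on the three "blocked for more than" lines above, the sample lines are copied from this log, and the helper name `hung_tasks` is hypothetical.

```python
import re
from collections import Counter

# Matches kernel hung-task lines such as:
#   "... kernel: INFO: task java:28206 blocked for more than 120 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)

def hung_tasks(lines):
    """Return a (name, pid, seconds) tuple for each hung-task report found."""
    found = []
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            found.append(
                (match.group("name"), int(match.group("pid")), int(match.group("secs")))
            )
    return found

# Sample lines copied from the log in this post.
sample = [
    "Jun 21 11:56:00 es1 kernel: INFO: task top:27527 blocked for more than 120 seconds.",
    "Jun 21 11:56:00 es1 kernel: INFO: task java:28206 blocked for more than 120 seconds.",
    "Jun 21 11:56:00 es1 kernel: INFO: task java:28207 blocked for more than 120 seconds.",
]

tasks = hung_tasks(sample)
counts = Counter(name for name, _pid, _secs in tasks)
print(tasks)
print(dict(counts))
```

On a real host the same function could be fed the lines of the system log file instead of the sample list. In this log it would report one blocked `top` task and two blocked `java` tasks, and each of their call traces passes through `rwsem_down_failed_common`.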

What is in the Elasticsearch logs?

Unfortunately, there is nothing in the Elasticsearch logs.
