I am already running an ELK stack with Elasticsearch (ES) 1.7 in Docker containers on 3 nodes, each running one ES container behind an nginx
server. Now I am trying to upgrade ES to 2.4.0. ES 2.4.0 refuses to run as the root user by default, so I am using the -Des.insecure.allow.root=true
option.
The configuration file is modified as follows:
#Performance optimization settings
echo "index.number_of_replicas: 1" >> ${ES_CONFIG_PATH}/elasticsearch.yml
echo "index.number_of_shards: 3" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#echo "discovery.zen.ping.multicast.enabled: false" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#echo "bootstrap.mlockall: true" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#echo "indices.memory.index_buffer_size: 50%" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#publish host as container host address
#echo "network.publish_host: ${CONTAINER_HOST_ADDRESS}" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#echo "network.bind_host: ${CONTAINER_HOST_ADDRESS}" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#echo "network.publish_host: ${CONTAINER_PRIVATE_IP}" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#echo "network.bind_host: ${CONTAINER_PRIVATE_IP}" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#echo "network.host: ${CONTAINER_HOST_ADDRESS}" >> ${ES_CONFIG_PATH}/elasticsearch.yml
echo "network.host: 0.0.0.0" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#echo "http.port: 9200" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#echo "transport.tcp.port: 9300-9400" >> ${ES_CONFIG_PATH}/elasticsearch.yml
#configure elasticsearch.yml for clustering
echo 'discovery.zen.ping.unicast.hosts: [ELASTICSEARCH_IPS] ' >> ${ES_CONFIG_PATH}/elasticsearch.yml
echo "discovery.zen.minimum_master_nodes: 1" >> ${ES_CONFIG_PATH}/elasticsearch.yml
ELASTICSEARCH_IPS is an array of the IPs of the other nodes, which every node obtains by running a script called query-crs-es.sh. Eventually the array holds the IPs of the other two nodes of the cluster. Please note that these are the nodes' IPs, not the containers' private IPs.
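For illustration only, here is a minimal sketch of how the ELASTICSEARCH_IPS placeholder could be filled in once the other nodes' IPs are known. The contents of query-crs-es.sh are not shown in this post, so the helper names join_hosts and fill_unicast_hosts and the sample IPs are hypothetical, and the sed substitution assumes plain IPv4 addresses (no "/" or "&" characters).

```shell
# Hypothetical helper: join IPs into a quoted, comma-separated list,
# e.g. "10.0.0.2", "10.0.0.3"
join_hosts() {
  out=""
  for ip in "$@"; do
    if [ -n "$out" ]; then
      out="${out}, "
    fi
    out="${out}\"${ip}\""
  done
  printf '%s\n' "$out"
}

# Hypothetical helper: replace the ELASTICSEARCH_IPS placeholder in a config file.
# Writes to a temporary file first so the original is only replaced on success.
fill_unicast_hosts() {
  config_file="$1"
  shift
  hosts="$(join_hosts "$@")"
  sed "s/ELASTICSEARCH_IPS/${hosts}/" "$config_file" > "${config_file}.tmp" &&
    mv "${config_file}.tmp" "$config_file"
}
```

It could be called as fill_unicast_hosts "${ES_CONFIG_PATH}/elasticsearch.yml" 10.0.0.2 10.0.0.3, where the two addresses stand in for whatever the lookup script returns.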
I use Ansible to run the containers. During startup, all nodes come up but fail to form a cluster. I consistently get these errors:
Node 1 starts without any problem and gets elected as master; its name is Dragon Lord.
Node 2:
[2016-10-07 09:45:58,561][WARN ][bootstrap ] running as ROOT user. this is a bad idea!
[2016-10-07 09:45:58,729][INFO ][node ] [Defensor] version[2.4.0], pid[1], build[ce9f0c7/2016-08-29T09:14:17Z]
[2016-10-07 09:45:58,729][INFO ][node ] [Defensor] initializing ...
[2016-10-07 09:45:59,215][INFO ][plugins ] [Defensor] modules [reindex, lang-expression, lang-groovy], plugins [], sites []
[2016-10-07 09:45:59,237][INFO ][env ] [Defensor] using [1] data paths, mounts [[/data (/dev/mapper/platform-data)]], net usable_space [2.5tb], net total_space [2.5tb], spins? [possibly], types [xfs]
[2016-10-07 09:45:59,237][INFO ][env ] [Defensor] heap size [989.8mb], compressed ordinary object pointers [true]
[2016-10-07 09:45:59,266][WARN ][threadpool ] [Defensor] requested thread pool size [60] for [index] is too large; setting to maximum [32] instead
[2016-10-07 09:46:00,733][INFO ][node ] [Defensor] initialized
[2016-10-07 09:46:00,733][INFO ][node ] [Defensor] starting ...
[2016-10-07 09:46:00,833][INFO ][transport ] [Defensor] publish_address {172.17.0.16:9300}, bound_addresses {[::]:9300}
[2016-10-07 09:46:00,837][INFO ][discovery ] [Defensor] ccs-elasticsearch/RXALMe9NQVmbCz5gg1CwHA
[2016-10-07 09:46:03,876][WARN ][discovery.zen ] [Defensor] failed to connect to master [{Dragon Lord}{5wNwWJRFRS-2dRY5AGqqGQ}{172.17.0.15}{172.17.0.15:9300}], retrying...
ConnectTransportException[[Dragon Lord][172.17.0.15:9300] connect_timeout[30s]]; nested: ConnectException[Connection refused: /172.17.0.15:9300];
at org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:1002)
at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:937)
Caused by: java.net.ConnectException: Connection refused: /172.17.0.15:9300
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
Node 3 has similar logs.
As you can see from the logs, Nodes 2 and 3 are aware of the master, Node 1, but are unable to connect to it. I have tried most of the network.host configurations
that you can see commented out in the configuration code, and none of them work.
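One thing that may be worth checking is which address each node actually advertises. In the logs above, both the publish_address of Node 2 (172.17.0.16) and the master address it tries to reach (172.17.0.15) fall in 172.17.0.0/16, which is Docker's default bridge range. The sketch below pulls the publish address out of a transport log line and flags that range; the function names are hypothetical, and the check assumes the default bridge subnet has not been changed.

```shell
# Print the publish address (ip:port) found in an Elasticsearch 2.x
# "[transport]" log line read from standard input.
publish_address() {
  sed -n 's/.*publish_address {\([^}]*\)}.*/\1/p'
}

# Succeed if the address is in 172.17.0.0/16, Docker's default bridge range
# (assumption: the bridge subnet has not been customized).
is_default_bridge_ip() {
  case "$1" in
    172.17.*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, piping a container's output through it (docker logs CONTAINER 2>&1 | publish_address, with CONTAINER standing in for the real container name) prints the advertised address, which can then be passed to is_default_bridge_ip.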
Any leads will be appreciated.