Elasticsearch cluster is not working

jatinseth2007 · November 15, 2018, 12:28pm

I am trying to run an Elasticsearch 6.4.3 cluster with 2 nodes, where one node is master-eligible and holds data and the other is data-only. Both nodes are up and running, but they cannot communicate with each other to form a cluster.

When I test with telnet, I can connect to localhost on port 9300, but I cannot connect to the other node.
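
The telnet check described above can also be scripted. This is a minimal sketch, assuming Python 3 is available on the node; the function name `can_connect` and the default timeout are illustrative choices, not part of any Elasticsearch tooling.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Hypothetical usage, run from one node against the other node's address:
#   can_connect("127.0.0.1", 9300)
#   can_connect("<other-node-ip>", 9300)
```

If the local check succeeds and the remote one fails, the likely suspects are the bind address on the remote node or a firewall between the two machines.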

I am using DigitalOcean and Ubuntu 16.04.

warkolm · November 16, 2018, 12:04am

That's less than ideal, see Important Configuration Changes | Elasticsearch: The Definitive Guide [2.x] | Elastic.

You will need to show your config and your logs for us to really help.

Please format your code/config/logs using the </> button or Markdown-style backticks. It makes things easier to read, which helps us help you.

Tek_Chand · November 16, 2018, 4:11am

@jatinseth2007, port 9300 should be open on both nodes. Please check the Elasticsearch logs for more info.

Thanks.

jatinseth2007 · November 16, 2018, 7:24am

Config for one server:

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
# Before you set out to tweak and tune the configuration, make sure you
# understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name : campaygn-production
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: campaygn-production-elasticsearch-node-0
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /var/lib/elasticsearch
#
# Path to log files:
#
path.logs: /var/log/elasticsearch
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 209.97.184.107
#
# Set a custom port for HTTP:
#
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.zen.ping.unicast.hosts: ["209.97.184.107","142.93.42.161"]
#
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
#
#discovery.zen.minimum_master_nodes:
#
# For more information, consult the zen discovery module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
#script.inline: true
indices.query.bool.max_clause_count: 100000
#path.repo: ["/tmp/backups/elasticsearch/elasticsearch"]
#define node 1 as master-eligible:
node.master: true
#define nodes 1 as data node:
node.data: true

And logs for the same server:

[2018-03-16T10:57:38,250][INFO ][o.e.t.TransportService   ] [lhD23sr] publish_address {127.0.0.1:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}
[2018-03-16T10:57:41,309][INFO ][o.e.c.s.MasterService    ] [lhD23sr] zen-disco-elected-as-master ([0] nodes joined), reason: new_master {lhD23sr}{lhD23srPT3Wb0Kug2XmkiA}{1VfK_uzZRayGdEX4sABdyQ}{127.0.0.1}{127.0.0.1:9300}
[2018-03-16T10:57:41,314][INFO ][o.e.c.s.ClusterApplierService] [lhD23sr] new_master {lhD23sr}{lhD23srPT3Wb0Kug2XmkiA}{1VfK_uzZRayGdEX4sABdyQ}{127.0.0.1}{127.0.0.1:9300}, reason: apply cluster state (from master [master {lhD23sr}{lhD23srPT3Wb0Kug2XmkiA}{1VfK_uzZRayGdEX4sABdyQ}{127.0.0.1}{127.0.0.1:9300} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)]])
[2018-03-16T10:57:41,341][INFO ][o.e.h.n.Netty4HttpServerTransport] [lhD23sr] publish_address {127.0.0.1:9200}, bound_addresses {[::1]:9200}, {127.0.0.1:9200}
[2018-03-16T10:57:41,341][INFO ][o.e.n.Node               ] [lhD23sr] started
[2018-03-16T10:57:41,347][INFO ][o.e.g.GatewayService     ] [lhD23sr] recovered [0] indices into cluster_state
[2018-03-16T11:07:58,242][INFO ][o.e.n.Node               ] [lhD23sr] stopping ...
[2018-03-16T11:07:58,270][INFO ][o.e.n.Node               ] [lhD23sr] stopped
[2018-03-16T11:07:58,270][INFO ][o.e.n.Node               ] [lhD23sr] closing ...
[2018-03-16T11:07:58,278][INFO ][o.e.n.Node               ] [lhD23sr] closed

jatinseth2007 · November 16, 2018, 7:26am

Config for the 2nd server:

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
# Before you set out to tweak and tune the configuration, make sure you
# understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: campaygn-production
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: campaygn-production-elasticsearch-node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /var/lib/elasticsearch
#
# Path to log files:
#
path.logs: /var/log/elasticsearch
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 142.93.42.161
#
# Set a custom port for HTTP:
#
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.zen.ping.unicast.hosts: ["209.97.184.107","142.93.42.161"]
#
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
#
#discovery.zen.minimum_master_nodes:
#
# For more information, consult the zen discovery module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
indices.query.bool.max_clause_count: 100000
#define node 1 as master-eligible:
node.master: false
#define nodes 1 as data node:
node.data: true

And logs for the 2nd server:

[2018-11-14T10:46:57,186][INFO ][o.e.n.Node               ] [q3IrN9m] starting ...
[2018-11-14T10:46:57,354][INFO ][o.e.t.TransportService   ] [q3IrN9m] publish_address {127.0.0.1:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}
[2018-11-14T10:47:00,413][INFO ][o.e.c.s.ClusterService   ] [q3IrN9m] new_master {q3IrN9m}{q3IrN9mDQrS8TkwchxdqUw}{o9D1RMJBTdWXBEzJlvF-nQ}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2018-11-14T10:47:00,448][INFO ][o.e.h.n.Netty4HttpServerTransport] [q3IrN9m] publish_address {127.0.0.1:9200}, bound_addresses {[::1]:9200}, {127.0.0.1:9200}
[2018-11-14T10:47:00,448][INFO ][o.e.n.Node               ] [q3IrN9m] started
[2018-11-14T10:47:00,450][INFO ][o.e.g.GatewayService     ] [q3IrN9m] recovered [0] indices into cluster_state
[2018-11-14T10:57:50,942][INFO ][o.e.n.Node               ] [q3IrN9m] stopping ...
[2018-11-14T10:57:50,965][INFO ][o.e.n.Node               ] [q3IrN9m] stopped
[2018-11-14T10:57:50,965][INFO ][o.e.n.Node               ] [q3IrN9m] closing ...
[2018-11-14T10:57:50,974][INFO ][o.e.n.Node               ] [q3IrN9m] closed

jatinseth2007 · November 16, 2018, 7:28am

Regarding https://www.elastic.co/guide/en/elasticsearch/guide/2.x/important-configuration-changes.html#_minimum_master_nodes:
I am aware of the guidelines and will add one more node once this is working.

warkolm · November 16, 2018, 7:29am

Both are still only binding to localhost for some reason.

However, you shouldn't really use a public IP for your cluster unless you have some kind of access control in place.
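
The localhost binding shows up in the `publish_address` field of the TransportService log lines. As a rough sketch (the regular expression and helper names here are illustrative, written against the log format posted in this thread rather than any documented log schema), the address can be pulled out and checked like this:

```python
import ipaddress
import re

# Matches e.g. "publish_address {127.0.0.1:9300}" in a log line like those above.
PUBLISH_RE = re.compile(r"publish_address \{(?P<host>[^{}]+):(?P<port>\d+)\}")


def publish_addresses(log_text: str):
    """Yield (host, port) pairs for every publish_address found in log_text."""
    for match in PUBLISH_RE.finditer(log_text):
        yield match.group("host"), int(match.group("port"))


def is_loopback(host: str) -> bool:
    """Return True if host is a loopback IP such as 127.0.0.1 or ::1."""
    try:
        return ipaddress.ip_address(host.strip("[]")).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a hostname); treated as non-loopback here.
        return False


sample = (
    "[2018-11-14T10:46:57,354][INFO ][o.e.t.TransportService   ] [q3IrN9m] "
    "publish_address {127.0.0.1:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}"
)
for host, port in publish_addresses(sample):
    print(host, port, "loopback" if is_loopback(host) else "non-loopback")
```

A node that publishes a loopback address cannot be reached by other machines on port 9300, whatever the firewall allows.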

jatinseth2007 · November 16, 2018, 7:30am

Regarding ports, my ufw status is given below.
For one server:
Status: active

To                         Action      From
--                         ------      ----
OpenSSH                    ALLOW       Anywhere                  
Nginx Full                 ALLOW       Anywhere                  
3306                       ALLOW       Anywhere                  
9300                       ALLOW       Anywhere                  
OpenSSH (v6)               ALLOW       Anywhere (v6)             
Nginx Full (v6)            ALLOW       Anywhere (v6)             
3306 (v6)                  ALLOW       Anywhere (v6)             
9300 (v6)                  ALLOW       Anywhere (v6)

For the 2nd server:

Status: active

To                         Action      From
--                         ------      ----
9300                       ALLOW       Anywhere                  
9200                       ALLOW       Anywhere                  
22                         ALLOW       Anywhere                  
80                         ALLOW       Anywhere                  
23/tcp                     ALLOW       Anywhere                  
9300 (v6)                  ALLOW       Anywhere (v6)             
9200 (v6)                  ALLOW       Anywhere (v6)             
22 (v6)                    ALLOW       Anywhere (v6)             
80 (v6)                    ALLOW       Anywhere (v6)             
23/tcp (v6)                ALLOW       Anywhere (v6)
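
To compare the two rule sets quickly, the numeric ports can be extracted from the `ufw status` text. This is only a sketch: the parsing is written against the output format shown above, the function name is made up for illustration, and named application profiles such as OpenSSH or Nginx Full are skipped because their ports are not visible in this output. It also says nothing about any firewall outside the host.

```python
import re


def allowed_ports(ufw_status: str) -> set:
    """Collect numeric ports from ALLOW rules in `ufw status` output."""
    ports = set()
    for line in ufw_status.splitlines():
        # Rule rows look like "9300   ALLOW   Anywhere" or "23/tcp (v6)   ALLOW ...".
        match = re.match(r"\s*(\d+)(?:/\w+)?\s+(?:\(v6\)\s+)?ALLOW\b", line)
        if match:
            ports.add(int(match.group(1)))
    return ports


# IPv4 rules of the first server, copied from the post above.
first_server = """\
OpenSSH                    ALLOW       Anywhere
Nginx Full                 ALLOW       Anywhere
3306                       ALLOW       Anywhere
9300                       ALLOW       Anywhere
"""

open_ports = allowed_ports(first_server)
for port in (9200, 9300):
    print(port, "allowed" if port in open_ports else "not listed")
```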

jatinseth2007 · November 16, 2018, 7:32am

I can use a private IP too, once it starts working.

So what should I put as the setting? I set network.host as described in a tutorial on DigitalOcean.

When I send a request from the second server with curl -XGET 'http://142.93.42.161:9200/_cluster/state?pretty', I get the following error:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "master_not_discovered_exception",
        "reason" : null
      }
    ],
    "type" : "master_not_discovered_exception",
    "reason" : null
  },
  "status" : 503
}

Tek_Chand · November 16, 2018, 9:51am

@jatinseth2007, can you please define the settings below in the elasticsearch.yml file on your master node and restart the Elasticsearch service:

node.master: true
node.data: false

I have just read your configuration and see that you have already defined the above settings. Can you please add the setting below to elasticsearch.yml:

discovery.zen.minimum_master_nodes: 2

Also, please provide the output of the command below from both servers:

netstat -tunlp

Thanks.
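
The elasticsearch.yml template quoted earlier in the thread gives the formula for this setting: total number of master-eligible nodes / 2 + 1. A small sketch of that arithmetic follows; the function name is illustrative, and integer division is assumed because the setting takes a whole number.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Quorum of master-eligible nodes: (master_eligible / 2) + 1, integer division."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


for n in (1, 2, 3):
    print(n, "master-eligible ->", minimum_master_nodes(n))
```

With the configs as posted (node.master: true on one server, false on the other), only one node is master-eligible, so the formula gives 1. A value of 2 would fit a setup where both nodes are master-eligible.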

jatinseth2007 · November 22, 2018, 11:52am

I found the problem, and you were correct: it was a permission issue. The permissions on the servers themselves were fine, but DigitalOcean has an additional firewall of its own, and by default it blocks anything you have not explicitly allowed. So I had to open those ports specifically for each server.

My one remaining question is how to make sure the cluster is working properly. I ran the following curl request, and the response is given below:

curl -XGET -H "Content-Type: application/json" 'http://localhost:9200/_cluster/health?pretty=true'

{
  "cluster_name" : "production",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 10,
  "active_shards" : 11,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 9,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 55.00000000000001
}
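
The numbers in this response can be cross-checked with a few lines of Python. This is a sketch under one assumption, labeled in the code as well: relocating shards are taken to be already included in `active_shards`, so the total is active plus initializing plus unassigned. The field values are copied from the response above; the `summarize` helper is hypothetical.

```python
import json

# Fields copied from the _cluster/health response above.
health_json = """
{
  "status": "yellow",
  "number_of_nodes": 2,
  "number_of_data_nodes": 2,
  "active_primary_shards": 10,
  "active_shards": 11,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 9
}
"""


def summarize(health: dict) -> dict:
    """Derive total shards, active replicas, and the active percentage."""
    # Assumption: relocating shards are already counted in active_shards.
    total = (
        health["active_shards"]
        + health["initializing_shards"]
        + health["unassigned_shards"]
    )
    return {
        "total_shards": total,
        "active_replicas": health["active_shards"] - health["active_primary_shards"],
        "active_percent": 100.0 * health["active_shards"] / total,
    }


summary = summarize(json.loads(health_json))
print(summary)
```

Under that assumption, 11 active shards out of 20 matches the reported 55 percent, and only 1 of the 10 replicas is currently assigned.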

Tek_Chand · November 23, 2018, 4:18am

@jatinseth2007,

Glad to hear that.

If you look at the output of your curl command, the cluster health is yellow; it should be green. The status is yellow because 9 shards are unassigned. The reason for the unassigned shards is that you have only one data node, and a replica is not placed on the same data node as its primary. You need at least 2 data nodes so that each primary shard and its replica can sit on different nodes. The cluster status will then change to green.

Let's say you have 5 primary shards (P1, P2, ..., P5) and 1 replica for each primary shard. Then you will have 10 shards in total: 5 primaries and 5 replicas, R1, R2, ..., R5 (one for each primary shard).

If you have 2 data nodes, the first data node may hold P1, P2, R3, R4, R5 and the second may hold P3, P4, P5, R1, R2. This is managed by Elasticsearch itself.
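
The placement rule in this example can be illustrated with a toy model. To be clear, this is not Elasticsearch's allocator, just a hypothetical round-robin sketch of the single constraint discussed here: a replica never shares a node with its own primary.

```python
def allocate(num_primaries: int, num_nodes: int):
    """Toy placement: a replica never shares a node with its own primary."""
    if num_nodes < 1:
        raise ValueError("need at least one data node")
    nodes = {n: [] for n in range(num_nodes)}
    unassigned = []
    for shard in range(1, num_primaries + 1):
        primary_node = (shard - 1) % num_nodes
        nodes[primary_node].append(f"P{shard}")
        if num_nodes > 1:
            # Put the replica on the next node so it differs from the primary's.
            nodes[(primary_node + 1) % num_nodes].append(f"R{shard}")
        else:
            # No second node available, so the replica stays unassigned.
            unassigned.append(f"R{shard}")
    return nodes, unassigned


for node_count in (1, 2):
    nodes, unassigned = allocate(5, node_count)
    print(node_count, "node(s):", nodes, "unassigned:", unassigned)
```

With one node all five replicas stay unassigned, which corresponds to a yellow status; with two nodes every replica finds a home away from its primary.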

I hope the points above help.

Kindly let me know if you have any other questions.

Thanks.

system · December 21, 2018, 4:19am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
