We are losing data in an Elasticsearch cluster (v2.4.0)


(Mustafa Halil Yildiz) #1

We built a proof of concept (PoC) with Elasticsearch, but while doing so we lost data in a clustered environment. We used ES 2.4.0.

Can anyone tell us what we are missing?

Our scenario is:

1- Start Elasticsearch on Server-1 and Server-2 with the configurations below; the two nodes form a cluster.
Our configurations:
*** Server-1 ****
cluster.name: ESCluster
node.master: true
node.name: "es1"
node.data: true
network.bind_host: ["127.0.0.1","20.20.20.5"]
network.publish_host: "20.20.20.5"
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["20.20.20.5","20.20.20.6"]
discovery.zen.minimum_master_nodes: 1

*** Server-2 ****
cluster.name: ESCluster
node.master: true
node.name: "es2"
node.data: true
network.bind_host: ["127.0.0.1","20.20.20.6"]
network.publish_host: "20.20.20.6"
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["20.20.20.5","20.20.20.6"]
discovery.zen.minimum_master_nodes: 1

2- Index a document via Server-1:

curl -XPUT '20.20.20.5:9200/ert/post/1' -d ' { "user": "easlan", "postDate": "01-16-2015", "body": "Adding Data in ElasticSearch Cluster" , "title": "ElasticSearch Cluster Test - 1" }'

3- Search for the indexed docs via Server-1 or Server-2. The total number of results is 1 (as expected):

curl -XGET '20.20.20.5:9200/ert/post/_search?q=user:easlan&pretty=true'
curl -XGET '20.20.20.6:9200/ert/post/_search?q=user:easlan&pretty=true'

4- Then shut down Server-1

5- Index a new document via Server-2:

curl -XPUT '20.20.20.6:9200/ert/post/2' -d ' { "user": "easlan", "postDate": "01-16-2015", "body": "Adding Data in ElasticSearch Cluster" , "title": "ElasticSearch Cluster Test - 2" }'

6- Search for the indexed docs via Server-2. The total number of results is 2:

curl -XGET '20.20.20.6:9200/ert/post/_search?q=user:easlan&pretty=true'

7- Shut down Server-2

8- Start Server-1

9- Search for the indexed docs via Server-1. The total number of results is 1 (as expected, because Server-2 is shut down):

curl -XGET '20.20.20.5:9200/ert/post/_search?q=user:easlan&pretty=true'

10- Then start Server-2 again and search for the indexed docs via Server-1 or Server-2. We expect the total number of results to be 2, but we get 1. Even if we restart both nodes again, the result is still 1:

curl -XGET '20.20.20.5:9200/ert/post/_search?q=user:easlan&pretty=true'
curl -XGET '20.20.20.6:9200/ert/post/_search?q=user:easlan&pretty=true'

The result of my_server_ip:9200/_nodes/stats is here


(Christian Dahlqvist) #2

As you have 2 master-eligible nodes, you must set minimum_master_nodes to 2, not 1 as in your config. minimum_master_nodes should always be set to floor(N/2) + 1, where N is the number of master-eligible nodes. With this setting, no master is elected, and so no writes are accepted, unless both nodes are available, which avoids data loss. If you want a cluster that can continue taking writes even with one node down, you need at least 3 master-eligible nodes. You can read more about this here.
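
As a rough illustration of the quorum rule (a majority of the master-eligible nodes, that is N/2 rounded down, plus 1), here is a small sketch. The function name quorum is made up for this example and is not part of Elasticsearch:

```shell
# Majority of N master-eligible nodes: floor(N / 2) + 1.
# Shell integer arithmetic truncates, so $1 / 2 is already rounded down.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Print the value to put in elasticsearch.yml for a few cluster sizes.
for n in 1 2 3 4 5; do
  echo "master-eligible nodes: $n -> discovery.zen.minimum_master_nodes: $(quorum "$n")"
done
```

For 2 nodes this gives 2, so neither node can act as master alone. For 3 nodes it also gives 2, which is why a 3-node cluster can keep taking writes with one node down.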


(Mark Walkom) #3

Consistency is not magic.

As @Christian_Dahlqvist mentions, your quorum setting is wrong. At step 10, which node is the master?
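
One way to check this is the _cat/master API, asked on each node in turn. The sketch below assumes the default HTTP port 9200 used elsewhere in this thread; the helper name current_master and the CURL override are made up for this example:

```shell
# Hypothetical helper: ask a node which node it currently sees as master.
# CURL can be overridden (for example CURL=echo) to print the arguments
# instead of sending the request.
CURL="${CURL:-curl}"

current_master() {
  "$CURL" -s -XGET "$1:9200/_cat/master?v"
}

# Usage (requires a running cluster):
#   current_master 20.20.20.5
#   current_master 20.20.20.6
```

If the two nodes report different masters, they have each formed their own cluster, which is the split that minimum_master_nodes is meant to prevent.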


(Mustafa Halil Yildiz) #4

Hi @Christian_Dahlqvist, thanks for your reply.
We realized that our understanding was a little off.
This solves our problem.


(Mustafa Halil Yildiz) #5

Hi @warkolm, Server-1 is the master at step 10.

