Is my data really replicated across the cluster? How to know?

We used to have just a single node in our cluster. In the past 2 weeks we added 2 more nodes, and now we want to switch off the original node because the machine is old. But I am not confident that all our data has actually been spread out to the other machines in the cluster.
I can't find a post that explains how to confirm this.

I believe that doing a _cat/shards will tell me. Here is the output from that:

$ curl -XGET 'localhost:9200/_cat/shards/our_index_1,our_index_2,our_index_3?pretty'

our_index_1 1 r STARTED 28380731 17.3gb machine-02-ip elasticsearch-02
our_index_1 1 p STARTED 28380731 17.3gb machine-03-ip elasticsearch-03
our_index_1 3 r STARTED 28378851   17gb machine-03-ip elasticsearch-03
our_index_1 3 p STARTED 28378848   17gb machine-01-ip elasticsearch-01
our_index_1 2 r STARTED 28385815 17.5gb machine-03-ip elasticsearch-03
our_index_1 2 p STARTED 28385815 17.4gb machine-01-ip elasticsearch-01
our_index_1 4 r STARTED 28370118   17gb machine-02-ip elasticsearch-02
our_index_1 4 p STARTED 28370114 16.9gb machine-03-ip elasticsearch-03
our_index_1 0 r STARTED 28378628 16.9gb machine-02-ip elasticsearch-02
our_index_1 0 p STARTED 28378628 16.9gb machine-01-ip elasticsearch-01

our_index_2 0 p STARTED  2339117  1.4gb machine-03-ip elasticsearch-03
our_index_2 0 r STARTED  2341647  1.5gb machine-01-ip elasticsearch-01

our_index_3 1 r STARTED     1928    8mb machine-03-ip elasticsearch-03
our_index_3 1 p STARTED     1928  8.5mb machine-01-ip elasticsearch-01
our_index_3 3 r STARTED     1965  6.7mb machine-03-ip elasticsearch-03
our_index_3 3 p STARTED     1965  7.5mb machine-01-ip elasticsearch-01
our_index_3 2 r STARTED     2011    7mb machine-02-ip elasticsearch-02
our_index_3 2 p STARTED     2011  8.5mb machine-03-ip elasticsearch-03
our_index_3 4 p STARTED     2049  6.5mb machine-02-ip elasticsearch-02
our_index_3 4 r STARTED     2049  6.6mb machine-03-ip elasticsearch-03
our_index_3 0 r STARTED     1956  7.7mb machine-02-ip elasticsearch-02
our_index_3 0 p STARTED     1956  9.2mb machine-01-ip elasticsearch-01

It looks like every shard is on 2 of the machines, but no shard is on all 3. So in theory, if I turn off machine 1, some of the data will exist on only 1 machine (either machine 2 or machine 3).
Is this correct? I am not sure it is what we want. Is this a configuration issue?

There are plenty of shards on the third node; here are just some of them:

our_index_1 1 p STARTED 28380731 17.3gb machine-03-ip elasticsearch-03
our_index_1 3 r STARTED 28378851   17gb machine-03-ip elasticsearch-03
our_index_1 2 r STARTED 28385815 17.5gb machine-03-ip elasticsearch-03

If your cluster is green, and every index has a replica configured, then you are fine.
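You can check that with the cluster health API; once every primary and replica shard is allocated it will report "status" : "green":

$ curl -XGET 'localhost:9200/_cluster/health?pretty'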

Thanks. I guess I was expecting all shards to have 3 entries/copies - 1 on each machine. But it seems that it is done with just 2 copies.

The primary, plus N replicas. In your case N = 1, which is the default.
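You can confirm the replica count per index with the get settings API, e.g. for the first index from above:

$ curl -XGET 'localhost:9200/our_index_1/_settings/index.number_of_replicas?pretty'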

If you want every node to have a copy then increase the replica count to 2. Then there will be 3 copies, the primary plus 2 replicas.
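That can be done on the live indices with the update settings API; using the index names from above, something like:

$ curl -XPUT 'localhost:9200/our_index_1,our_index_2,our_index_3/_settings?pretty' -H 'Content-Type: application/json' -d'
{
  "index" : {
    "number_of_replicas" : 2
  }
}'

The new replica shards will then show up in _cat/shards as they get allocated.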

Ah! Thanks!

To follow up on this: I thought that to 'turn off' one of the nodes (machine-01, which happened to be the master node) I simply needed to change cluster.initial_master_nodes and discovery.seed_hosts on machine-02 to point at itself instead of at machine-01.
machine-03 already has machine-02 as the initial master and seed host.

But when I do that, machine-02 is no longer part of the cluster:

$ curl -XGET 'localhost:9200/?pretty'
{
  "name" : "elasticsearch-02",
  "cluster_name" : "zm-amz-data",
  "cluster_uuid" : "_na_",
  "version" : {
    "number" : "7.7.1",
    "build_flavor" : "oss",
    "build_type" : "deb",
    "build_hash" : "ad56dce891c901a492bb1ee393f12dfff473a423",
    "build_date" : "2020-05-28T16:30:01.040088Z",
    "build_snapshot" : false,
    "lucene_version" : "8.5.1",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

machine-02 elasticsearch.yml:

discovery.seed_hosts: ["machine-02-ip"]
cluster.initial_master_nodes: ["elasticsearch-02"]

machine-03 elasticsearch.yml:

discovery.seed_hosts: ["machine-02-ip"]
cluster.initial_master_nodes: ["elasticsearch-02"]

It is no longer part of the cluster because it doesn't find the other nodes, I guess. Because

discovery.seed_hosts: ["machine-02-ip"]

means it only knows its own IP and doesn't know where all the other nodes are.

You should feed all IPs into discovery.seed_hosts:

discovery.seed_hosts: ["machine-01-ip","machine-02-ip","machine-03-ip"]

For a small cluster it's fine to set node.master: true on all nodes and register them in cluster.initial_master_nodes. That way they all become master-eligible, which helps when turning nodes off and on. This is the setup that works best for me.

You should not be changing cluster.initial_master_nodes after the cluster has formed. Simply remove it entirely from the config file. Quoting the docs:

You should not use this setting when restarting a cluster or adding a new node to an existing cluster.
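Putting both suggestions together, machine-02's elasticsearch.yml would then look roughly like this (machine-03's the same; the machine-*-ip values are the placeholders used above):

discovery.seed_hosts: ["machine-01-ip","machine-02-ip","machine-03-ip"]
# cluster.initial_master_nodes deliberately omitted - it is only for bootstrapping a brand-new cluster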
