Question about data nodes

Good afternoon. I am new to this topic and would like to clarify a couple of points:

  1. Do I understand correctly that if I have 2 data nodes in the cluster and every index has 1 primary shard and 1 replica shard, then no data will be lost if either data node fails? That is, the data is effectively "mirrored":
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 182,
  "active_shards" : 364,
  2. Is it possible to distribute data across the data nodes without duplicating it? That is, cluster speed matters more to me than data safety. With 3 data nodes, for example, instead of 1.5T of data on each node I would have 500G on each, and if any node failed, one third of the data would simply not be shown in Kibana, but the cluster would otherwise keep working.

Hello @freeman999, it would be much easier for us to understand the issue if it were written in English.
Nevertheless, for your questions:

  1. Yes, your understanding is correct. If a node fails, the replica shard on the surviving node is promoted to primary.
  2. If you don't want replica shards, you can update your index template or log shipper so that the number of primary shards equals the number of data nodes and number_of_replicas is 0.
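
The shard counts quoted in the question are consistent with one replica per primary, and the per-node disk usage follows from the same arithmetic. Here is a minimal sketch, assuming shards spread evenly across nodes and that the 1.5T figure is the size of the primary data; the function names are illustrative, not Elasticsearch APIs:

```python
def total_shards(primaries: int, replicas: int) -> int:
    """Total shard copies: each primary plus its replicas."""
    return primaries * (1 + replicas)


def data_per_node_gb(primary_data_gb: float, replicas: int, data_nodes: int) -> float:
    """Approximate disk use per node, assuming an even spread of shards."""
    return primary_data_gb * (1 + replicas) / data_nodes


# Figures from the cluster health output in the question:
# 182 primaries with 1 replica each.
print(total_shards(182, 1))

# Assumed scenario: 1500 GB of primary data on 3 data nodes.
print(data_per_node_gb(1500, 2, 3))  # every node holds a full copy
print(data_per_node_gb(1500, 0, 3))  # no replicas: one third per node
```

With zero replicas, the shards on a failed node have no other copy, so that part of the data is unavailable until the node comes back.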

@Ayush_Mathur Thank you so much for your answers!
Could you tell me where I can read about how to update my index template or log shipper, and how to define the settings you mentioned?
Thank you again.

@freeman999 you can read about static and dynamic index settings here: Index modules | Elasticsearch Guide [8.6] | Elastic

For log shipper, let's say filebeat, the index settings can be found here: Configure Elasticsearch index template loading | Filebeat Reference [8.6] | Elastic


Hello, @Ayush_Mathur
Thank you for your help!

Speaking of log shipper, we're using fluent-bit.

curl http://192.168.101.111:9200/logstash-2023.01.25/_settings?pretty
{
  "logstash-2023.01.25" : {
    "settings" : {
      "index" : {
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        },
        "number_of_shards" : "1",
        "provided_name" : "logstash-2023.01.25",
        "creation_date" : "1674645824236",
        "number_of_replicas" : "1",
        "uuid" : "GcluE-DyRBGoMdXB-D7u9w",
        "version" : {
          "created" : "7100199"
        }
      }
    }
  }
}

It seems fluent-bit does not have an index template in which we could change number_of_shards and number_of_replicas. Should I create my own Elasticsearch index template?

Yes. With fluentd and fluent-bit, unfortunately, you cannot define the template and index settings in the shipper itself.
In this case, create an index template in Kibana (it is stored and applied by Elasticsearch), where you can specify the index settings.
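
The same template can also be created through the Elasticsearch REST API instead of the Kibana UI. The sketch below only builds the PUT request for the `_index_template` endpoint. It assumes the host from the curl command earlier in the thread, a cluster without authentication, and a hypothetical template name `logstash_no_replicas`:

```python
import json
import urllib.request

ES_URL = "http://192.168.101.111:9200"  # host taken from the thread
TEMPLATE_NAME = "logstash_no_replicas"  # hypothetical name, pick your own


def build_template_request(pattern: str, shards: int, replicas: int) -> urllib.request.Request:
    """Build (but do not send) a PUT request that creates an index template."""
    body = {
        "index_patterns": [pattern],
        "template": {
            "settings": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        },
    }
    return urllib.request.Request(
        url=f"{ES_URL}/_index_template/{TEMPLATE_NAME}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# One primary shard per data node (2 nodes here) and no replicas.
request = build_template_request("logstash*", shards=2, replicas=0)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it to a live cluster: urllib.request.urlopen(request)
```

A template only affects indices created after it is stored; existing indices keep their current settings.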


@Ayush_Mathur

I've sent a PUT request to my ES:

192.168.101.111:9200/_index_template/test_1
{
  "index_patterns" : ["logstash*"],
  "priority" : 0,
  "template": {
    "settings" : {
      "number_of_shards" : 2,
      "number_of_replicas": 0
    },
    "mappings" : {
      "_source" : { "enabled" : false }
    }
  }
}

I've restarted my fluent-bit agent, and new indices look like this:

{
    "logstash-2023.01.25": {
        "settings": {
            "index": {
                "routing": {
                    "allocation": {
                        "include": {
                            "_tier_preference": "data_content"
                        }
                    }
                },
                "number_of_shards": "2",
                "provided_name": "logstash-2023.01.25",
                "creation_date": "1674652291012",
                "number_of_replicas": "0",
                "uuid": "CRFSHrmHRh6SOXcLgN3bnQ",
                "version": {
                    "created": "7100199"
                }
            }
        }
    }
}

Looks great!

Thank you so much once again.

P.S. To be honest, I didn't understand how the new index knew that it had to use the new index template test_1.


Good to know it resolved your problem.

In an index template you define an index pattern, logstash* in your case. This ensures that any new index whose name matches the logstash* pattern is created with the settings from the test_1 index template.
To test this, you can create another index, say test_index; it will not pick up the settings and mappings specified in your test_1 template.
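
The matching described above can be modelled in a few lines. This is a simplified sketch of the idea, not Elasticsearch's actual implementation: it treats `*` as the only wildcard and, when several templates match, picks the one with the highest priority. The helper names `matches` and `pick_template` are made up for illustration:

```python
import re
from typing import Optional


def matches(pattern: str, index_name: str) -> bool:
    """Return True if index_name fits a pattern where '*' means 'anything'."""
    regex = re.escape(pattern).replace(r"\*", ".*")
    return re.fullmatch(regex, index_name) is not None


def pick_template(templates: dict, index_name: str) -> Optional[str]:
    """Name of the matching template with the highest priority, or None."""
    candidates = [
        (template["priority"], name)
        for name, template in templates.items()
        if any(matches(p, index_name) for p in template["index_patterns"])
    ]
    return max(candidates)[1] if candidates else None


# The template from this thread, reduced to the fields used for matching.
templates = {"test_1": {"index_patterns": ["logstash*"], "priority": 0}}

print(pick_template(templates, "logstash-2023.01.25"))  # matched by logstash*
print(pick_template(templates, "test_index"))           # no template applies
```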
