Good day. I'm new to this topic and would like to clarify a couple of points:
- Do I understand correctly that if I have 2 data nodes in the cluster and every index has 1 primary shard and 1 replica shard, then no data will be lost if either data node fails? In other words, the data is effectively "mirrored":
"number_of_data_nodes" : 2,
"active_primary_shards" : 182,
"active_shards" : 364,
- Is it possible to distribute data across the data nodes without duplicating it? That is, cluster speed matters more to us than data safety. With 3 data nodes, for example, instead of 1.5T of data on each data node we would have 500G on each, and if any node failed, 1/3 of the data would simply not be shown in Kibana, but the cluster itself would keep working.
Hello @freeman999, it would be much easier for us to understand the issue if it were written in English.
Nevertheless, for your questions:
- Yes, your understanding is correct. If a node fails, the replica shard on the surviving node is promoted to primary.
- If you don't want replica shards, you can update your index template or log shipper and set number_of_shards to the number of data nodes and number_of_replicas to 0.
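To make the two answers above concrete, here is a minimal Python sketch under simplifying assumptions. Shard copies are placed round-robin, which is only a stand-in for the real Elasticsearch allocator (though Elasticsearch likewise never puts a primary and its replica on the same node). The 1500 GB total is inferred from the figures in the question, and all function names are hypothetical.

```python
from fractions import Fraction


def allocate(num_nodes, num_primaries, num_replicas):
    """Place every copy of each shard on a different node (round-robin).

    Simplified model, not the real Elasticsearch allocation algorithm.
    """
    if num_replicas + 1 > num_nodes:
        raise ValueError("not enough nodes to hold every copy on its own node")
    placement = {}
    for shard in range(num_primaries):
        # Copy 0 is the primary, copies 1..num_replicas are replicas.
        placement[shard] = {
            (shard + copy) % num_nodes for copy in range(num_replicas + 1)
        }
    return placement


def lost_fraction(placement, failed_node):
    """Fraction of shards whose only copy lived on the failed node."""
    lost = sum(1 for nodes in placement.values() if nodes == {failed_node})
    return Fraction(lost, len(placement))


def per_node_storage_gb(total_primary_gb, num_nodes, num_replicas):
    """Average disk usage per node, assuming evenly sized shards."""
    return total_primary_gb * (num_replicas + 1) / num_nodes


# 2 data nodes, 1 primary + 1 replica: losing either node loses no data.
mirrored = allocate(num_nodes=2, num_primaries=1, num_replicas=1)
print(lost_fraction(mirrored, failed_node=0))

# 3 data nodes, 3 primaries, no replicas: losing a node loses 1/3 of the shards.
spread = allocate(num_nodes=3, num_primaries=3, num_replicas=0)
print(lost_fraction(spread, failed_node=1))

# Disk usage per node with a full copy everywhere vs. no replicas at all.
print(per_node_storage_gb(1500, num_nodes=3, num_replicas=2))
print(per_node_storage_gb(1500, num_nodes=3, num_replicas=0))
```

The trade-off is visible in the numbers: dropping replicas cuts disk usage per node to a third, but any single node failure then makes part of the data unavailable.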
@Ayush_Mathur Thank you so much for your answers!
Could you tell me where I can read about how to update my index template or log shipper, and how to set the values you mentioned?
Thank you again)
@freeman999 you can read about static and dynamic index settings here: Index modules | Elasticsearch Guide [8.6] | Elastic
For a log shipper such as Filebeat, the index settings are described here: Configure Elasticsearch index template loading | Filebeat Reference [8.6] | Elastic
Hello, @Ayush_Mathur
Thank you for your help!
Speaking of log shipper, we're using fluent-bit.
curl http://192.168.101.111:9200/logstash-2023.01.25/_settings?pretty
{
"logstash-2023.01.25" : {
"settings" : {
"index" : {
"routing" : {
"allocation" : {
"include" : {
"_tier_preference" : "data_content"
}
}
},
"number_of_shards" : "1",
"provided_name" : "logstash-2023.01.25",
"creation_date" : "1674645824236",
"number_of_replicas" : "1",
"uuid" : "GcluE-DyRBGoMdXB-D7u9w",
"version" : {
"created" : "7100199"
}
}
}
}
}
It seems fluent-bit does not have an index template where we can change number_of_shards and number_of_replicas. Should I create my own Elasticsearch index template?
Yes. Unfortunately, with fluentd and fluent-bit you cannot define the template and index settings in the shipper itself.
In this case, create an index template in Kibana (it is stored and applied by Elasticsearch), where you can specify the index settings.
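As a rough illustration of the advice above, the following Python sketch builds (but does not send) a PUT request to the `_index_template` endpoint. The host is the one shown earlier in this thread; the template name `logs_no_replicas`, the helper name, and the choice of one primary shard per data node are assumptions made for the example.

```python
import json
import urllib.request


def build_template_request(host, template_name, data_nodes):
    """Build a PUT request for a composable index template.

    Hypothetical helper: the request is only constructed here, not sent.
    """
    body = {
        "index_patterns": ["logstash*"],
        "priority": 0,
        "template": {
            "settings": {
                # One primary shard per data node, no replica copies.
                "number_of_shards": data_nodes,
                "number_of_replicas": 0,
            }
        },
    }
    return urllib.request.Request(
        url=f"http://{host}/_index_template/{template_name}",
        data=json.dumps(body).encode("utf-8"),
        # Elasticsearch rejects request bodies without a content type.
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_template_request("192.168.101.111:9200", "logs_no_replicas", 2)
print(request.get_method(), request.full_url)
# To actually apply it: urllib.request.urlopen(request)
```

The template only affects indices created after it is stored; existing indices keep their current settings.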
@Ayush_Mathur
I've sent a PUT request to my ES:
192.168.101.111:9200/_index_template/test_1
{
"index_patterns" : ["logstash*"],
"priority" : 0,
"template": {
"settings" : {
"number_of_shards" : 2,
"number_of_replicas": 0
},
"mappings" : {
"_source" : { "enabled" : false }
}
}
}
I've restarted my fluent-bit agent, and the new indices look like this:
{
"logstash-2023.01.25": {
"settings": {
"index": {
"routing": {
"allocation": {
"include": {
"_tier_preference": "data_content"
}
}
},
"number_of_shards": "2",
"provided_name": "logstash-2023.01.25",
"creation_date": "1674652291012",
"number_of_replicas": "0",
"uuid": "CRFSHrmHRh6SOXcLgN3bnQ",
"version": {
"created": "7100199"
}
}
}
}
}
Looks great!
Thank you so much once again.
P.S. To tell the truth, I didn't understand how the new index knew that it had to use the new index template test_1.
Good to know it resolved your problem.
In an index template you define an index pattern, logstash* in your case. This setting ensures that any index whose name matches the logstash* pattern is created with the settings from the test_1 index template.
To test this, you can create another index, say test_index: it won't get the settings and mappings specified in your test_1 template, because its name doesn't match the pattern.
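The matching described above can be sketched in a few lines of Python. This is only an approximation of what Elasticsearch does: `fnmatchcase` also understands `?` and `[...]`, which index patterns do not, and the `select_template` helper is hypothetical. What it does reflect is that a composable template is picked by index name, and that among several matching templates the one with the highest priority wins.

```python
from fnmatch import fnmatchcase


def select_template(index_name, templates):
    """Return the name of the template that would apply, or None."""
    matching = [
        t for t in templates
        if any(fnmatchcase(index_name, p) for p in t["index_patterns"])
    ]
    if not matching:
        return None
    # Among matching templates, the highest priority wins.
    return max(matching, key=lambda t: t["priority"])["name"]


templates = [
    {"name": "test_1", "index_patterns": ["logstash*"], "priority": 0},
]

print(select_template("logstash-2023.01.25", templates))
print(select_template("test_index", templates))
```

The first call returns the `test_1` template because the index name starts with `logstash`; the second returns `None`, so `test_index` would be created with default settings.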