Is it better to have a single index per environment, or for one environment to have 5 indices, one per team? Each team has multiple logs. Also, I have 3 Elasticsearch clusters. Will the data be duplicated across each cluster, so that all 3 clusters hold the same data? That way, if one of them goes down, there is still no data loss. I read that an index is split into shards, but I really don't understand how that works.
The logs for each team are not all in the same format. Around 10 groups of logs share one format, and another 5 groups share another format.
I have a question, since an index is split into shards and the shards are spread equally across the clusters. If 1 cluster goes down, that will impact the data, right? I mean data loss. For example, 1 index is split into 3 shards, so each shard holds its own partition of the data, and 1 shard goes to each cluster. If the physical storage of 1 cluster is corrupted, 2 shards are left. Does that mean the data in the other shard is lost, since the data in the index is split across 3 shards?
The indices live in a cluster, and those indices are made of shards. Those shards are distributed across nodes. You don't split indices over clusters, you split them over nodes in a cluster.
So if one node in the cluster goes down, you don't lose data as long as copies of the data (replica shards) exist on other nodes in the cluster, which is the default behaviour of Elasticsearch.
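To make that concrete, here is a small toy model in Python. It is only a sketch and not how Elasticsearch actually allocates shards: the round-robin placement, the function names `allocate` and `surviving_shards`, and the node names are all made up for illustration. The one rule it borrows from Elasticsearch is that a replica is never placed on the same node as its primary.

```python
def allocate(num_shards, num_replicas, nodes):
    """Toy round-robin placement (an assumption, not the real allocator).

    Copies of the same shard always land on different nodes.
    """
    if num_replicas + 1 > len(nodes):
        raise ValueError("need at least num_replicas + 1 nodes")
    placement = {node: [] for node in nodes}
    for shard in range(num_shards):
        for copy in range(num_replicas + 1):
            # Offsetting by `copy` keeps copies of one shard on distinct nodes.
            node = nodes[(shard + copy) % len(nodes)]
            role = "primary" if copy == 0 else "replica"
            placement[node].append((shard, role))
    return placement


def surviving_shards(placement, failed_node):
    """Return the shard numbers that still have a copy after one node fails."""
    return {
        shard
        for node, copies in placement.items()
        if node != failed_node
        for shard, _role in copies
    }


nodes = ["node-1", "node-2", "node-3"]  # hypothetical node names
with_replica = allocate(num_shards=3, num_replicas=1, nodes=nodes)
without_replica = allocate(num_shards=3, num_replicas=0, nodes=nodes)

for failed in nodes:
    print(failed, "down ->", sorted(surviving_shards(with_replica, failed)))
print("no replicas, node-1 down ->",
      sorted(surviving_shards(without_replica, "node-1")))
```

With one replica per shard, every shard survives the loss of any single node in this model. With zero replicas, losing a node loses the shard that lived on it, which is the situation described in the question.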
So each container has 1 index, and each index in each container will have a few shards and replicas. Is that correct?
So can I give each team 1 index, for a total of 5 indices? Each index would hold a collection of logs from different files, some with the same pattern and some with different patterns. Is this the correct approach? Earlier I planned to group logs by pattern, but that would create more indices. More indices means more shards, which means more space.
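For the shard-count part of that question, the arithmetic can be sketched as below. The helper `total_shards` is hypothetical, and the figure of 15 indices is only an assumed example of a one-index-per-log-pattern layout. The defaults of 1 primary and 1 replica per index match Elasticsearch 7.0 and later; older versions defaulted to 5 primaries.

```python
def total_shards(num_indices, primaries_per_index=1, replicas_per_primary=1):
    """Total shard copies (primaries plus replicas) across all indices."""
    return num_indices * primaries_per_index * (1 + replicas_per_primary)


# One index per team (5 teams) versus an assumed 15 per-pattern indices.
per_team = total_shards(num_indices=5)
per_pattern = total_shards(num_indices=15)

print("one index per team:   ", per_team, "shards")
print("one index per pattern:", per_pattern, "shards")
```

Note that the volume of log data is the same in both layouts. What grows with the number of indices is the number of shards the cluster has to manage.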