Elasticsearch index and clustering

Hi friends,

Can I know whether it is better to have a single index per environment, or 5 indices per environment, one per team? Each team has multiple logs. Besides that, I have 3 Elasticsearch clusters. Will the data be duplicated across each cluster, so that I have the same data on all 3 clusters? That way, in case one of them is down, there is still no loss of data. I read about shards, where an index is split into shards, but I really don't understand it.

Are they the same format?

OK, so an index is made up of 1 or more shards. The larger the index, the more shards you usually have.

That means that if you have 1 index with 3 shards on a cluster of 3 nodes, you will usually see each node holding 1 shard of that index.

On top of that you have replicas, and each index has 1 set of replicas by default. That means you have 1 copy of each of the primary shards.
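For illustration, here is a rough sketch of what the settings for such an index could look like when built in Python. The index name `team-logs` is made up for this example; `number_of_shards` and `number_of_replicas` are the index settings that control primaries and replicas. Nothing is sent to a cluster here, the body is only printed.

```python
import json

# Hypothetical index name, used only for this illustration.
INDEX_NAME = "team-logs"

# Body for an index-creation request (PUT /team-logs).
create_index_body = {
    "settings": {
        "index": {
            "number_of_shards": 3,    # primary shards the index is split into
            "number_of_replicas": 1,  # copies kept of each primary shard
        }
    }
}

print(f"PUT /{INDEX_NAME}")
print(json.dumps(create_index_body, indent=2))
```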

And that means you end up with 6 shards - 3 primary and 3 replicas - and they will be balanced out on your cluster.
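The arithmetic above can be sketched with a toy model. This is not how Elasticsearch actually decides where shards go (its allocator weighs many more factors); it is a simple round-robin stand-in that only keeps one rule, namely that a replica never sits on the same node as its primary. The node names and the `allocate` function are assumptions made for the example.

```python
from collections import defaultdict


def allocate(nodes, primaries, replicas):
    """Toy placement: spread shard copies round-robin across nodes.

    Copies of the same shard always land on different nodes.
    """
    if replicas + 1 > len(nodes):
        raise ValueError("not enough nodes to keep every copy on its own node")
    placement = defaultdict(list)
    for shard in range(primaries):
        for copy in range(replicas + 1):
            node = nodes[(shard + copy) % len(nodes)]
            role = "primary" if copy == 0 else "replica"
            placement[node].append((shard, role))
    return dict(placement)


# 1 index, 3 primary shards, 1 replica of each, on a 3-node cluster.
placement = allocate(["node-1", "node-2", "node-3"], primaries=3, replicas=1)
total_shards = sum(len(shards) for shards in placement.values())

for node, shards in sorted(placement.items()):
    print(node, shards)
print("total shards:", total_shards)
```

In this toy layout each node ends up holding 2 shards, for 6 in total.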

The logs are not all in the same format for each team. Around 10 groups of logs share one format, and 5 groups of logs share another.

I have a question. Since an index is split into shards and the shards are split equally across clusters, if 1 cluster is down it will impact the data, right? I mean loss of data. For example, 1 index is split into 3 shards, so each shard has its own partition of the data, and 1 shard goes to each cluster. If the physical storage of 1 cluster is corrupted, 2 shards are left. With only 2 shards, is the data in the other shard lost, since the data in the index is split across 3 shards?

The indices live in a cluster, and those indices are made of shards. Those shards are distributed across nodes. You don't split indices over clusters, you split them over nodes in a cluster.

So if one node in the cluster goes down, as long as you have copies of the data on other nodes in the cluster - which is the default behaviour of Elasticsearch - then you don't lose data.
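To make that concrete, here is a small sketch that checks which shards still have a copy somewhere after nodes fail. The placement below is an assumed example layout (3 shards, each stored on 2 of the 3 nodes), and `shards_available` is a helper written for this illustration, not an Elasticsearch API.

```python
# Assumed layout: shard IDs held by each node (primary or replica copy).
placement = {
    "node-1": {0, 2},
    "node-2": {0, 1},
    "node-3": {1, 2},
}
ALL_SHARDS = {0, 1, 2}


def shards_available(placement, failed_nodes):
    """Return the shard IDs that still have a copy on a surviving node."""
    available = set()
    for node, shards in placement.items():
        if node not in failed_nodes:
            available |= shards
    return available


one_down = shards_available(placement, {"node-1"})
two_down = shards_available(placement, {"node-1", "node-2"})

print("one node down, missing shards:", ALL_SHARDS - one_down)
print("two nodes down, missing shards:", ALL_SHARDS - two_down)
```

With one node gone, every shard still has a copy on another node. In this layout, losing two nodes at the same time leaves shard 0 with no copy, which is why one replica protects against a single node failure only.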

What is the difference? I have 1 cluster with 3 containers. So it is 3 nodes, right?

Correct.

So I can say I have 1 cluster with 3 containers (nodes).

Each container has 1 index. Each index in each container will have a few shards and replicas. Correct, friend?

So can I give each team 1 index, for a total of 5 indices? Each index would hold a collection of logs from different files, some with the same pattern and some with different patterns. Is this the correct approach? Earlier I planned to group logs by pattern, but that would create more indices. More indices means more shards, which means more space.

Is my approach correct?

Each node (container) will hold a shard that is part of an index.

You are better off putting data that is the same, or similar, format into one index.

Why do you say this, since an index can take in different patterns?

Keeping data sources that are dramatically different in the same index is inefficient and becomes hard to manage.
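As a rough sketch of grouping by format rather than by team, the snippet below maps each log source to an index named after its format. The source names, the format labels, and the `logs-<env>-<format>` naming scheme are all invented for this example; Elasticsearch does not require any of them.

```python
from collections import defaultdict

# Hypothetical log sources mapped to their format (all names are made up).
LOG_SOURCES = {
    "team-a/web-access": "access",
    "team-b/web-access": "access",
    "team-c/api-access": "access",
    "team-a/app-events": "app-json",
    "team-d/app-events": "app-json",
}


def index_for(env, log_format):
    """Build an index name from environment and log format (assumed scheme)."""
    return f"logs-{env}-{log_format}"


def plan_indices(env, sources):
    """Group log sources so that each format gets exactly one index."""
    plan = defaultdict(list)
    for source, log_format in sorted(sources.items()):
        plan[index_for(env, log_format)].append(source)
    return dict(plan)


plan = plan_indices("prod", LOG_SOURCES)
for index_name, sources in plan.items():
    print(index_name, "<-", sources)
```

Here four teams end up sharing two indices, because only the format decides where a log goes.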
