Elasticsearch index and clustering

selin · August 20, 2021, 3:34am

Hi friends,

Is it better to have a single index per environment, or five indices per environment, one per team? Each team produces multiple logs. Besides that, I have 3 Elasticsearch clusters. Will the data be duplicated across each cluster, so that all 3 clusters hold the same data? That way, if one of them goes down, there is still no loss of data. I read that an index is split into shards, but I really don't understand how that works.

warkolm · August 20, 2021, 4:35am

Are they the same format?

Ok so an index can be 1 or more shards. The larger the index the more shards you usually have.

That means that if you have 1 index with 3 shards, on a cluster of 3 nodes, you will usually see each node holding 1 shard of that index.

On top of that you have replicas, and each index has 1 set of replicas by default. That means you have 1 copy of each of the primary shards.

And that means you end up with 6 shards - 3 primary and 3 replicas - and they will be balanced out on your cluster.
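To make the arithmetic above concrete, here is a minimal sketch in Python. It is a toy model, not Elasticsearch's real allocator: `shard_counts` and `place_shards` are hypothetical helper names, the node names are made up, and the round-robin placement is an assumption chosen only so that a replica never lands on the same node as its primary.

```python
def shard_counts(primaries: int, replicas: int) -> dict:
    """Count shard copies for one index: each primary has `replicas` copies."""
    return {
        "primary": primaries,
        "replica": primaries * replicas,
        "total": primaries * (1 + replicas),
    }


def place_shards(primaries: int, replicas: int, nodes: list) -> dict:
    """Toy round-robin placement (an assumption, not the real allocator).

    Copies of the same shard go to consecutive nodes, so they stay on
    different nodes as long as 1 + replicas <= len(nodes).
    """
    layout = {node: [] for node in nodes}
    for shard in range(primaries):
        for copy in range(1 + replicas):
            node = nodes[(shard + copy) % len(nodes)]
            kind = "P" if copy == 0 else "R"  # P = primary, R = replica
            layout[node].append(f"{kind}{shard}")
    return layout


counts = shard_counts(primaries=3, replicas=1)
layout = place_shards(3, 1, ["node-1", "node-2", "node-3"])
```

With 3 primaries and 1 replica set, `counts["total"]` is 6, and in this toy layout each of the 3 nodes ends up holding 2 shards: one primary and a replica of a different shard.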

selin · August 20, 2021, 4:54am

The logs are not the same format for every team. Around 10 groups of logs share one format, and 5 groups share another.

I have a question, since an index is split into shards and the shards are spread equally across the clusters. If 1 cluster is down, it will impact the data, right? I mean loss of data. For example, 1 index is split into 3 shards, so each shard holds its own partition of the data, and 1 shard goes to each cluster. If the physical storage of 1 cluster is corrupted, 2 shards are left. Does that mean the data in the other shard is lost, since the data in the index is split across 3 shards?

warkolm · August 20, 2021, 5:02am

The indices live in a cluster, and those indices are made of shards. Those shards are distributed across nodes. You don't split indices over clusters, you split them over nodes in a cluster.

So if one node in the cluster goes down, as long as you have copies of the data on other nodes in the cluster - which is the default behaviour of Elasticsearch - then you don't lose data.
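As a rough illustration of that point, the sketch below checks which shards still have a copy after one node is lost. The two layouts and the node names are hypothetical, and `surviving_shards` is a made-up helper for this example, not an Elasticsearch API.

```python
def surviving_shards(layout: dict, failed_node: str) -> set:
    """Return the shard numbers that still have at least one copy left."""
    alive = set()
    for node, copies in layout.items():
        if node == failed_node:
            continue  # everything stored on the failed node is gone
        for copy in copies:
            alive.add(int(copy[1:]))  # "P0" / "R0" -> shard 0
    return alive


# Hypothetical layout: 3 primaries (P) with 1 replica (R) each, on 3 nodes.
with_replicas = {
    "node-1": ["P0", "R2"],
    "node-2": ["R0", "P1"],
    "node-3": ["R1", "P2"],
}

# Same index with replicas set to 0: one primary per node, no copies.
without_replicas = {
    "node-1": ["P0"],
    "node-2": ["P1"],
    "node-3": ["P2"],
}

all_shards = {0, 1, 2}
lost_with = all_shards - surviving_shards(with_replicas, "node-1")
lost_without = all_shards - surviving_shards(without_replicas, "node-1")
```

In this model `lost_with` is empty, because shard 0 still has its replica on node-2, while `lost_without` contains shard 0, which is the data-loss scenario described in the question.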

selin · August 20, 2021, 5:40am

What is the difference? I have 1 cluster with 3 containers, so that is 3 nodes, right?

warkolm · August 20, 2021, 6:04am

Correct.

selin · August 20, 2021, 6:39am

So I can say I have 1 cluster with 3 containers (nodes).

Each container has 1 index, and each index in each container will have a few shards and replicas. Is that correct?

So can I give each team 1 index, for a total of 5 indices? Each index would hold logs collected from different files, some with the same pattern and some with different patterns. Is this the correct approach? Earlier I planned to group logs by pattern, but that would create more indices. More indices means more shards, which means more space.

Is my approach correct?

warkolm · August 23, 2021, 1:34am

Each node (container) will hold a shard that is part of an index.

You are better off putting data that is the same, or similar, format into one index.

selin · August 23, 2021, 8:14am

Why do you say that, since an index can take in different patterns?

warkolm · August 23, 2021, 9:43am

Keeping data sources that are dramatically different in the same index is inefficient and becomes hard to manage.
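One way to picture the "same format, same index" advice is the sketch below, which groups log sources by format rather than by team. Everything named here is an assumption for illustration: the source names, the format labels, the `logs-<format>` naming scheme, and the helpers `plan_indices` and `index_body`. The only Elasticsearch-specific parts are the `number_of_shards` and `number_of_replicas` index settings, shown as the kind of body you would send when creating each index.

```python
from collections import defaultdict


def index_body(primaries: int = 1, replicas: int = 1) -> dict:
    """Build a create-index settings body (shard counts are example values)."""
    return {
        "settings": {
            "index": {
                "number_of_shards": primaries,
                "number_of_replicas": replicas,
            }
        }
    }


def plan_indices(sources: list) -> dict:
    """Group (source_name, format) pairs into one index per format."""
    plan = defaultdict(list)
    for name, fmt in sources:
        plan[f"logs-{fmt}"].append(name)  # hypothetical naming scheme
    return dict(plan)


# Hypothetical log sources from different teams, tagged with their format.
sources = [
    ("team-a-app", "json"),
    ("team-b-app", "json"),
    ("team-a-access", "apache"),
    ("team-c-access", "apache"),
]

plan = plan_indices(sources)
bodies = {index: index_body(primaries=1, replicas=1) for index in plan}
```

Here four sources from three teams end up in two indices, `logs-json` and `logs-apache`, so the index count follows the number of formats rather than the number of teams or files.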

system · September 20, 2021, 9:44am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
