Elastic Stack Architecture

Hello everyone,

/!\ NB: First, I am really sorry for the ASCII art, but I have not been able to upload images since this morning and I don't know why. :sweat_smile:

I have read a lot about how an Elastic cluster should / could be set up, and I have watched a lot of videos too, but I fear that I missed something before starting an installation. This is why I want to share what I plan to do, in the hope of getting some advice about it.

Over a test period, I estimated the quantity of daily events at about 13 GB. I plan to add some log sources, so let's say that I need 15 GB daily. I need to keep these logs for a period of 6 months, so:

Total capacity needed = daily log quantity x retention period
Total capacity needed = 15 GB x (30 x 6) = 15 GB x 180
Total capacity needed = 2,700 GB = 2.7 TB

Let's round up to 3 TB to be sure I will never have storage issues.
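
The sizing above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in this post: the function name `required_capacity_gb` is made up for illustration, months are assumed to be 30 days long, and the on-disk index size is assumed to equal the raw daily volume (no replicas, no indexing overhead, no disk watermarks).

```python
def required_capacity_gb(daily_gb: float, retention_days: int) -> float:
    """Raw storage needed: daily volume multiplied by the retention period."""
    return daily_gb * retention_days


retention_days = 30 * 6  # 6 months, assuming 30-day months as in the post
total_gb = required_capacity_gb(15, retention_days)

print(f"{total_gb:.0f} GB = {total_gb / 1000:.1f} TB")
```

Changing `daily_gb` to the measured 13 GB gives the lower bound, so the 15 GB figure already includes some headroom for new log sources.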

NB: Currently I have to delete my indices manually because I am testing on a single node with only 300 GB of storage, so I keep less than 1 month of logs.

Now I would like to build a real architecture with the Elastic Stack. Here is how I imagine it:

Servers                  CPU            Disk    RAM
Elasticsearch - Node 1   4 or 8 cores   1 TB    16 GB
Elasticsearch - Node 2   4 or 8 cores   1 TB    16 GB
Elasticsearch - Node 3   4 or 8 cores   1 TB    16 GB
Logstash                 8 cores        20 GB   4 - 8 GB
Kibana                   2 cores        20 GB   4 GB

                 ________________________________________________________
Logstash        |CLUSTER                                                 |
  .----.        |                                                        |               
| == | |        |                   Elasticsearch - Node 1               |                                                             
|    | |        |                         /         \                    |
| == | | =====> |                        /           \                   | <=== https://kibana:5601
|    | |        |                       /             \                  |          
|::::| |        |                      /               \                 |                   
|___.| |        |  Elasticsearch - Node 2  <--->  Elasticsearch - Node 3 |                                                
                |                                                        |
                ----------------------------------------------------------

I started with a homogeneous architecture because I think it meets my needs (I have some questions to be sure of that). I don't plan to set replicas because it would cost too much storage (3 TB x 2 = 6 TB) and I can't afford that. I have some doubts about the architecture, so here are some questions:

1 - If I omit node roles, each node will take on every role, but there will be only one master at a time. Let's assume node 1 is the master: to which data node (2 or 3) will Logstash send data?

Is there any rule like "Node 2 currently has more free storage, so it will receive the data", or, as the cluster is synchronized, will shards be shared between the 2 data nodes?

2 - When do the roles change, knowing that each node can be master (and assuming none of them fails)? Is it possible that node 1 stays "master node" for 2 months, so that all the storage on it goes unused?

3 - Given that it is not a hot-warm architecture, but each node can take the hot or warm role, if I create a lifecycle policy for my indices with hot and warm phases before deletion, is it going to work correctly and meet my need for 6 months of retention?

Sorry, that's a lot of questions, but despite all the topics I have read and the videos I have watched, there are some features that I don't understand properly.

Thanks in advance !

With the Logstash elasticsearch output plugin, you set the hosts, so you can choose which nodes receive the output.

The second question is about shard allocation. Each shard is allocated to a single node and is not shared among nodes. A primary shard is allocated to one node, and its replicas are allocated to other nodes if you set at least 1 replica.

The master is elected by voting, and elections are scheduled by the master-eligible nodes themselves. I'm not sure you can turn off voting or manipulate the voting result. You can configure only one master-eligible node to pin the master, at the expense of fault tolerance.
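
To make the fault-tolerance trade-off concrete: an election needs a majority of the master-eligible nodes. The sketch below only illustrates that arithmetic; the helper names `quorum` and `tolerated_failures` are made up, and a real cluster manages its voting configuration by itself.

```python
def quorum(master_eligible: int) -> int:
    """Smallest strict majority of the master-eligible nodes."""
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible: int) -> int:
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible - quorum(master_eligible)


for n in (1, 3):
    print(f"{n} master-eligible node(s): quorum={quorum(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

With a single master-eligible node, losing that node leaves no master at all, while three master-eligible nodes can still elect a master after losing one of them.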


OK, according to the documentation, if I set multiple Elasticsearch nodes in the Logstash output, it will automatically load balance across the specified hosts.

But if I have 3 nodes and all of them are master-eligible and can also store data, Logstash may send bulk requests to the master without knowing which node is the master, no?

I read in the documentation that it is not recommended for a master node to store data too, but I saw an architecture like this here (first illustration): Homogeneous Architecture

OK, I think I understand this notion of shards better, thanks!

I think I need more precision on this:

But if I have 3 nodes and all of them are master-eligible and can also store data, Logstash may send bulk requests to the master without knowing which node is the master, no?

The answer will help me decide whether to set up one master and two data nodes, as you said, at the expense of fault tolerance, or to use an architecture like the one in the link above, with 3 nodes that take on every role.

Sorry if I'm repeating myself a bit, but I'd really like to get to the bottom of this.

Thank you very much for your answers !

If I omit node roles, each node will take on every role, but there will be only one master at a time. Let's assume node 1 is the master: to which data node (2 or 3) will Logstash send data?

As already said, Logstash will send data to the hosts configured in the output. You could check which of your nodes is the master and leave it out of the configuration, but if the master changes for some reason, you would need to change your Logstash configuration.

Is there any rule like "Node 2 currently has more free storage, so it will receive the data", or, as the cluster is synchronized, will shards be shared between the 2 data nodes?

Balancing is done at the shard level: the cluster will try to keep the same number of shards on all the data nodes.
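
As a rough illustration of balancing by shard count, here is a toy greedy allocator. It is not Elasticsearch's actual algorithm (the real allocator also takes disk usage and other settings into account); the function `allocate` and the index names are hypothetical.

```python
def allocate(shards, nodes):
    """Greedy sketch: place each new shard on the node holding the fewest shards."""
    counts = {node: 0 for node in nodes}
    placement = {}
    for shard in shards:
        # Ties are broken by the order of the nodes list.
        target = min(nodes, key=lambda n: counts[n])
        placement[shard] = target
        counts[target] += 1
    return placement, counts


nodes = ["node-1", "node-2", "node-3"]
shards = [f"logs-{i:03d}" for i in range(1, 10)]  # hypothetical index names
placement, counts = allocate(shards, nodes)

print(counts)
```

In this sketch all three nodes end up with the same number of shards, including the node that happens to be the elected master when it also holds the data role.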

When do the roles change, knowing that each node can be master (and assuming none of them fails)? Is it possible that node 1 stays "master node" for 2 months, so that all the storage on it goes unused?

A node that was elected as master remains the master until its service fails or is restarted/stopped. If you want to change the master node, you will need to stop the Elasticsearch service on the current master node.

Given that it is not a hot-warm architecture, but each node can take the hot or warm role, if I create a lifecycle policy for my indices with hot and warm phases before deletion, is it going to work correctly and meet my need for 6 months of retention?

With only 3 nodes you will not have a hot-warm architecture, so you do not need a warm phase in your lifecycle policy. Use just the hot phase and configure the policy to delete data after some time.
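
A hot-plus-delete policy could look roughly like the body built below, which would be sent with `PUT _ilm/policy/<name>`. Treat it as a sketch: the rollover thresholds are assumptions chosen for this thread, `max_primary_shard_size` requires a reasonably recent Elasticsearch version (older versions use `max_size`), and the Python wrapper only builds and prints the JSON.

```python
import json

RETENTION_DAYS = 180  # 6 months, assuming 30-day months

ilm_policy = {
    "policy": {
        "phases": {
            "hot": {
                "actions": {
                    # Roll over when either threshold is reached (assumed values).
                    "rollover": {
                        "max_primary_shard_size": "50gb",
                        "max_age": "30d",
                    }
                }
            },
            "delete": {
                # With rollover in use, min_age is counted from the rollover time.
                "min_age": f"{RETENTION_DAYS}d",
                "actions": {"delete": {}},
            },
        }
    }
}

print(json.dumps(ilm_policy, indent=2))
```

Because `min_age` is measured from rollover, the oldest documents in an index are kept somewhat longer than 180 days, so the disk sizing should leave room for that.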

Also, since you said that you don't plan to set replicas, your cluster will not have any kind of fault tolerance. If you lose one of your data nodes, your cluster will be in the Red state and some indices will be unavailable until the lost node comes back to the cluster. With that in mind, it is sometimes better to have one big single-node cluster than a cluster of three small or medium nodes without replicas.

With a small cluster like this, it would be easier to let all nodes have all the roles. The master node can receive bulk requests without any problem, and in a lot of use cases this has no impact on performance. The recommendation to have master-only nodes applies better to bigger clusters; it would make no sense to have 3 master-only nodes for a cluster with 3 data nodes, as it would just make everything more expensive.

Also, 15 GB of logs per day is not that much. One of Elastic's recommendations is to keep the shard size somewhere near 50 GB, so you would need a little over 3 days of data to reach this size.
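
The shard-size arithmetic can be sketched as follows, assuming one primary shard per index, no replicas, and an on-disk size equal to the 15 GB of daily logs (all assumptions for illustration):

```python
import math

daily_gb = 15
target_shard_gb = 50
retention_days = 180  # 6 months, assuming 30-day months

# Days of ingest needed to fill one shard of the target size.
days_per_shard = target_shard_gb / daily_gb
# Number of target-sized shards held at any time over the retention period.
shards_retained = math.ceil(daily_gb * retention_days / target_shard_gb)

print(f"{days_per_shard:.1f} days to fill one shard")
print(f"{shards_retained} shards kept for {retention_days} days of retention")
```

Daily indices would instead produce 180 shards of about 15 GB each, which is why rolling over on size rather than on a fixed daily schedule keeps the shard count lower.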

Another thing to look at is the size of your disks: is the 1 TB disk only for Elasticsearch data, or is it shared with the operating system? If you need to store something near 3 TB, you will need more than 3 TB of disk, as Elasticsearch has watermark thresholds to keep the disks from filling up.

The watermark thresholds are explained here.
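
To see what the watermarks mean for disk sizing, here is a small sketch. It assumes the default low watermark of 85% disk usage, data spread evenly over the nodes, and the 2,700 GB estimate from the first post; the helper `min_disk_gb` and its `os_reserved_gb` parameter are hypothetical.

```python
LOW_WATERMARK = 0.85  # default low disk watermark (85% used)


def min_disk_gb(data_gb, nodes, watermark=LOW_WATERMARK, os_reserved_gb=0.0):
    """Disk size per node so that the data stays below the given watermark."""
    per_node_data = data_gb / nodes
    return per_node_data / watermark + os_reserved_gb


dedicated = min_disk_gb(2700, nodes=3)
shared_with_os = min_disk_gb(2700, nodes=3, os_reserved_gb=20)

print(f"{dedicated:.0f} GB per node on a dedicated data disk")
print(f"{shared_with_os:.0f} GB per node if about 20 GB goes to the OS")
```

Under these assumptions each node needs a bit more than 1 TB, so three 1 TB disks shared with the operating system would cross the low watermark before reaching 6 months of data.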


Thank you very much for these explanations, you have cleared my mind.

The watermark threshold is something I had forgotten to think about, thank you for the reminder. Indeed, on this illustration the 1TB disk is shared with the operating system. I will adjust this to avoid any issues.

I read a lot of examples and got a bit lost about what I actually need, in addition to being confused about certain concepts, but now it is clearer.

Thanks to both of you again, I think I'll be able to set up a test environment soon.
