Latency issue

Hi everyone,

I'm an IT student working as an intern at a company. My internship project is to develop a solution to collect, analyze, and visualize event logs.

I do not have much budget for this project, so I set up a POC of the ELK stack on 3 servers.

The hardware configurations of my servers are as follows:

  • 2 CPUs, Intel(R) Xeon 2.27 GHz (4 cores)
  • 24 GB RAM
  • 1.5 TB SAS disk

Each server has this configuration.

My ELK stack is composed of Redis, Logstash, Elasticsearch, and Kibana.

Here is my architecture:

Logstash heap size is 1G
Elasticsearch Client heap size is 6G
Elasticsearch Master heap size is 2G
Elasticsearch Data heap size is 4G

I have 5 shards and 1 replica.
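To get a feel for how many shards this setting produces over time, here is a rough sketch. It assumes one index per day (the usual Logstash naming pattern), about 120 days of retention, and one data node per server; none of these three values is confirmed above, so adjust them to the real setup.

```python
# Rough shard-count estimate. Only the shard and replica settings come
# from the post; everything marked "assumption" is a guess.
PRIMARY_SHARDS = 5    # from the post
REPLICAS = 1          # from the post
INDICES = 120         # assumption: one index per day for ~4 months
DATA_NODES = 3        # assumption: one data node per server


def total_shards(primaries: int, replicas: int, indices: int) -> int:
    """Each index holds its primaries plus one full copy per replica."""
    return primaries * (1 + replicas) * indices


shards = total_shards(PRIMARY_SHARDS, REPLICAS, INDICES)
shards_per_node = shards / DATA_NODES

print(f"total shards: {shards}")
print(f"shards per data node: {shards_per_node:.0f}")
```

Under these assumptions the cluster would hold 1200 shards, or 400 per data node, each of which costs some heap on a 4G data node.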

Today I collect logs from a dozen servers. In 4 months I have accumulated 3 billion documents (2.05 TB) in my cluster.

I plan to collect logs from a hundred servers.
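As a back-of-envelope check on these numbers, the sketch below projects the growth at a hundred servers. It assumes that volume scales linearly with the number of servers, that "a dozen" means 12, that the 2.05 TB figure is the total on-disk size including replicas, and that all three 1.5 TB disks are fully available to Elasticsearch. These are assumptions, not measured facts.

```python
# Linear growth projection from the figures in the post.
SIZE_TB = 2.05          # from the post: data stored after 4 months
MONTHS = 4              # from the post
SERVERS_NOW = 12        # assumption: "a dozen" taken as exactly 12
SERVERS_TARGET = 100    # from the post
NODES = 3               # from the post: 3 servers
DISK_TB_PER_NODE = 1.5  # from the post; assumed fully usable by Elasticsearch


def monthly_growth_tb(size_tb, months, servers_now, servers_target):
    """Scale the observed per-server monthly volume to the target fleet."""
    per_server_per_month = size_tb / months / servers_now
    return per_server_per_month * servers_target


growth = monthly_growth_tb(SIZE_TB, MONTHS, SERVERS_NOW, SERVERS_TARGET)
raw_disk = NODES * DISK_TB_PER_NODE
free_disk = raw_disk - SIZE_TB
months_left = free_disk / growth

print(f"projected growth: {growth:.2f} TB/month")
print(f"raw disk: {raw_disk:.1f} TB, free: {free_disk:.2f} TB")
print(f"months until disks are full: {months_left:.2f}")
```

With these inputs the projected growth is roughly 4.3 TB per month against 4.5 TB of raw disk, so disk space alone would run out in well under a month unless old indices are deleted or storage is added.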

In recent days I have observed a performance slowdown in my cluster: Kibana crashes often, information takes a long time to appear, etc.

I would like to know how I can tell whether the data load is too large for my cluster.

What tests should I perform?
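One possible starting point is to look at what the cluster reports about itself. The sketch below reads the responses of the `_cluster/health` and `_nodes/stats/jvm` REST endpoints and flags nodes with high heap usage. The URL, the 75 % threshold, and the node names in the sample data are assumptions for illustration only, and the sample values are made up to show the response shape, not measurements from my cluster.

```python
import json
from urllib.request import urlopen

ES_URL = "http://localhost:9200"  # assumption: address of a client node


def fetch(path):
    """GET a JSON document from Elasticsearch (not called in this sketch)."""
    with urlopen(ES_URL + path, timeout=10) as resp:
        return json.load(resp)


def heap_pressure(node_stats, threshold=75):
    """Return (name, heap_used_percent) for nodes at or above the threshold.

    The 75 % threshold is a rule-of-thumb assumption, not an official limit.
    """
    hot = []
    for node in node_stats["nodes"].values():
        pct = node["jvm"]["mem"]["heap_used_percent"]
        if pct >= threshold:
            hot.append((node["name"], pct))
    return sorted(hot, key=lambda item: -item[1])


def summarize_health(health):
    """Keep the cluster health fields most relevant to an overload check."""
    keys = ("status", "number_of_data_nodes", "active_shards", "unassigned_shards")
    return {key: health[key] for key in keys}


# Hypothetical sample responses, trimmed to the fields used above.
SAMPLE_HEALTH = {"status": "yellow", "number_of_data_nodes": 3,
                 "active_shards": 1190, "unassigned_shards": 10}
SAMPLE_NODE_STATS = {"nodes": {
    "id1": {"name": "data-1", "jvm": {"mem": {"heap_used_percent": 91}}},
    "id2": {"name": "client-1", "jvm": {"mem": {"heap_used_percent": 40}}},
}}

print(summarize_health(SAMPLE_HEALTH))
print(heap_pressure(SAMPLE_NODE_STATS))
```

Against a live cluster, the sample dictionaries would be replaced with `fetch("/_cluster/health")` and `fetch("/_nodes/stats/jvm")`. A data node whose heap stays near the top of its range for long periods would be one sign that the load is too large for it.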

Thanks for the help :grinning: