Kibana shows indices in gb size!

Akanksha_Pandey · January 21, 2022, 9:38am

Hi,

I'm running Elasticsearch 7.9.2 and using Filebeat to ship logs directly to Elasticsearch. All the services run as Docker containers: Elasticsearch, Filebeat, and Kibana.

The issue is that some of the Kibana indices are up to 25 GB in size. Also, the Elasticsearch container is constantly under high CPU load.

Can anyone help me reduce the index size? Is this the reason behind the high CPU consumption by Elasticsearch?

Thanks,
Akanksha

Tomo_M · January 21, 2022, 11:50am

25 GB is not too big, even for a single-shard index. It depends on your CPU and the total number of indices and shards, but I don't think it will be a big problem with a typical multi-core CPU.

Are you using Stack Monitoring?

Akanksha_Pandey · January 21, 2022, 1:33pm

I did not enable Stack Monitoring, but it is present and shows all the stats for indices and shards.

Does Stack Monitoring impact Elasticsearch performance?

leandrojmp · January 21, 2022, 1:45pm

What are the specs of the Elasticsearch container? How much memory and CPU does it have?

Also, the size of the indices reflects the number of documents. To reduce it, you would need to see whether you can drop some kinds of messages and store fewer documents.

Are you using dynamic mapping, or did you create a mapping for your index? Dynamic mapping can use a lot of storage, so if you can create a mapping for your index you could save some space.

Tomo_M · January 21, 2022, 1:48pm

No, but it may give a hint about the reason behind the high CPU consumption by Elasticsearch.

Akanksha_Pandey · January 22, 2022, 5:15am

Thanks for clearing up my doubt. Until now I was thinking that the 25 GB size per index was the culprit behind the high CPU consumption.

Yes, I'm using dynamic mapping because I'm monitoring and alerting based on the Elasticsearch logs. If an incident occurs in the future, we don't know in advance which fields will be important for diagnosing it. That's why I'm skeptical about manual mapping. Please let me know if I'm missing something.

We have a one-node cluster setup. The Elasticsearch heap size is 5 GB. Maximum shards per node is 4000 and replica shards is set to 0. Regarding the number of CPUs, can you explain what you mean?

Also, can you please suggest how to reduce the CPU consumption? It would be really helpful, because it is impacting our production servers.

Thanks,
Akanksha

Tomo_M · January 22, 2022, 5:55am

Hi,

This document will help you. 4000 shards for a 5 GB heap is 40 times more than the recommendation. That could be the reason for the high CPU consumption.

First, with the default dynamic mapping, string fields are mapped as text with a keyword subfield. By a simple calculation, each string field is then indexed roughly twice.
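As a rough sketch of one way to avoid that doubling while keeping dynamic mapping: a dynamic template can map new string fields to `keyword` only. The template name `strings_as_keyword`, the index name `my-logs`, and the `ignore_above` value are made up for illustration, and whether losing full-text search on those fields is acceptable depends on your queries.

```python
import json

# Hypothetical mapping body: new string fields become keyword only,
# instead of the default text + keyword pair.
mapping_body = {
    "mappings": {
        "dynamic_templates": [
            {
                "strings_as_keyword": {  # illustrative template name
                    "match_mapping_type": "string",
                    "mapping": {"type": "keyword", "ignore_above": 1024},
                }
            }
        ]
    }
}

# JSON body that could be sent with: PUT /my-logs  (index name is an assumption)
request_json = json.dumps(mapping_body, indent=2)
print(request_json)
```

Fields that already exist in an index keep their current mapping, so this would only take effect for new indices.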

In addition, you can store data without indexing it ("index" in the general sense) to reduce index size, and reindex it from the _source field when you realize you need it. See the doc_values, enabled, and index mapping parameters.
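To make those three parameters concrete, here is a small sketch of a mapping that uses each of them. The field names (`debug_payload`, `request_id`, `user_agent`) are hypothetical; only the parameter names come from the Elasticsearch mapping options mentioned above.

```python
import json

# Hypothetical field names; the mapping parameters are the ones named above.
mapping_body = {
    "mappings": {
        "properties": {
            # enabled: false -> kept in _source only, not parsed or searchable.
            "debug_payload": {"type": "object", "enabled": False},
            # index: false -> not searchable in 7.x, still usable for
            # sorting and aggregations through doc_values.
            "request_id": {"type": "keyword", "index": False},
            # doc_values: false -> searchable, but no sorting or aggregations.
            "user_agent": {"type": "keyword", "doc_values": False},
        }
    }
}

request_json = json.dumps(mapping_body, indent=2)
print(request_json)
```

In every case the original value stays in `_source`, which is what makes a later reindex with a fuller mapping possible.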

Those two might not be enough when you have 40 times the recommended number of shards. IMHO, you may need to reconsider whether you can consolidate some small daily indices into weekly indices, and how long you have to keep the indices active. Maybe some older indices could be snapshotted and deleted. A snapshot consumes no CPU. It cannot be searched, but it can be restored when needed. ILM will help you automate this processing over time.
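For illustration, an ILM policy along these lines would roll over to a new index weekly (or earlier if it grows large) and delete indices after a retention period. The policy name `logs-weekly` and all thresholds are assumptions, not values from this thread, and taking a snapshot before deletion would have to be set up separately.

```python
import json

# All names and thresholds below are assumptions for illustration.
ilm_policy = {
    "policy": {
        "phases": {
            "hot": {
                "actions": {
                    # Start a new index after 7 days or 50 GB, whichever comes first.
                    "rollover": {"max_size": "50gb", "max_age": "7d"}
                }
            },
            "delete": {
                # Remove the index 30 days after it was rolled over.
                "min_age": "30d",
                "actions": {"delete": {}},
            },
        }
    }
}

# JSON body that could be sent with: PUT _ilm/policy/logs-weekly
request_json = json.dumps(ilm_policy, indent=2)
print(request_json)
```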

Akanksha_Pandey · January 22, 2022, 7:46am

@Tomo_M - sure, thanks!

leandrojmp · January 22, 2022, 2:10pm

If you have a one-node cluster with 5 GB of Heap you should try to keep the number of shards below 100.

There is a recommendation to have a maximum of 20 shards per GB of heap. With 5 GB of heap, this gives you a maximum of 100 shards. With 4000 shards you are way above this recommendation: your cluster is oversharded, and this can impact the performance of the cluster and the node. I cannot be sure, but this could be the cause of the high CPU.

You should find a way to drastically reduce the number of shards, or add more nodes to your cluster.
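The arithmetic behind that can be written out as a small sketch. The function names are made up, and the 4000 figure is taken from earlier in the thread as the shard count.

```python
def max_recommended_shards(heap_gb: float, shards_per_gb: int = 20) -> int:
    """Shard budget from the rule of thumb of 20 shards per GB of heap."""
    return int(heap_gb * shards_per_gb)


def oversharding_factor(current_shards: int, heap_gb: float) -> float:
    """How many times the current shard count exceeds the budget."""
    return current_shards / max_recommended_shards(heap_gb)


budget = max_recommended_shards(5)       # 5 GB heap
factor = oversharding_factor(4000, 5)    # 4000 shards on that heap
print(budget, factor)  # 100 40.0
```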

Another recommendation is to keep the size of shards around 50 GB.

system · February 19, 2022, 2:11pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
