ELK cluster is not stable due to several issues related to shards and JVM configuration

Dawood · August 11, 2021, 12:30pm

Hi,
This is not the first time I am sharing the issues I am encountering, and hopefully this time somebody will be able to help.
I have an ELK cluster running on-premise on a Rancher k8s cluster based on 3 VMs with the following resources: 32 GB RAM and 8 CPUs, and unlimited storage (volume mount).
The cluster stores logs from Python automation scripts, which are sent to Logstash already structured via the logstash_async module. Every Python script creates a new, unique index, and all of its logs, which could number in the hundreds, are written to that same index.
I deployed Logstash (3 replicas), Elasticsearch (3 replicas) and Kibana (1 replica) using Helm charts (I can share them if needed).
I did not configure anything related to shards in any of the charts.
The application works fine for a period of time, say one month, and then everything starts to break, mainly in the Elasticsearch pods:

> 7/28/2021 6:09:42 PM [2021-07-28T15:09:42,731][WARN ][logstash.outputs.elasticsearch][main] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"qascreencapture-1627484927937893928", :routing=>nil, :_type=>"_doc"}, #<LogStash::Event:0x34f50dc7>], :response=>{"index"=>{"_index"=>"qascreencapture-1627484927937893928", "_type"=>"_doc", "_id"=>nil, "status"=>400, "error"=>{"type"=>"validation_exception", "reason"=>"Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [3000]/[3000] maximum shards open;"}}}}

I fixed that with the following:

curl -XPUT -H 'Content-Type: application/json' 'IP-OF-ELASTIC-SERVER:9200/_cluster/settings' -d '{ "persistent" : {"cluster.max_shards_per_node" : 5000}}'

I am pretty sure this is not the correct way to do it and that there are better solutions, so I would appreciate your suggestions. After that change everything worked well until I got new errors related to JVM and heap, which I believe are mainly caused by the change I made to the shard limit.
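
For reference, the numbers in the error can be worked through in a few lines of Python. This is only a rough sketch: it assumes all 3 Elasticsearch pods count as data nodes and that the limit in effect before the change was the default cluster.max_shards_per_node of 1000, which would match the [3000]/[3000] in the log. The function names are made up for illustration.

```python
# Assumption: 1000 is the default cluster.max_shards_per_node, which
# matches the [3000]/[3000] limit reported on a 3-node cluster.
DEFAULT_MAX_SHARDS_PER_NODE = 1000


def cluster_shard_limit(data_nodes, max_shards_per_node=DEFAULT_MAX_SHARDS_PER_NODE):
    """Total number of open shards the cluster accepts."""
    return data_nodes * max_shards_per_node


def max_indices(data_nodes, shards_per_index,
                max_shards_per_node=DEFAULT_MAX_SHARDS_PER_NODE):
    """How many indices fit under the limit if each one costs shards_per_index shards."""
    return cluster_shard_limit(data_nodes, max_shards_per_node) // shards_per_index


if __name__ == "__main__":
    # The error says each new index would add [2] shards.
    print(cluster_shard_limit(3))    # limit before the change
    print(max_indices(3, 2))         # indices that fit before the change
    print(max_indices(3, 2, 5000))   # indices that fit after raising the limit to 5000
```

Under those assumptions, raising the setting only moves the ceiling from about 1500 indices to about 7500, while every extra shard still adds overhead on the nodes.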

Please help me.

10x,
Dawoods

Christian_Dahlqvist · August 11, 2021, 12:41pm

This seems to be a very inefficient and wasteful way to index data into Elasticsearch and will not scale well, as each shard has overhead and increases the size of the cluster state. Please read this blog post on sharding and find a way to not create a new index per script for low volumes of data. If you want to store data for a long time or expect large volumes, you need to make sure that your shard sizes are at least a few tens of GB.
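
One possible way to avoid an index per script, sketched in Python since the logs come from Python scripts: write everything to a shared, time-based index and keep the script name as a field on each event. The "automation-logs" prefix, the monthly granularity, and the field names below are assumptions for illustration, not something taken from the setup described above.

```python
from datetime import datetime, timezone

# Hypothetical prefix for the shared index; pick whatever fits your naming.
INDEX_PREFIX = "automation-logs"


def shared_index_name(ts):
    """Return one index name per month instead of one per script."""
    return f"{INDEX_PREFIX}-{ts:%Y.%m}"


def build_event(script_name, message, ts):
    """Build a log event that carries the script name as a field."""
    return {
        "_index": shared_index_name(ts),
        "script": script_name,          # filter on this in Kibana instead of by index
        "message": message,
        "@timestamp": ts.isoformat(),
    }


if __name__ == "__main__":
    ts = datetime(2021, 7, 28, 15, 9, 42, tzinfo=timezone.utc)
    events = [build_event(f"script-{i}", "started", ts) for i in range(1000)]
    # All 1000 scripts land in the same index for that month.
    print({e["_index"] for e in events})
```

With a layout like this, the number of indices grows with time rather than with the number of scripts, and individual scripts can still be separated by querying on the script field.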

Dawood · August 12, 2021, 6:25am

Thanks,
I will change the way I am indexing my logs.

system · September 9, 2021, 6:26am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
