Hello. I use Elasticsearch with Graylog. This week I noticed that my Elasticsearch server is running out of disk space. It's configured to keep 240 indices, with max_docs_per_index set to 20 million documents.
Having checked the size of the indices on disk, I quickly noticed that one of them is significantly larger than all the others. It contained over 900 million documents by the time it finally became inactive. What do you think could be the cause?
This isn't the first time it has happened. Last November, one of the indices grew to 500 million docs before becoming inactive, and every index after that stayed under the 20 million limit until today.
Graylog 4.2.13 and Elasticsearch 7.10.1 running on Debian.
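In case it's relevant, the per-index document counts and on-disk sizes can be checked straight from the _cat API, for example (assuming a default single-node instance on localhost:9200; adjust host and port for your setup):

```
# List indices sorted by on-disk size (largest first),
# showing document count and store size for each
curl -s "http://localhost:9200/_cat/indices?v&h=index,docs.count,store.size&s=store.size:desc" | head -20
```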
Thank you, Mark. You gave me a direction. Here's what I found in the Graylog log files:

```
Caused by: org.graylog.shaded.elasticsearch7.org.elasticsearch.ElasticsearchStatusException: Elasticsearch exception [type=validation_exception, reason=Validation Failed: 1: this action would add [4] total shards, but this cluster currently has [997]/[1000] maximum shards open;]
```
I've set cluster.max_shards_per_node to 3000. Is it a valid solution? Is there anything else I should do? It's a single-node ES instance.
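In case it's useful to anyone else hitting the same error, the change can be applied as a persistent cluster setting through the REST API, roughly like this (assuming the default localhost:9200 endpoint; 3000 is simply the value I picked, not a recommendation):

```
# Count how many shards the cluster currently has (the limit in the error
# counts all shards of open indices, replicas included)
curl -s "http://localhost:9200/_cat/shards?h=index" | wc -l

# Raise the per-node shard limit persistently (survives restarts);
# on a single-node cluster this is effectively the total shard budget
curl -s -X PUT "http://localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' \
  -d '{"persistent": {"cluster.max_shards_per_node": 3000}}'
```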
Based on the screenshot it looks like each index has 4 primary shards and is generally around 6GB in size, which works out to roughly 1.5GB per shard. That is quite small and, as you are seeing, results in a lot of shards. If Graylog limits indices by document count, it would probably be reasonable to increase the limit by a factor of 5 to 10 in order to bring the average shard size up.
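If you want to confirm this from the command line rather than the screenshot, something like the following lists the individual shard sizes (again assuming localhost:9200); with 240 indices at 4 primary shards each you land right around the default 1000-shard limit, and ~6GB per index spread over 4 shards is the ~1.5GB per shard mentioned above:

```
# List individual shards with their doc count and on-disk size, largest first
curl -s "http://localhost:9200/_cat/shards?v&h=index,shard,prirep,docs,store&s=store:desc" | head -20
```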