ELK elasticsearch explosions, shard allocations

I've essentially inherited this ELK setup and there appear to be some serious problems with the current config. As far as I can tell, a new index is created daily. Each index gets 5 primary shards with one replica (though there is only one node, so the 5 replica shards sit in 'making my health forever yellow' limbo). With only a week or two's worth of indices everything seems to work great. Once they start to pile up, queries over only the last 24 hours that ran fine before start to cause shard failures. Eventually the whole thing explodes, dumps all the indices (data included), and I'm back to having decent functionality (for newly gathered data) until they stack up high enough again. I'm really not concerned about keeping data in Elasticsearch longer than a week, as all the logs are backed up elsewhere. Is it normal practice to add an index every day and (maybe this is what's missing) delete the oldest log index? And should the additional indices be affecting queries that only look at two indices (I'm talking about my 'past 24h' query performance deteriorating as more indices are added)?

Hi,

By default Elasticsearch creates each index with 5 primary shards and 1 replica. For a single-node cluster you should change this to 1 primary shard and 0 replicas. What's making your cluster state "yellow forever" is the 1 replica: with only one node there is nowhere to put the replica shards, so they can never be assigned. You can reduce the replicas to 0 on each index (there's a curl sketch below) and the cluster will go green, but that's not really what's causing the problem.

What's causing the problem is having 5 primary shards per daily index. Let's say you're using Kibana to search logstash-* and you have 30 days of indexes. That's 30 * 5 = 150 shards, so 150 shard-level queries get executed at once, because Elasticsearch has to check every shard behind the pattern even though you may have only searched for today's data.

In Kibana, under the index pattern settings, there is a checkbox called "Use event times to create index names". Make sure this box is checked and create a new index pattern; Kibana will then only hit the daily indexes that fall inside the selected time range. Then recreate any visualizations & dashboards with this new index pattern.

https://www.elastic.co/guide/en/kibana/4.1/settings.html#settings-create-pattern
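To drop the replica count to 0 on the indexes you already have, here's a minimal curl sketch (assuming Elasticsearch is listening on http://localhost:9200; adjust the host and port to your setup):

# set number_of_replicas to 0 on every existing index
curl -XPUT 'http://localhost:9200/_settings' -d '{ "index" : { "number_of_replicas" : 0 } }'

You can target a single index instead by replacing _settings with indexname/_settings.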

Lastly, you should use Curator to delete any old indexes you don't want:

https://www.elastic.co/guide/en/elasticsearch/client/curator/current/delete.html

You can set up a cron job on the Elasticsearch server to delete any index older than 7 days. With the --dry-run flag Curator only shows what it would delete; drop the flag when you're ready to delete for real:

curator --dry-run delete indices --older-than 7 --time-unit days --timestring %Y.%m.%d
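Once the dry run looks right, a crontab entry along these lines would run it daily (a sketch only: the install path and the 1 a.m. schedule are assumptions, and the % signs must be escaped because cron treats a bare % specially):

# run Curator daily at 1 a.m. to delete indexes older than 7 days
0 1 * * * /usr/local/bin/curator delete indices --older-than 7 --time-unit days --timestring \%Y.\%m.\%d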

Set your primary shards to 1 in your elasticsearch.yml:

https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules.html#_static_index_settings
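On the Elasticsearch releases this era of Kibana runs against (1.x/2.x), a minimal sketch of those lines in elasticsearch.yml would look something like the following; note they only apply to indexes created after the node restarts, and 5.x and later dropped node-level index settings in favour of index templates:

# elasticsearch.yml -- defaults for newly created indexes (pre-5.0 only)
index.number_of_shards: 1
index.number_of_replicas: 0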

Awesome info! Thank you so much!

Side note: Any one day might contain somewhere in the ballpark of 80,000,000 entries. Do you still recommend 1 shard or is there a rule of thumb for this sort of thing?

It depends on how big the shard ends up being in terms of GB (there's a quick way to check below). But you can also split things out by index name and even run queries that search multiple indexes, like:

GET /myindex,yourindex/_search

If you're capturing different log data like syslog, Tomcat, Apache, etc., you can create a separate index for each application; you don't need to store all your logs in one index. But if you only have one node, I don't think there are many significant advantages to using multiple shards on one index. Having multiple shards lets you spread your data across multiple machines so you can scale horizontally.

https://www.elastic.co/guide/en/elasticsearch/reference/current/_basic_concepts.html#_shards_amp_replicas
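For the quick check mentioned above: look at how big your daily indexes and their shards actually are on disk; a commonly cited rule of thumb is to keep an individual shard in the tens of GB at most. Assuming again that Elasticsearch is on localhost:9200:

# per-index doc counts and store size
curl 'http://localhost:9200/_cat/indices?v'
# per-shard sizes
curl 'http://localhost:9200/_cat/shards?v'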

Again, great info. Thank you.