Index shard allocation for single node Elasticsearch environment


#1

I'm currently working with a single-node Elastic Stack deployment. System specifications are 2 CPU / 8 GB RAM / 500 GB SSD. I've implemented most, if not all, optimization best practices (minus index / shard sizing).

I'm starting to notice that queries are taking longer and longer as the number of indices and shards grows. We've got 59 indices / 455 total shards / 220 unassigned shards / 28 million documents / 21 GB of data.
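For reference, the shard counts above can be confirmed with the cluster health API (the filter_path parameter is optional and just trims the response):

GET /_cluster/health?filter_path=status,unassigned_shards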

I've done some reading in the docs and came to the following conclusions:

  1. Since I'm using a single Elasticsearch node, an index config with 5 shards / 1 replica per shard (shown below) is consuming unnecessary resources. A replica is never allocated on the same node as its primary, so on a single node every replica shard stays unassigned.

GET /winlogbeat-6.4.0-redacted/_settings

{
  "winlogbeat-6.4.0-redacted": {
    "settings": {
      "index": {
        "mapping": {
          "total_fields": {
            "limit": "10000"
          }
        },
        "refresh_interval": "5s",
        "number_of_shards": "5",
        "provided_name": "winlogbeat-6.4.0-redacted",
        "creation_date": "redacted",
        "number_of_replicas": "1",
        "uuid": "redacted",
        "version": {
          "created": "redacted"
        }
      }
    }
  }
}

Question: Should I consider shrinking my indices to 3 shards per index with 0 replicas? Or would it be even better to shrink my indices to 1 shard / 0 replicas? I don't anticipate any index growing above 10 GB in production.
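For anyone following along: number_of_replicas is a dynamic setting that can be dropped to 0 in place, while reducing the primary shard count requires the shrink API. Note that the shrink target's shard count must be a factor of the source's, so a 5-shard index can shrink to 1 shard but not to 3. A sketch, assuming the existing index name from above (the "-shrunk" target name is just an example):

PUT /winlogbeat-*/_settings
{
  "index": {
    "number_of_replicas": 0
  }
}

PUT /winlogbeat-6.4.0-redacted/_settings
{
  "index.blocks.write": true
}

POST /winlogbeat-6.4.0-redacted/_shrink/winlogbeat-6.4.0-redacted-shrunk
{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 0
  }
}

The write block is required before shrinking; on a single node the usual requirement that all source shards sit on one node is already satisfied.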

This would only be useful for a single-node setup, I think. When we scale to 3 Elasticsearch nodes, I'll switch to 3 or 5 shards with 1 replica per shard.

Thanks for reading!


(Christian Dahlqvist) #2

Please read this blog post for guidance on shard sizes and sharding. I would recommend going down to a single primary shard per index, and perhaps also using the rollover API so each index can cover a time period longer than 1 day, especially if you have a longer retention period.
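As a sketch of what that could look like: an index template forcing 1 primary / 0 replicas on new indices, combined with a size/age-based rollover against a write alias. The template name, order value (chosen to override the default Beats template), alias name, and rollover conditions below are all examples, not prescriptions:

PUT /_template/winlogbeat-single-node
{
  "index_patterns": ["winlogbeat-*"],
  "order": 1,
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 0
  }
}

POST /winlogbeat-write/_rollover
{
  "conditions": {
    "max_age": "7d",
    "max_size": "10gb"
  }
}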


#3

Thanks for the link and recommendations. The link you provided was one of the resources I used to help guide my understanding of indices and shards.

Assuming I understand correctly, the max size for a single shard should not exceed 20-40 GB when dealing with time-series data. In my case I'm dealing with about 400 unique fields (Windows Event Logs / Winlogbeat).

I'll be a bit more conservative since my data isn't time series and aim for 10-20 GB per shard. If my indices exceed that, I'll likely want to expand to a multi-node cluster and/or use multiple shards per index.
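One way to keep an eye on this is the cat shards API, sorted by on-disk size (the h and s parameters just select and sort the columns):

GET /_cat/shards?v&h=index,shard,prirep,state,store&s=store:desc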

Does this sound correct?


(Christian Dahlqvist) #4

That sounds like a reasonable starting point.


(system) #5

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.