What is the best mix of indices and shards on Elastic Cloud?

As the title says, we are using Elastic Cloud with 3 Elasticsearch instances.

We currently have 338 shards, but just one index.

Is this a proper setup? Or should I maybe create one index for each day? (Btw, before we migrated to Cloud we had one index per month.)

What is the best practice for Cloud?

I don't want to struggle later with a huge index making the system very slow, for instance because freezing it is not possible.

So give me any tips you have on this. Thanks.

This is using ILM, which you can tell by the counter on the end of the date in the index name. This is the default for Beats these days and what you should be using.
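As a hypothetical example of that naming pattern (the version and date here are made up), an ILM-managed Beats index looks something like filebeat-7.12.1-2021.05.20-000001, where -000001 is the rollover counter that ILM increments. You can list them with:

GET _cat/indices/filebeat-*?v&s=index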

What version are you running?

@kwoxer

You might want to take a look at this article about shard sizing.

At first glance that is a lot of shards for a cluster of that size. We typically recommend 10-20 shards per 1 GB of heap. Your total cluster has ~1.5 GB of heap (50% of 3 GB total), so you should only have ~30 shards in total; you have 10x that, which will negatively affect your performance.
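As a rough check (a minimal sketch, assuming you have access to Kibana Dev Tools), you can compare the heap per node against the total shard count:

GET _cat/nodes?v&h=name,node.role,heap.max,heap.percent

GET _cluster/health?filter_path=active_shards,active_primary_shards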

It's possible this is contributing to your issue in your other thread.


I understand. So I will try to reduce the shards by 10x and then have another look at the performance issues I currently have.

In addition to what my awesome colleagues shared, you can also take a look at:

You also have the possibility to scale up your deployment if more resources are needed (cf. Do you know when to scale?).


Currently I am getting this message kind of everywhere

Could you tell me the query to reduce the number of shards?

Your deployment is currently running high on JVM heap usage, hence the parent circuit breaker exception observed.
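If you want to confirm it is the parent breaker that is tripping (a quick check via Dev Tools; the exact output shape varies by version):

GET _nodes/stats/breaker?filter_path=nodes.*.breakers.parent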

A couple of options here:

  1. If you have unwanted / old indices that you are not using anymore, you can use the Delete Index API to delete them. Each index has 1 or many shards (depending on your index settings). Deleting indices can be done for example via the Kibana Console or via the API Console (cf. Access the Elasticsearch API console); see the sketch after this list.

  2. Scale up your deployment to allocate more resources.
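For option 1, a minimal sketch in the Kibana Console (the index name below is hypothetical):

# list indices sorted by size to spot candidates
GET _cat/indices?v&s=store.size:desc

# delete an index you no longer need (this removes its data permanently)
DELETE /filebeat-7.12.1-2021.01.01-000007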

I changed the number of shards in the index management. Is this really so heavy that Kibana is now not available anymore?


How long does this take approx? Is there something I can do?

There are only a couple of very special commands (shrink/split) to change the number of primary shards or combine indices.

Neither of which is available through Kibana

The number of primary shards for an index is set at create time.
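For reference, a minimal sketch of the Shrink API (index and node names are hypothetical; the index must be made read-only and a copy of every shard must sit on one node first):

# relocate a copy of every shard to one node and block writes
PUT /my-source-index/_settings
{
  "settings": {
    "index.routing.allocation.require._name": "instance-0000000001",
    "index.blocks.write": true
  }
}

# shrink down to a single primary shard
POST /my-source-index/_shrink/my-shrunk-index
{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 1
  }
}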

What exactly did you do?

I cannot show you, as Kibana is down. I was in the Index Lifecycle Management of the index and edited the Hot Phase. There I enabled the Shards section, set the number to 30 shards, and saved.

Now getting those errors...

Btw, now I get this message.

But clearing the cache does not help.

ILM is not really the correct way to attempt to fix that. And you don't set the number of shards in ILM; you set the index size before rollover, so I am not sure what you really did.

If I were you I'd open a support ticket.

Once you get the cluster up and running, you really should leave the default settings for ILM and the index templates, which are:

The defaults recommended for time series data are:
One primary shard
One replica shard
And in the hot phase of ILM
50 GB per index for rollover
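As a rough illustration (the policy name is hypothetical; max_primary_shard_size needs a recent 7.x release, older versions use max_size instead), the hot-phase rollover part of such a policy looks like:

PUT _ilm/policy/my-timeseries-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "50gb",
            "max_age": "30d"
          }
        }
      }
    }
  }
}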

However, your cluster is tiny.

Since you only have 30 GB of disk storage on each node, obviously 50 GB shards won't work.

You can make each index 2 GB or so. That would give you 15 or so indices / shards per node.

So depending on what your actual data ingest and retention requirements are, you may need to adjust the capacity / size of your cluster.


And what should the maximum primary shard size be? Are these settings fine?

And where do I see where these shards are located or coming from?


Hi @kwoxer, so you got the cluster green again?

Did you actually look at / read all the great docs that @ropc and I shared? There was a lot of great info in those.

If you left the defaults for the number of primary and replica shards at 1 and 1 respectively, then I would set that ILM setting to 2 GB.

BTW, if you changed the number of primary shards in your mapping or index template, you should set it back to 1 or take it out; if you do not set it, 1 is the default.
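To check that, a sketch (template and pattern names are hypothetical; older versions use the legacy _template API instead of _index_template):

GET _index_template/my-logs-template

PUT _index_template/my-logs-template
{
  "index_patterns": ["my-logs-*"],
  "template": {
    "settings": {
      "index.number_of_shards": 1,
      "index.number_of_replicas": 1
    }
  }
}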

Again, your cluster is on the very small scale, so these numbers are all a bit skewed.

With respect to where the shards are located, they should be pretty much evenly distributed across the 3 nodes. I would not get too concerned about where they are.

You can go to Kibana -> Dev Tools and run

GET _cat/shards/?v

to see where they are.
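If you want a more compact view, the same cat API accepts optional columns and sorting (here by on-disk size):

GET _cat/shards?v&h=index,shard,prirep,state,store,node&s=store:desc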

I highly recommend taking a look at the docs we sent. Also, Elastic provides a lot of free training and webinars; I would take advantage of those.

Good Luck!

Yes, I read them all. But sadly nothing actually dropped the pressure.

So the shards are now good. Just 86 shards now.

But pressure is still at a high level:

So the only way now is upgrading the server or running it on-prem, correct?

It's going to stay a bit high with such small nodes... quite literally there is only 512 MB of JVM heap on each node... that is the smallest you can possibly run.

Remember, the goal is about 10-20 shards per 1 GB of JVM heap: you have 1.5 GB of JVM heap, so the goal would be about 30 shards, and you are still at 2x that. Again, these are not hard rules, and with such a limited amount of JVM RAM it is going to be tight.

And in general your indices are still extremely small; each index / shard takes memory space in the JVM.

I would suggest scaling the cluster up to 2 GB or 4 GB nodes. You can do that from the deployment screen: just hit Edit, change the setting, and apply. It will take a few minutes and will do a rolling change with no downtime.

Me... the smallest I ever run is 4 GB nodes, but that is just me; you can try 2 GB nodes first.

You can scale these clusters up to HUGE sizes, terabytes of RAM and 100s of TBs of storage, which you clearly do not need at this point.

No reason to think about on-prem at this point.


That sounds good. So now I'm trying to reindex, but I'm getting the known error.

I have now enabled autoscaling to get the reindex working properly. Hopefully that fixes it.

Sadly not. So how do I reindex when I always get this circuit breaker error?

So autoscale is not working. Should I manually upgrade from 1 GB RAM to 4 GB RAM for the reindex?

Autoscale is based on disk usage today, so that is not the correct way to fix this issue (in the near future it will also look at memory and CPU pressure).

You need to manually scale your cluster.

Go into the Elastic Cloud Console and click on your deployment.
Click Edit.
Make the changes.
And save at the bottom.
It will take a few minutes.
Then try your reindex again.
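For the reindex itself, a minimal sketch (index names are hypothetical; running it as a background task and lowering the scroll batch size keeps memory pressure down):

POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "my-old-index",
    "size": 500
  },
  "dest": {
    "index": "my-new-index"
  }
}

# check progress with the task id returned by the call above
GET _tasks/<task id from the response>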


Ok, upgrading to 4 GB RAM worked perfectly. I was then able to reindex everything without trouble.

Afterwards I went back to 1 GB RAM. Then Kibana totally crashed.

I already tried to go back to a snapshot. No change. Kibana internal error all the time.