How to achieve good performance (huge daily indices)

Hello,

Our free-text search is unusable. We run 3 dedicated master nodes and 4 data/ingest nodes.

We have 1 primary shard and 1 replica per index.

Index names are suffixed by date, and a new index is created every day (the typical time-based pattern).

The daily indices are huge (>400GB).

Furthermore, the data-plane resources are not being used efficiently: 2 of the data nodes are maxing out their requested 3 CPU cores (Kubernetes limits) and consuming far more disk space, while the other 2 data nodes sit nearly idle with much less data on disk.

What are we doing wrong? Should we increase replicas, primary shard count, both? Change our rollover strategy?
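For reference, the imbalance shows up in the `_cat` APIs (output omitted here):

```
GET _cat/allocation?v
GET _cat/shards?v&s=store:desc
```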

400 GB for one single shard is probably too much.

May I suggest you look at the following resources about sizing:

https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing

And https://www.elastic.co/webinars/using-rally-to-get-your-elasticsearch-cluster-size-right


Thanks @dadoonet,

So should we reduce our index size? These are daily indices, which your slide deck seems to recommend.

Does 1 primary + 1 replica seem right? Or should we increase primaries or replicas?

From your slide deck I understand that more primaries increase write performance, while more replicas increase read performance; is that right?
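As I understand it, the replica count can also be changed on a live index, while the primary count is fixed at index creation, e.g. (the index name is a placeholder):

```
PUT logs-2020.01.01/_settings
{
  "index.number_of_replicas": 2
}
```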

Correct.

I'd probably try with 400/50 = 8 primaries.
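A minimal composable index template along those lines, assuming Elasticsearch 7.8+ (the template name and the `logs-*` pattern are placeholders for your naming scheme):

```
PUT _index_template/daily-logs
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "index.number_of_shards": 8,
      "index.number_of_replicas": 1
    }
  }
}
```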


Would rolling over at 50 GB per index be an alternative? That is, keep 1 primary + 1 replica, with each index capped at no more than 50 GB?

We normally advise keeping shard size between 20 and 50 GB. But it depends on your use case, so you need to test it.
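If you go the rollover route, an ILM policy can cap primary shard size directly, assuming Elasticsearch 7.13+ where `max_primary_shard_size` is available (the policy name is a placeholder):

```
PUT _ilm/policy/size-based-rollover
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "50gb"
          }
        }
      }
    }
  }
}
```

On older versions, `max_size` (total index size) is the closest equivalent.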


Alright, we will try 8 primaries with 1 replica each, thanks.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.