How to handle large indices with non-time-series data

Hello Everyone:

We are using Elasticsearch v7.8.0, and some of our clusters run version 8.10.4.

Our indices each store 40 million records, and each index is created with 5 primary shards.

As we do not have time-series data, we cannot use the rollover index feature, because we need to be able to update and delete old data in an index.

Hence we would like to understand how to manage such huge indices, specifically how to increase the number of primary shards without downtime, as data is pushed continuously.

Is there any other feature in Elasticsearch, similar to rollover, that prevents a single index from growing too large?

Currently each of our shards is 35 GB.

Thank You

The only mechanisms available in Elasticsearch to increase the number of primary shards are reindexing and the split index API, both of which require downtime as new indices are created.
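For reference, here is a minimal sketch of what the split path involves. It only builds the two REST requests and does not send anything. The function name and the index names `records-v1` / `records-v2` are made up for illustration. The source index has to be write-blocked before the split, which is where the indexing pause comes from, and the target shard count must be a multiple of the source shard count (so 5 can go to 10, 20 and so on, subject to the index's `index.number_of_routing_shards` setting).

```python
import json


def split_requests(source, target, source_shards, target_shards):
    """Return the (method, path, body) requests needed to split an index.

    Hypothetical helper: it only describes the requests, it does not send them.
    """
    if target_shards <= source_shards or target_shards % source_shards != 0:
        raise ValueError(
            "target shard count must be a larger multiple of the source shard count"
        )
    return [
        # 1. Block writes on the source index; indexing into it stops here.
        ("PUT", f"/{source}/_settings",
         {"settings": {"index.blocks.write": True}}),
        # 2. Split into a new index with more primary shards.
        ("POST", f"/{source}/_split/{target}",
         {"settings": {"index.number_of_shards": target_shards}}),
    ]


if __name__ == "__main__":
    # Index names are assumptions for this example.
    for method, path, body in split_requests("records-v1", "records-v2", 5, 10):
        print(method, path, json.dumps(body))
```

After the split completes, the application (or an alias) still has to be pointed at the new index before writes can resume.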

Other solutions depend on the type of data you are indexing and how you are indexing it. You may be able to add logic to your indexing tier to keep track of which index each document goes to. This way you can create a new index when needed and start routing new documents to it. This can be seamless and done without downtime, but it requires changes to your application and ingest pipeline.
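As a rough illustration of that kind of indexing-tier logic, here is a minimal in-memory sketch. Everything in it is an assumption made for the example: the `IndexRouter` class, the `records-000001` naming scheme and the document-count threshold are not Elasticsearch features. A real implementation would persist the mapping in a durable store and would create the new index in Elasticsearch when the generation changes.

```python
class IndexRouter:
    """Tracks which index each document lives in (hypothetical helper)."""

    def __init__(self, prefix, max_docs_per_index):
        self.prefix = prefix
        self.max_docs = max_docs_per_index
        self.generation = 1
        self.doc_to_index = {}  # document id -> index name
        self.doc_counts = {}    # index name -> number of tracked documents

    def _current_index(self):
        return f"{self.prefix}-{self.generation:06d}"

    def index_for(self, doc_id):
        """Return the index to use when writing, updating or deleting doc_id."""
        # Known documents stay where they are, so updates hit the right index.
        if doc_id in self.doc_to_index:
            return self.doc_to_index[doc_id]
        # New documents go to the current index; start a new one when it is full.
        if self.doc_counts.get(self._current_index(), 0) >= self.max_docs:
            self.generation += 1
        name = self._current_index()
        self.doc_to_index[doc_id] = name
        self.doc_counts[name] = self.doc_counts.get(name, 0) + 1
        return name

    def forget(self, doc_id):
        """Drop a deleted document from the mapping; returns its index or None."""
        name = self.doc_to_index.pop(doc_id, None)
        if name is not None:
            self.doc_counts[name] -= 1
        return name


if __name__ == "__main__":
    router = IndexRouter("records", max_docs_per_index=2)
    for doc_id in ["a", "b", "c", "a"]:
        print(doc_id, "->", router.index_for(doc_id))
```

The important property is that an existing document always resolves to the index it was first written to, so updates and deletes of old data keep working while new documents flow into the newest index.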

Hello @Christian_Dahlqvist

Thank you for your response.

Could you please help me with an example or a URL link to understand how to keep track of which document goes to which named index in the ES cluster?

I am very new to the Elasticsearch development side.

Your response is much appreciated.

Thank You

There is nothing in Elasticsearch that supports that, so it is something you will need to build yourself. I suspect how you do that depends on where the data comes from and how you update it. If, for example, you are storing data related to customers, you may add a parameter to each customer outside of Elasticsearch indicating the index its data is stored in, and then use this to send data related to that customer to the correct index(es).
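A minimal sketch of the customer example, assuming the per-customer parameter is available as a simple lookup. The `CUSTOMER_INDEX` dictionary, the index names and the document fields are all hypothetical; in practice the lookup would come from wherever your customer records live. The function builds a request body in the newline-delimited format the `_bulk` API expects, with each action line naming the target index for that customer's document.

```python
import json

# Hypothetical lookup kept outside Elasticsearch, e.g. a column on the
# customer record in your primary database.
CUSTOMER_INDEX = {
    "cust-1": "customers-a",
    "cust-2": "customers-b",
}


def bulk_body(docs, customer_index):
    """Build an NDJSON _bulk body that sends each doc to its customer's index."""
    lines = []
    for doc in docs:
        index = customer_index[doc["customer_id"]]
        # Action line: which index and document id this write targets.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc["id"]}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # A bulk body must end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    docs = [
        {"id": "1", "customer_id": "cust-1", "status": "active"},
        {"id": "2", "customer_id": "cust-2", "status": "closed"},
    ]
    print(bulk_body(docs, CUSTOMER_INDEX), end="")
```

When an index grows too large, new customers can be assigned to a new index simply by storing a different value for them, while existing customers keep their current index.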

Okay, understood!

Thank you @Christian_Dahlqvist
