Hi,
I have an Elasticsearch cluster that is running into a high segment count, which is making searching slow. There are 110 million documents, each averaging about 6KB, for roughly 350GB of total storage. Documents are upserted constantly (new versions of documents arrive almost continuously, so we have to upsert), and during heavy upserting we have seen the segment count reach between 5,000 and 6,000, at which point search becomes too slow. There are 58 indices, each set up with 1 primary shard and 3 replicas. We would prefer not to force merge, since that would also slow down searches while it runs.
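For illustration, the write pattern is essentially bulk upserts along these lines (the index name, document ID, and fields here are placeholders, not our real mapping):

POST /_bulk
{ "update": { "_index": "my-index-000001", "_id": "doc-42" } }
{ "doc": { "status": "active", "updated_at": "2024-05-01T00:00:00Z" }, "doc_as_upsert": true }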
We have tried setting the following on all of our indices:
{
  "index.merge.policy.max_merge_at_once": 4,
  "index.merge.policy.max_merge_at_once_explicit": 4,
  "index.merge.policy.max_merged_segment": "30gb",
  "index.merge.policy.segments_per_tier": 4,
  "index.merge.policy.floor_segment": "20gb"
}
We hoped this would get Elasticsearch to merge segments down on its own, but it didn't work. We have force merged once to double-check that the problem really was the segment count, and it definitely was, so if Elasticsearch could keep the segment count down by itself, without us having to force merge, that would be perfect.
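For reference, this is roughly how we applied the settings and then checked segment counts and force merged to confirm the diagnosis (the index name is a placeholder; in practice we loop over all 58 indices):

PUT /my-index-000001/_settings
{
  "index.merge.policy.max_merge_at_once": 4,
  "index.merge.policy.max_merge_at_once_explicit": 4,
  "index.merge.policy.max_merged_segment": "30gb",
  "index.merge.policy.segments_per_tier": 4,
  "index.merge.policy.floor_segment": "20gb"
}

GET /_cat/segments/my-index-000001?v

POST /my-index-000001/_forcemerge?max_num_segments=1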
The setup we have is the following:
6 data nodes, each with 50GB of memory and 25GB given to the JVM; each data node has four 375GB NVMe disks attached.
3 master nodes, each with 25GB of memory and 12.5GB given to the JVM; each master node has a 30GB pd-ssd.
3 coordinating nodes, each with 25GB of memory and 12.5GB given to the JVM, no disks.
3 ingest nodes, each with 25GB of memory and 12.5GB given to the JVM, no disks.
Any help on this would be greatly appreciated.