We've got an issue that's had us scratching our heads for a while.
We have a pretty straightforward setup: 13 nodes, 1 index, 1000 shards with 2 replicas each, a lot of nested documents, a high write load and a light read load.
We take a snapshot every hour, and for some reason CPU load rockets to 80%+ during snapshot creation and a lot of segments get merged (or so it seems according to our metrics). I also noticed some not-so-typical translog behaviour during this time.
Any ideas are welcome, and we can provide more data if needed.
Welcome to our community!
What version are you on?
How large is the index?
What monitoring are you using?
Hey Mark, thx for welcoming!
We're on version 7.8.
The index is pretty large, around 2.3 TB with replicas, so around 800 GB of primary data spread across 1k shards. Each shard is not that big (around 0.8 GB).
We're monitoring the cluster using justwatchcom/elasticsearch_exporter (GitHub) and dashboards in Grafana.
I was looking through the sources last night and found that when a snapshot for a shard starts, the shard is flushed first (SnapshotShardsService::snapshot). What puzzles me is that shards should be flushed regularly anyway, so one more flush shouldn't make much difference, yet it does, and it leads to a lot of segments getting merged.
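One way to check whether the pre-snapshot flush really is what kicks off the merges is to sample the index stats right before and right after a snapshot and compare the counters. Below is a rough sketch of that comparison, assuming the two dicts are the JSON bodies of `GET /<index>/_stats/flush,merge,translog` taken at those two moments. The helper name `stats_delta` and the sample numbers are made up for illustration, and the field names are as I understand the 7.x response, so check them against a real one.

```python
from typing import Any, Dict

# Counters to compare, as (section, field) pairs under "_all" -> "total"
# in the index stats response (assumed layout, verify on your cluster).
COUNTERS = [
    ("flush", "total"),
    ("merges", "total"),
    ("merges", "total_time_in_millis"),
    ("translog", "uncommitted_operations"),
]


def stats_delta(before: Dict[str, Any], after: Dict[str, Any]) -> Dict[str, int]:
    """Return after-minus-before for each counter in COUNTERS."""
    deltas = {}
    for section, field in COUNTERS:
        old = before["_all"]["total"][section][field]
        new = after["_all"]["total"][section][field]
        deltas[f"{section}.{field}"] = new - old
    return deltas


# Made-up sample values, only to show the shape of the input.
before = {"_all": {"total": {
    "flush": {"total": 5000},
    "merges": {"total": 12000, "total_time_in_millis": 900000},
    "translog": {"uncommitted_operations": 250000},
}}}
after = {"_all": {"total": {
    "flush": {"total": 8000},
    "merges": {"total": 12900, "total_time_in_millis": 1500000},
    "translog": {"uncommitted_operations": 1000},
}}}

deltas = stats_delta(before, after)
for name, value in deltas.items():
    print(f"{name}: {value:+d}")
```

If the flush count jumps by roughly the number of shard copies while the merge count and merge time jump with it, that would point at the snapshot flush rather than at regular indexing.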
That many small shards is a real waste of resources and likely the cause of the CPU spike, as the snapshot has to prep all 1K of them.
Thanks for the quick reply!
What would be the suggestion here? Last year we had about 20 shards, and they quickly reached the 30-50 GB per shard mark, at which point we had to reindex everything. It turns out that having too many shards causes issues too. Are we bound to reindex our data from time to time as the index keeps growing? I think we could use the shrink API for now to scale it down, but the time to reindex will come sooner or later
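For the shrink option, the target shard count has to be a factor of the source shard count, so with 1000 shards only some targets are possible. Here is a quick sketch of the arithmetic, using the numbers from this thread (about 800 GB of primary data, 1000 shards, a 30-50 GB per shard range). The function names are made up for illustration, and it assumes data is spread evenly across shards.

```python
from typing import List

PRIMARY_GB = 800        # approximate primary data size from this thread
SOURCE_SHARDS = 1000    # current number of primary shards


def shrink_targets(source_shards: int) -> List[int]:
    """Shard counts below source_shards that divide it evenly."""
    return [n for n in range(1, source_shards) if source_shards % n == 0]


def shard_size_gb(primary_gb: float, shards: int) -> float:
    """Average primary shard size, assuming an even spread of data."""
    return primary_gb / shards


targets = shrink_targets(SOURCE_SHARDS)

# Keep only the targets whose average shard size lands in the 30-50 GB range.
in_range = [n for n in targets if 30 <= shard_size_gb(PRIMARY_GB, n) <= 50]

for n in targets:
    marker = "  <- 30-50 GB" if n in in_range else ""
    print(f"{n:>4} shards -> {shard_size_gb(PRIMARY_GB, n):8.1f} GB per shard{marker}")
```

Shrinking all the way into that range leaves no headroom for growth, so a larger factor is probably the safer pick if the index keeps growing.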
I have a feeling that this whole thing is becoming pretty unmanageable and that we'd better split our data into multiple indices and add new
What sort of data is it?
Hey, sorry for not replying for a long time!
What characteristics of data do you mean here?
There are a number of root-level documents, and each of them has a lot of nested ones. Texts are pretty standard