Hi everyone
Elasticsearch was complaining that we had reached the maximum number of shards, so we increased max_shards_per_node from the default of 1000 to 2000 and later to 5000.
Then we noticed hundreds of empty indices with 0 documents.
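For anyone who wants to run the same empty-index check, here is a minimal sketch in Python. It assumes the rows come from `GET _cat/indices?format=json`, where `docs.count` arrives as a string. The function name and the sample rows are made up for illustration and are not from our cluster.

```python
def find_empty_indices(cat_rows):
    """Return the sorted names of indices whose docs.count is 0."""
    empty = []
    for row in cat_rows:
        count = row.get("docs.count")
        # docs.count may be missing or null (e.g. closed indices); skip those rows
        if count is not None and int(count) == 0:
            empty.append(row["index"])
    return sorted(empty)


# Hypothetical rows shaped like the _cat/indices JSON output
sample_rows = [
    {"index": "example-index-000079", "docs.count": "0"},
    {"index": "example-index-000001", "docs.count": "1200"},
    {"index": "example-index-000080", "docs.count": "0"},
    {"index": "example-closed-index", "docs.count": None},
]

print(find_empty_indices(sample_rows))
```

The result is only a list of candidates; it is worth checking that an index is not the current write index before deleting it.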
Investigation showed that Fleet is running rollover on all APM indices without any reason that is obvious to us.
We are currently on index-000080, even though the index was created only a few days ago and is less than 10 GB.
{"type":"audit", "timestamp":"2023-09-05T11:33:39,022+0000", "node.id":"XXX", "event.type":"transport", "event.action":"access_granted", "user.name":"elastic/fleet-server", "user.realm":"_service_account", "user.roles":["elastic/fleet-server"], "origin.type":"transport", "origin.address":"XXX", "request.id":"XXX", "action":"indices:admin/auto_create", "request.name":"CreateIndexRequest", "indices":["traces-apm"], "x_forwarded_for":"XXX"}
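To see how often this happens and for which targets, audit lines like the one above can be tallied. This is a rough sketch that assumes one JSON object per line with the same flat field names as in our log (`action`, `user.name`, `indices`). The function name and the sample lines are hypothetical.

```python
import json
from collections import Counter


def count_auto_creates(lines):
    """Count indices:admin/auto_create audit events per (user, index) pair."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            # Ignore lines that are not valid JSON
            continue
        if event.get("action") != "indices:admin/auto_create":
            continue
        user = event.get("user.name", "unknown")
        for index in event.get("indices", []):
            counts[(user, index)] += 1
    return counts


# Hypothetical, shortened audit lines in the same shape as the one above
sample_lines = [
    '{"type":"audit","user.name":"elastic/fleet-server","action":"indices:admin/auto_create","indices":["traces-apm"]}',
    '{"type":"audit","user.name":"elastic/fleet-server","action":"indices:admin/auto_create","indices":["traces-apm"]}',
    '{"type":"audit","user.name":"someone","action":"indices:data/read/search","indices":["traces-apm"]}',
    'not json at all',
]

for (user, index), n in count_auto_creates(sample_lines).items():
    print(user, index, n)
```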
ILM is stopped (our first idea was that ILM was going crazy).
This causes the hot nodes to become overloaded and go down.
Fleet has just exploded the shard count to 31K (we expected fewer than 6K).
All new shards have 0 documents. The rollover generation should be somewhere in 000001-000010, but it is currently at 000080.
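To see which data streams have run away, the highest rollover generation can be read from the backing index names. The sketch below assumes the usual `.ds-<data-stream>-<yyyy.MM.dd>-<generation>` naming for backing indices. The function name and the sample index names are invented for illustration.

```python
import re
from collections import defaultdict

# Assumed backing index pattern: .ds-<data-stream>-<yyyy.MM.dd>-<6-digit generation>
BACKING_INDEX = re.compile(
    r"^\.ds-(?P<stream>.+)-(?P<date>\d{4}\.\d{2}\.\d{2})-(?P<gen>\d{6})$"
)


def latest_generations(index_names):
    """Map each data stream to the highest generation seen in its backing indices."""
    latest = defaultdict(int)
    for name in index_names:
        match = BACKING_INDEX.match(name)
        if not match:
            # Not a data stream backing index; ignore it
            continue
        stream = match.group("stream")
        latest[stream] = max(latest[stream], int(match.group("gen")))
    return dict(latest)


# Hypothetical index names
sample_names = [
    ".ds-traces-apm-default-2023.09.01-000001",
    ".ds-traces-apm-default-2023.09.05-000080",
    ".ds-metrics-apm.internal-default-2023.09.05-000003",
    "some-regular-index",
]

print(latest_generations(sample_names))
```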
We manually deleted all empty indices to protect our hot nodes from overloading.
max_shards_per_node was lowered back to 1000 (it had been 2500 and 5000). That also stops Fleet from rolling over our indices.
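As far as we understand, the cluster-wide limit is cluster.max_shards_per_node multiplied by the number of non-frozen data nodes, which would explain why lowering the setting blocks new index creation. Below is a small sketch of that arithmetic. The node count of 10 is a made-up assumption, not our real topology; the shard counts are the ones mentioned above.

```python
def cluster_shard_limit(max_shards_per_node, data_nodes):
    """Cluster-wide shard limit: per-node setting times non-frozen data nodes."""
    return max_shards_per_node * data_nodes


def can_create_shards(current_shards, new_shards, max_shards_per_node, data_nodes):
    """True if adding new_shards keeps the cluster at or under its limit."""
    limit = cluster_shard_limit(max_shards_per_node, data_nodes)
    return current_shards + new_shards <= limit


DATA_NODES = 10  # hypothetical node count, for illustration only

for setting in (1000, 5000):
    for shards in (6_000, 31_000):
        allowed = can_create_shards(shards, 2, setting, DATA_NODES)
        print(f"max_shards_per_node={setting} shards={shards} can_add_index={allowed}")
```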
How can we check what is triggering Fleet?
Neither the logs nor the Fleet UI show anything related, only the fact that an index was created.