Why does the number of closed indices affect write performance?


I am testing the write performance of ES (v7.13.4). The number of shards being written to is constant, but I added many empty indices (each with 3 shards). I then found that write performance degrades even though the extra indices are closed. Indices metadata and the routing table are both map structures, so lookups there should not be affected. What is causing this?

The test results are as follows:
Index number: 5790
Total shard number: 7974
Open shard: 7974

Index number: 10287
Total shard number: 17829
Open shard: 7974

Index number: 22126
Total shard number: 38524
Open shard: 7974
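The counts above can be reproduced with the cat APIs. This is a minimal sketch, assuming a cluster reachable at `localhost:9200` (the URL is an assumption, not from the thread):

```shell
# Count indices grouped by status (open vs. close)
curl -s 'localhost:9200/_cat/indices?h=status' | sort | uniq -c

# Count shards grouped by state (STARTED, UNASSIGNED, ...)
curl -s 'localhost:9200/_cat/shards?h=state' | sort | uniq -c
```

Note that in 7.x closed indices still appear in `_cat/indices` with status `close`, which is part of the behavior change discussed below.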

Version 5.x does not have this problem.

Why are you creating empty shards and then closing them?
How many nodes are in the cluster?

This is a test; our production cluster has many closed indices.
The cluster has 27 nodes, 3 of which are master nodes.

In recent versions Elasticsearch keeps track of closed indices, so I assume a large number of closed indices would increase the size of the cluster state more than before. This could potentially slow down cluster state updates if you create new indices as part of your indexing, or if you use dynamic mappings that result in mapping updates to the cluster state.
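One way to check this hypothesis is to compare the serialized size of the cluster state metadata as closed indices accumulate. A rough sketch, assuming `localhost:9200` (the URL is an assumption):

```shell
# Approximate size in bytes of the cluster state metadata section,
# which grows with the number of tracked (including closed) indices
curl -s 'localhost:9200/_cluster/state/metadata' | wc -c

# List index states kept in the cluster state, including closed ones
curl -s 'localhost:9200/_cluster/state?filter_path=metadata.indices.*.state'
```

If the metadata size grows roughly in proportion to the closed-index count, that supports the cluster-state-update explanation.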

Are you using dynamic mappings? What does indexing performance look like if you index data into a few existing indices in a static format that does not result in new mappings? Why do you have so many closed indices in your cluster in the first place?

I tested write performance after the indices were created, and dynamic mapping is set to false.

We keep indices open for 3 days and retain data for 14 days. If we need to search data older than 3 days, we only need to reopen those indices.

Memory consumption related to shards has shrunk in recent versions compared to e.g. Elasticsearch 5.x, so if resource usage is the reason you are closing indices, this may no longer be necessary.

Creating lots of small indices and shards is however quite inefficient. Why are you creating so many small indices and shards?

Could this behavior be made configurable, so that we can choose?

No, it is not configurable.

Thank you; then our only option is to change it ourselves.

It sounds like you are using Elasticsearch in an unusual way that is not necessarily following best practices. If you could describe your use case the community might be able to find a better way for you to shard and manage your data.

If I recall correctly, the change was introduced around the time frozen indices were added. Before that, closed indices were not monitored by the cluster at all, and a node failure would not cause closed indices to be replicated to new nodes. For use cases that relied on closed indices for holding very old data, a couple of nodes failing could destroy all copies of specific shards, resulting in permanent data loss and red indices. This is what I believe the change aimed to correct.

