Why does the number of closed indices affect write performance?

wulixuan · August 23, 2021, 12:10pm

Hello:

I am testing the write performance of ES (v7.13.4). The number of shards being written to is constant, and I added many empty indices (3 shards each). I found that write performance degrades even when the added indices are closed. The index metadata and routing table are both map structures, so I would not expect their size to matter. Please tell me what causes this.

The test results are as follows:
Index number: 5790
Total shard number: 7974
Open shard: 7974

Index number: 10287
Total shard number: 17829
Open shard: 7974

Index number: 22126
Total shard number: 38524
Open shard: 7974

wulixuan · August 23, 2021, 12:14pm

Version 5.X does not have this problem

warkolm · August 23, 2021, 10:09pm

Why are you creating empty shards and then closing them?
How many nodes in the cluster?

wulixuan · August 24, 2021, 2:55am

This is a test, because our production cluster has many closed indices.
The cluster has 27 nodes, 3 of which are master nodes.

Christian_Dahlqvist · August 24, 2021, 3:34am

Recent versions of Elasticsearch keep track of closed indices, so I assume a large number of closed indices would increase the size of the cluster state more than before. I assume this could slow down cluster state updates if you create new indices as part of your indexing, or use dynamic mappings that result in mapping updates to the cluster state.

Are you using dynamic mappings? What does indexing performance look like if you index data into a few existing indices in a static format that does not result in new mappings? Why do you have so many closed indices in your cluster in the first place?
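
To see how much of the cluster is made up of closed indices, one could tally the rows returned by `GET _cat/indices?format=json&h=index,status,pri,rep`. The following is only a rough sketch: the helper name `summarize_indices` and the sample rows are made up for illustration, and it assumes each row carries `status`, `pri` and `rep` as strings, with blank values treated as zero.

```python
from collections import Counter


def _to_int(value):
    # Assumption: blank or missing columns count as zero.
    return int(value) if value not in (None, "") else 0


def summarize_indices(cat_rows):
    """Count indices and shard copies per status.

    cat_rows is a list of dicts shaped like parsed `_cat/indices` JSON rows
    (hypothetical input; no cluster is contacted here).
    """
    index_counts = Counter()
    shard_counts = Counter()
    for row in cat_rows:
        status = row["status"]
        primaries = _to_int(row.get("pri"))
        replicas = _to_int(row.get("rep"))
        index_counts[status] += 1
        # Each primary has `replicas` additional copies.
        shard_counts[status] += primaries * (1 + replicas)
    return {"indices": dict(index_counts), "shards": dict(shard_counts)}


# Hand-written sample rows, not real cluster output.
sample_rows = [
    {"index": "a", "status": "open", "pri": "3", "rep": "1"},
    {"index": "b", "status": "close", "pri": "3", "rep": "1"},
    {"index": "c", "status": "close", "pri": "3", "rep": ""},
]
summary = summarize_indices(sample_rows)
```

Comparing such a summary between test runs would show whether only the closed portion of the cluster grew while the open shard count stayed constant.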

wulixuan · August 24, 2021, 3:52am

I tested write performance after the indices had been created, and dynamic mapping is set to false.

We keep each index open for 3 days and retain the data for 14 days. If we need to search data older than 3 days, we only need to open the corresponding indices.
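
The policy described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual setup: it assumes one index per day, a hypothetical `logs-` name prefix, and that an index is open while younger than 3 days, closed until it reaches 14 days, and deleted afterwards.

```python
from datetime import date, timedelta

OPEN_DAYS = 3      # from the post: indices stay open for 3 days
RETAIN_DAYS = 14   # from the post: data is kept for 14 days


def index_action(index_day, today):
    """Return the desired state of a daily index: 'open', 'closed' or 'delete'."""
    age = (today - index_day).days
    if age < 0:
        raise ValueError("index_day is in the future")
    if age < OPEN_DAYS:
        return "open"
    if age < RETAIN_DAYS:
        return "closed"
    return "delete"


def plan(today, prefix="logs-"):
    """Map hypothetical daily index names to their desired state."""
    result = {}
    for age in range(RETAIN_DAYS + 1):
        day = today - timedelta(days=age)
        result[f"{prefix}{day.isoformat()}"] = index_action(day, today)
    return result


today = date(2021, 8, 24)
states = plan(today)
```

Under these assumptions, 11 of the 14 retained daily indices are closed at any time, which is how a cluster ends up with far more closed indices than open ones.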

Christian_Dahlqvist · August 24, 2021, 3:56am

Memory consumption related to shards has shrunk in recent versions compared to e.g. Elasticsearch 5.x, so if resource usage is the reason you are closing indices, this may no longer be required.

Creating lots of small indices and shards is however quite inefficient. Why are you creating so many small indices and shards?

wulixuan · August 24, 2021, 4:13am

Can this new feature be made into a parameter so that we can choose?

Christian_Dahlqvist · August 24, 2021, 4:13am

No, it is not configurable.

wulixuan · August 24, 2021, 4:16am

Thank you, then we can only change it ourselves.

Christian_Dahlqvist · August 24, 2021, 4:21am

It sounds like you are using Elasticsearch in an unusual way that is not necessarily following best practices. If you could describe your use case the community might be able to find a better way for you to shard and manage your data.

If I recall correctly, the change was introduced around the time when frozen indices were introduced. Before this, closed indices were not monitored by the cluster at all, and a node failure would not cause closed indices to be replicated to new nodes. For some use cases that relied on closed indices for handling very old data, a couple of nodes failing could lead to the loss of all shard copies for specific shards, resulting in permanent data loss and red indices. This is what I believe the change aimed at correcting.

system · September 21, 2021, 4:21am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
