Slow Index Deletion


#1

We have a 2 node cluster with close to 16,000 shards (8,000 primaries and 8,000 replicas). Both nodes have 64GB of RAM, of which 30GB is allocated to ES. We are using ES version 1.4.2. When we try to delete an index, it takes more than a minute, and sometimes up to 2 minutes. It used to be much faster; the operation used to complete within seconds. How do we go about troubleshooting and fixing this issue? Any suggestions are appreciated.


(Aaron Mildenstein) #2

There isn't any troubleshooting, really. The issue is that you have a 2 node cluster with a huge number of shards per node (8000 each). Any time you delete an index, the cluster has to rebalance and rebuild the cluster state. This is especially bad in ES versions < 1.7 as the entire cluster state is pushed to the other nodes with each change. In newer versions, only a delta is pushed, saving network traffic.

If you need things to be faster, you need to reduce the shard count on each node to a much more manageable level. You can do this by deleting indices, or by adding nodes. A healthier number of shards to manage per node would be at or below the 1,000 shards per node level.
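To put rough numbers on that, here is a back-of-the-envelope sketch in Python using the figures from this thread (16,000 shards on 2 nodes) and the 1,000 shards per node guideline above. The function names are made up for illustration, and the math assumes shards are spread evenly across nodes.

```python
import math

def shards_per_node(total_shards, node_count):
    """Average number of shards per node, assuming an even spread."""
    return total_shards / node_count

def nodes_needed(total_shards, max_shards_per_node=1000):
    """Smallest node count that keeps the average at or below the limit."""
    return math.ceil(total_shards / max_shards_per_node)

def shard_budget(node_count, max_shards_per_node=1000):
    """Most shards the cluster should hold at the given per-node limit."""
    return node_count * max_shards_per_node

print(shards_per_node(16000, 2))  # 8000.0 shards per node today
print(nodes_needed(16000))        # 16 nodes to carry the current shard count
print(shard_budget(2))            # 2000 shards if the cluster stays at 2 nodes
```

So the two options are roughly: grow to around 16 nodes, or get the total down to around 2,000 shards on the existing 2 nodes.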


#3

Thanks for the quick reply, Aaron. As a short-term measure, if we upgrade to 1.7, are we going to see better performance for these operations?

Thanks
Ashok


(Aaron Mildenstein) #4

There would be some improvement, but I don't know that it would be a big leap in performance. My guess is that the upgrade would shave a few seconds off. But it will not take you from 2 minutes down to 2 seconds.

The fundamentals are just that you've heavily overloaded a 2 node cluster, and that's where the big problem lies.


#5

Thanks a lot, Aaron. Deleting data is not an option for us to reduce the shards. To reduce the shards, we have to re-index the data as well as re-architect our product so that, moving forward, too many shards are not created. Since there is no easy way out, I guess we will have to bite the bullet and do both.

Thanks again
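For planning the re-index, the total shard count follows from the number of indices, the primaries per index, and the replica count. A minimal sketch in Python, with hypothetical function names; the 1,600 indices with 5 primaries each is only an assumed breakdown that happens to match the 16,000 total in this thread, since the actual index count was not given.

```python
def total_shards(index_count, primaries_per_index, replicas):
    """Every primary shard has `replicas` copies, so each index
    contributes primaries * (1 + replicas) shards."""
    return index_count * primaries_per_index * (1 + replicas)

def max_indices(shard_budget, primaries_per_index, replicas):
    """How many indices fit within a total shard budget."""
    return shard_budget // (primaries_per_index * (1 + replicas))

# Assumed breakdown: 1,600 indices, 5 primaries each, 1 replica.
print(total_shards(1600, 5, 1))   # 16000

# With a 2,000 shard budget (2 nodes at 1,000 shards each) and 1 replica:
print(max_indices(2000, 5, 1))    # 200 indices at 5 primaries each
print(max_indices(2000, 1, 1))    # 1000 indices at 1 primary each
```

Fewer primaries per index, or fewer and larger indices, both bring the total down.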


(Christian Dahlqvist) #6

Cluster state delta updates were introduced in Elasticsearch 2.0, so version 1.7 will not help you there. Reducing the number of shards is the right way to go though, even if it can be painful.


#7

Thanks, Christian. We will reduce the number of shards.

