Does having a lot of indices in an Elasticsearch cluster have an impact on the performance of Elasticsearch?

I am wondering: if an Elasticsearch cluster has a lot of indices (maybe 100 or 1000), does the cluster have performance issues?

It's more the number of shards that can cause issues.

More shards in a small cluster?

Lots of shards, irrespective of the cluster size.

Thanks

What do you consider "lots of shards"? Would 256 indices with 5 shards and 1 replica each, on ten servers, qualify? That is 2560 shards. I also have a second cluster that has well over 20k shards. I agree that too many shards is a bad thing, but I have never seen anything discussing the number of shards in relation to the number of indices or the number of nodes/servers. I would assume 200,000 shards on a cluster of 100 nodes is different than on 20 nodes. Thoughts?
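For reference, here is a rough sketch of the arithmetic behind those numbers. The helper names `total_shards` and `shards_per_node` are made up for illustration, and the per-node figure assumes shards are spread evenly across nodes, which a real cluster may not do.

```python
def total_shards(indices: int, primaries_per_index: int, replicas: int) -> int:
    """Every primary shard has `replicas` copies, so each index
    contributes primaries * (1 + replicas) shards in total."""
    return indices * primaries_per_index * (1 + replicas)


def shards_per_node(total: int, nodes: int) -> float:
    """Average shards per node, assuming an even spread (an assumption)."""
    return total / nodes


# 256 indices, 5 primaries each, 1 replica, on ten servers
total = total_shards(indices=256, primaries_per_index=5, replicas=1)
print(total)                                # 2560
print(shards_per_node(total, nodes=10))     # 256.0

# The same 200,000 shards on 100 nodes versus 20 nodes
print(shards_per_node(200_000, nodes=100))  # 2000.0
print(shards_per_node(200_000, nodes=20))   # 10000.0
```

So the same shard count works out to a five times higher per-node load on the smaller cluster, which is why I would expect the two cases to behave differently.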

I think that should be fine. I ran ES 1.x with 1600 indexes, mostly with 1 shard and 2 replicas. I ended up with something like 6000 shards on 21 servers. Cluster state updates took some time but pushed through, and it wasn't bad. I noticed it, though. In 2.x Elasticsearch got cluster state diffs, which really improved performance for large numbers of shards, so I expect that to be fine.

Large numbers of shards can cause pending cluster state tasks to wait for a long time. So the thing to watch is the age of the oldest task in the pending cluster tasks API. The actual count of tasks is fairly bursty and not really a big deal because lots of them can be processed at once.
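A minimal sketch of that check, assuming you have already fetched and JSON-decoded the response of `GET /_cluster/pending_tasks`. The function name and the sample data are made up for illustration, and the `tasks` / `time_in_queue_millis` field names are what I would expect from that API, so verify them against your version.

```python
def oldest_pending_task_millis(response: dict) -> int:
    """Return the longest time any pending cluster task has waited, in ms.

    `response` is the decoded JSON body of GET /_cluster/pending_tasks.
    Returns 0 when the queue is empty.
    """
    tasks = response.get("tasks", [])
    return max((task["time_in_queue_millis"] for task in tasks), default=0)


# Hypothetical sample response, not real cluster output
sample = {
    "tasks": [
        {"insert_order": 101, "priority": "URGENT",
         "source": "create-index [foo]", "time_in_queue_millis": 86},
        {"insert_order": 46, "priority": "HIGH",
         "source": "shard-started", "time_in_queue_millis": 842},
    ]
}

print(oldest_pending_task_millis(sample))  # 842
print(len(sample["tasks"]))                # 2 (the bursty count, less useful)
```

Alerting on the oldest age rather than on the task count avoids false alarms from short bursts that clear quickly.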

Since cluster state is replicated to all the nodes, larger cluster states are worse for larger clusters. Less bad in 2.x than in 1.x. Much less bad, actually. But it is still a thing.

There are also per shard and per field overheads that only exist on the nodes hosting the shards. These don't tend to come up as a problem nearly as often when you have many indexes as cluster state maintenance.
