I'm not sure if there is a hard upper limit, but I can share a terrible experience I had while troubleshooting someone's Elasticsearch 1.7 cluster three weeks ago. It had many problems, including severe memory pressure, but the biggest one seemed to be that cluster state updates weren't being applied in a timely manner.
curl -XGET 'http://localhost:9200/_cluster/pending_tasks?pretty'
{
  "tasks" : [ {
    "insert_order" : 14670,
    "priority" : "HIGH",
    "source" : "update-mapping [terrible_index][result] / node [3TMP9qc7S-SXjr4leMpyKQ], order [1]",
    "executing" : true,
    "time_in_queue_millis" : 270763,
    "time_in_queue" : "4.5m"
  }, {
    "insert_order" : 14671,
    "priority" : "HIGH",
    "source" : "update-mapping [terrible_index][result] / node [3TMP9qc7S-SXjr4leMpyKQ], order [2]",
    "executing" : false,
    "time_in_queue_millis" : 269696,
    "time_in_queue" : "4.4m"
  },
  ...
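If you see a stream of update-mapping tasks like this, one quick sanity check (just a sketch, reusing the index name from the output above) is to dump the mapping and see how large it has grown:

curl -sXGET 'http://localhost:9200/terrible_index/_mapping?pretty' | wc -l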
In https://www.elastic.co/guide/en/elasticsearch/guide/current/_pending_tasks.html, I found this tidbit:
Since a cluster can have only one master, only one node can ever process cluster-level metadata changes. For 99.9999% of the time, this is never a problem. The queue of metadata changes remains essentially zero.
In some _rare_ clusters, the number of metadata changes occurs faster than the master can process them.
Imagine my dismay when the queue I was looking at took 45 minutes to empty.
That index turned out to be pretty small, just 1100 documents, but its mapping JSON was 4 million lines long and growing. I deleted the index and everything has been green since. The cluster still has several hundred indices and ~3K shards, and I haven't seen any pending cluster tasks pile up since.
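My guess is that dynamic mapping was adding a new field for every new key it saw, which is what bloated the mapping. In case it helps anyone hitting the same thing, here is a minimal sketch of clamping that down, using the same index and type names as in the output above; setting dynamic to strict makes documents with unexpected fields fail instead of silently growing the mapping (newer Elasticsearch versions also offer index.mapping.total_fields.limit to cap the field count):

curl -XPUT 'http://localhost:9200/terrible_index/_mapping/result' -d '{
  "result" : {
    "dynamic" : "strict"
  }
}'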
So my pointer would be to check _cluster/pending_tasks periodically to make sure your cluster isn't in the 0.0001%.
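If you want to automate that, here is a rough sketch (host, port, and interval are just placeholders for illustration). _cluster/health also reports a number_of_pending_tasks field, which is cheaper to poll than the full task list:

# Poll the pending-task count once a minute; anything persistently above zero deserves a look.
watch -n 60 "curl -s 'http://localhost:9200/_cluster/health?pretty' | grep number_of_pending_tasks"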