QueueResizingEsThreadPoolExecutor exception

At some point during a 35k-user endurance load test of my Java web app, which fetches static data from Elasticsearch, I started getting the following Elasticsearch exception:

Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of org.elasticsearch.common.util.concurrent.TimedRunnable@1a25fe82 on QueueResizingEsThreadPoolExecutor[name = search4/search, queue capacity = 1000, min queue capacity = 1000, max queue capacity = 1000, frame size = 2000, targeted response rate = 1s, task execution EWMA = 10.7ms, adjustment amount = 50, org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutor@6312a0bb[Running, pool size = 25, active threads = 25, queued tasks = 1000, completed tasks = 34575035]]
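The numbers in the exception message can be turned into a rough capacity estimate. This is only a back-of-the-envelope reading on my part; all figures come straight from the message, the interpretation is mine:

```python
# Back-of-the-envelope reading of the rejected-execution message above.
# All numbers come straight from the exception; the interpretation is an assumption.

pool_size = 25            # "pool size = 25", all threads active
queue_capacity = 1000     # "queue capacity = 1000", min = max, so auto-resizing is pinned
task_ewma_s = 0.0107      # "task execution EWMA = 10.7ms"

# Rough per-node search throughput: 25 threads, each finishing a task every ~10.7 ms.
throughput_per_s = pool_size / task_ewma_s
print(f"max sustained rate ~ {throughput_per_s:.0f} searches/s per node")

# Time for a full queue to drain at that rate, i.e. the extra latency a queued search sees.
queue_drain_s = queue_capacity / throughput_per_s
print(f"full queue drains in ~ {queue_drain_s:.2f}s")

# A rejection therefore means the load test pushed more than ~2300 searches/s at
# this node for long enough to keep all 25 threads busy AND fill 1000 queue slots.
```

If this reading is right, the individual searches are fast; the problem is the arrival rate, not slow queries.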

Elasticsearch details:

Elasticsearch version: 6.2.4.

The cluster consists of 5 nodes. The JVM heap size for each node is set to Xms16g and Xmx16g. Each node machine has 16 processors.

NOTE: When I got this exception for the first time, I decided to increase the thread_pool.search.queue_size parameter in elasticsearch.yml, setting it to 10000. Yes, I understand that I only postponed the problem.
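For reference, this is the change I made (note that in 6.x the search pool is the auto-resizing kind, and my exception shows min queue capacity = max queue capacity = 1000, so the min/max queue settings may also matter; I only changed the one line below):

```yaml
# elasticsearch.yml - raising the search queue; this postpones rejections, it does not fix them
thread_pool.search.queue_size: 10000
```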

Elasticsearch indices details: Currently there are about 20 indices, and only 6 of them are in use. The unused ones are old indices that were not deleted after newer ones were created. The indices themselves are really small:

The index within the red rectangle is the one used by my web app. Its shard and replica settings are "number_of_shards": "5" and "number_of_replicas": "2" respectively. Its shard details:

In this article I found that:

Small shards result in small segments, which increases overhead. Aim to keep the average shard size between at least a few GB and a few tens of GB. For use-cases with time-based data, it is common to see shards between 20GB and 40GB in size.

As you can see from the screenshot above, my shard size is much smaller than the recommended size. Hence the question: what is the right number of shards in my case? Is it 1 or 2? The index won't grow much over time.
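Applying the rule of thumb from the quoted article mechanically, the shard count would just be the total index size divided by a 20 to 40 GB target, rounded up and never below 1. The sizes below are hypothetical placeholders (the exact index size isn't shown here), but for any index well under one target-sized shard the answer comes out as 1:

```python
import math

def recommended_shards(index_size_gb: float, target_shard_gb: float = 30.0) -> int:
    """Rule-of-thumb primary shard count: size / target, rounded up, minimum 1.

    target_shard_gb = 30 is the midpoint of the 20-40 GB range from the article.
    """
    return max(1, math.ceil(index_size_gb / target_shard_gb))

print(recommended_shards(0.5))   # a sub-GB index like mine -> 1 primary shard
print(recommended_shards(90.0))  # a hypothetical 90 GB index -> 3 primary shards
```

So by that rule a small, non-growing index would get 1 primary shard, with replicas used for read throughput.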

ES queries issued during the test: The load test simulates a scenario in which a user navigates to a page to search for products. The user can filter the products using the corresponding filters (e.g. name, city, etc.). The unique filter values are fetched from the ES index using a composite aggregation; this is the first query type. The other query fetches products from ES; it consists of must , must_not , filter , and has_child clauses, with the size attribute set to 100.
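To make the two query types concrete, here are sketches of what the request bodies look like. The field names (name, city, product_offer) are hypothetical, since the real mappings aren't shown here:

```python
# 1) Unique filter values via a composite aggregation (size: 0 skips the hits,
#    only the aggregation buckets come back; pages are walked with "after").
filter_values_body = {
    "size": 0,
    "aggs": {
        "unique_cities": {
            "composite": {
                "size": 100,
                "sources": [{"city": {"terms": {"field": "city"}}}],
            }
        }
    },
}

# 2) Product search: a bool query combining must / must_not / filter / has_child,
#    returning up to 100 hits, as described above.
product_search_body = {
    "size": 100,
    "query": {
        "bool": {
            "must": [
                {"match": {"name": "tv"}},
                {"has_child": {"type": "product_offer", "query": {"match_all": {}}}},
            ],
            "must_not": [{"term": {"discontinued": True}}],
            "filter": [{"term": {"city": "london"}}],
        }
    },
}
```

One thing worth noting about this mix: the composite aggregation and the has_child clause are both comparatively expensive per request, so each simulated page view costs several search tasks across all 5 primary shards, which multiplies the load on the search thread pool.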

I enabled slow search logging, but nothing was logged:

"index": {
                "search": {
                    "slowlog": {
                        "level": "info",
                        "threshold": {
                            "fetch": {
                                "debug": "500ms",
                                "info": "500ms"
                            },
                            "query": {
                                "debug": "2s",
                                "info": "1s"
                            }
                        }
                    }
                } 

I feel like I am missing something simple that would make this finally work properly and handle my load. I would appreciate it if anyone could help me solve this issue.