I currently have a cluster of ten nodes that are also web front ends. There is one shard, and it is replicated to every node by setting "auto_expand_replicas" to "0-all". All searches are made locally, from the front end to the node on the same server, using the "_only_local" preference.
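As a rough illustration of this setup (a sketch, not the actual configuration), the index settings and a local-only search URL could look like the following. The index name `myindex`, the host, and the port are placeholders, and the helper `local_search_url` is hypothetical.

```python
import json
from urllib.parse import urlencode

# Index settings matching the description: one primary shard, with replicas
# expanded automatically to every node in the cluster.
INDEX_SETTINGS = {
    "settings": {
        "index": {
            "number_of_shards": 1,
            "auto_expand_replicas": "0-all",
        }
    }
}


def local_search_url(index, host="localhost", port=9200):
    """Build a _search URL restricted to shard copies on the receiving node.

    The index name, host and port are placeholders for illustration.
    """
    query = urlencode({"preference": "_only_local"})
    return "http://{}:{}/{}/_search?{}".format(host, port, index, query)


if __name__ == "__main__":
    print(json.dumps(INDEX_SETTINGS, indent=2))
    print(local_search_url("myindex"))
```

With this, each front end would send its searches to the URL built for its own local node.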
In production, everything is fine for the first 30-60 minutes, with up to 50 searches per second on a single node. Then, for an unknown reason, the thread pool and channel counts explode at the same time, and queries become very slow (many seconds) because they wait in the queue before being executed. Here are the Bigdesk graphs showing this:
I can't understand what happens at that particular moment to make queries wait in the queue. Any ideas?
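One way to watch the search queue outside Bigdesk would be to poll the node stats API and log the counters over time. The sketch below assumes the `/_nodes/stats/thread_pool` endpoint and a `search` pool exposing `active`, `queue`, and `rejected` counters. The function names are hypothetical, and the sample payload is made up for illustration; it is not data from this cluster.

```python
import json
import urllib.request


def search_pool_summary(stats):
    """Extract active/queue/rejected counters of the search pool per node."""
    summary = {}
    for node_id, node in stats.get("nodes", {}).items():
        pool = node.get("thread_pool", {}).get("search", {})
        summary[node.get("name", node_id)] = {
            "active": pool.get("active", 0),
            "queue": pool.get("queue", 0),
            "rejected": pool.get("rejected", 0),
        }
    return summary


def fetch_stats(base_url="http://localhost:9200"):
    """Fetch thread pool stats from a node (base_url is a placeholder)."""
    url = base_url + "/_nodes/stats/thread_pool"
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Made-up payload in the assumed response shape, for illustration only.
SAMPLE = {
    "nodes": {
        "abc123": {
            "name": "front-01",
            "thread_pool": {
                "search": {"threads": 12, "active": 12, "queue": 340, "rejected": 7}
            },
        }
    }
}

if __name__ == "__main__":
    print(search_pool_summary(SAMPLE))
```

Calling `search_pool_summary(fetch_stats())` every few seconds and logging the result would show whether the queue grows gradually or jumps at once when the slowdown starts.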
I'm using Elasticsearch 1.5.0. Configuration file /etc/default/elasticsearch:
Thanks for your help.