We have an Elasticsearch cluster with 9 nodes with the following settings:
- Elasticsearch Version 5.1.2
- One Index in Cluster
- Primary Shard Storage Size: 3GB
- Number of Shards: 5
- Number of Replica: 3
- Node-1, Node-2 and Node-3 Master Only Nodes
- Node-4 through Node-9 Data Only Nodes
- No Parent Child Relationtip in Mappings
- Each node 24 GB of Ram, 18 Cores of CPU
- Disabled Swaped, Increased Open File Descriptor, 12 GB JVM Heap Memory
- Elasticsearch Nest Client 'Static' Adaptor and List of all Nodes IPs
As you see we have an over allocation of resources on our nodes but under stress test only one node uses all it's available search threads. As I mentioned we have 18 cores and according to default search thread limit we have (3*18/2)+1 = 28 search threads in each node.
Problems:
- Http Requests Are Not Balanced
- Other nodes don't use all their search threads. One nodes uses It's all threads and It's search queue gets large
What we have tested:
- Use one coordinator node to balance requests (no change)
How we send requests: - We use Elasticsearch as a Search Engine and a
Jmeter
is used to put stress test on search services. Test services are web services which call Some SearchTemplates using Elasticsearch Nest Client
[![Open HTTP Requests and Search Thread Pool][1]][1]
[![Cpu Usages][2]][2]
[![Queru Count and Fetch Count][3]][3]
Any idea is appreciated.
[1]: https://i.stack.imgur.com/H9qq6.png
[2]: https://i.stack.imgur.com/KYRAx.png
[3]: https://i.stack.imgur.com/n4I9m.png