I have a two-node cluster hosted on Elastic Cloud:
| Setting | Value |
|----------|---------------------|
| Host | Elastic Cloud |
| Platform | Google Cloud |
| Region | US Central 1 (Iowa) |
| Memory | 8 GB |
| Storage | 192 GB |
| SSD | Yes |
| HA | Yes |
Each node has:
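| Setting | Value |
|----------------------|--------|
| Allocated processors | 2 |
| Number of processors | 2 |
| Number of indices | 4* |
| Shards (per index) | 5* |
| Number of replicas | 1 |
| Number of documents | 150M |
| Allocated disk | 150 GB |

* These figures are for the main indices; Kibana and Watcher also create a number of small indices.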
My documents are mostly text. There are a few other fields (no more than 5 per index) and no nested objects. Index specs:
| Index | Avg Doc Length | # Docs | Disk |
|---------|----------------|--------|------|
| index-1 | 300 | 80M | 70GB |
| index-2 | 500 | 5M | 5GB |
| index-3 | 3000 | 2M | 10GB |
| index-4 | 2500 | 18M | 54GB |
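To put the numbers above in one place, here is a rough sketch that tallies the four main indices and the shard layout implied by 5 primary shards per index, 1 replica, and 2 nodes. The variable names are mine, and it assumes shard copies are spread evenly across the two nodes; the small Kibana and Watcher indices are not counted.

```python
# Figures copied from the table above (main indices only).
indices = {
    "index-1": {"docs_millions": 80, "disk_gb": 70},
    "index-2": {"docs_millions": 5, "disk_gb": 5},
    "index-3": {"docs_millions": 2, "disk_gb": 10},
    "index-4": {"docs_millions": 18, "disk_gb": 54},
}

# Settings from the node summary.
PRIMARY_SHARDS_PER_INDEX = 5
REPLICAS = 1
NODES = 2

total_docs_millions = sum(i["docs_millions"] for i in indices.values())
total_disk_gb = sum(i["disk_gb"] for i in indices.values())

# Each primary shard has REPLICAS additional copies.
primary_shards = len(indices) * PRIMARY_SHARDS_PER_INDEX
total_shard_copies = primary_shards * (1 + REPLICAS)

# Assumption: copies are balanced evenly across the nodes.
shards_per_node = total_shard_copies // NODES

print(f"documents: {total_docs_millions}M, disk: {total_disk_gb} GB")
print(f"primary shards: {primary_shards}, shard copies: {total_shard_copies}")
print(f"shard copies per node: {shards_per_node}")
```

If a search targets all four main indices, it has to query one copy of each of the 20 primary shards, and those are served by only 2 processors per node.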
When the system is idle, the response time (load time) is typically a few seconds. But when I simulate the behavior of 10 users, I start to get timeouts in my application. Originally the timeout was 10 s; I raised it to 60 s and I am still having issues. Here is a chart for a simulation of 10 concurrent users using the Search API.
![](https://i.stack.imgur.com/MXtqr.png)
The red line is the total request time in seconds and the dotted pink line is my 60-second timeout. So I'd say that most of the time my users will experience a timeout. The query I used is quite simple:

    {
      "size": 500,
      "from": ${FROM},
      "query": {
        "query_string": {
          "query": "good OR bad"
        }
      }
    }
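For anyone who wants to reproduce the test, here is a minimal sketch of the simulation, not my exact setup. It assumes `${FROM}` is a placeholder filled in by the load-testing tool and that each simulated user pages through results in steps of 500; the real offsets may differ. `send` is a hypothetical callable standing in for whatever HTTP client posts the body to the Search API.

```python
import json
import time
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 500  # "size" from the query above


def build_query(from_offset):
    """Return the request body from the post with ${FROM} filled in."""
    return {
        "size": PAGE_SIZE,
        "from": from_offset,
        "query": {"query_string": {"query": "good OR bad"}},
    }


def simulate(send, users=10, pages_per_user=3):
    """Run `users` workers concurrently, each sending paged requests.

    `send` is a hypothetical callable that takes the JSON body as a
    string and performs the request. Returns a list of
    (from_offset, elapsed_seconds) tuples, one per request.
    """

    def one_user(_user_id):
        timings = []
        for page in range(pages_per_user):
            offset = page * PAGE_SIZE  # assumed paging scheme
            body = json.dumps(build_query(offset))
            start = time.perf_counter()
            send(body)
            timings.append((offset, time.perf_counter() - start))
        return timings

    with ThreadPoolExecutor(max_workers=users) as pool:
        per_user = list(pool.map(one_user, range(users)))
    return [t for timings in per_user for t in timings]
```

Passing a stub such as `simulate(lambda body: None)` exercises the concurrency without touching the cluster; swapping in a real HTTP call gives per-request timings comparable to the chart.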
I've tried every tweak I know of. I don't know whether this is the real performance of ES, in which case I have to accept it and upgrade my plan.